idnits 2.17.1 

draft-ietf-dhc-dhcpv6-failover-design-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC3315],
     [I-D.ietf-dhc-dhcpv6-failover-requirements]), which it shouldn't.  Please
     replace those with straight textual mentions of the documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1452 has weird spacing: '... accept    acc...'

  == Line 1454 has weird spacing: '... accept    acc...'

  == Line 1455 has weird spacing: '... accept    acc...'

  -- The document date (September 13, 2013) is 3868 days in the past.  Is
     this intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 3315 (Obsoleted by RFC 8415)

  ** Obsolete normative reference: RFC 3633 (Obsoleted by RFC 8415)

  == Outdated reference: A later version (-02) exists of
     draft-ietf-dhc-dhcpv6-load-balancing-00


     Summary: 3 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Dynamic Host Configuration (DHC)                            T. Mrugalski
3	Internet-Draft                                                       ISC
4	Intended status: Standards Track                              K. Kinnear
5	Expires: March 17, 2014                                            Cisco
6	                                                      September 13, 2013

8	                         DHCPv6 Failover Design
9	                draft-ietf-dhc-dhcpv6-failover-design-04

11	Abstract

13	   DHCPv6 defined in [RFC3315] does not offer server redundancy.  This
14	   document defines a design for DHCPv6 failover, a mechanism for
15	   running two servers on the same network with capability for either
16	   server to take over clients' leases in case of server failure or
17	   network partition.  This is a DHCPv6 Failover design document; it is
18	   not a protocol specification document.  It is the second document in
19	   a planned series of three documents.  DHCPv6 failover requirements
20	   are specified in [I-D.ietf-dhc-dhcpv6-failover-requirements].  A
21	   protocol specification document is planned to follow this document.

23	Status of This Memo

25	   This Internet-Draft is submitted in full conformance with the
26	   provisions of BCP 78 and BCP 79.

28	   Internet-Drafts are working documents of the Internet Engineering
29	   Task Force (IETF).  Note that other groups may also distribute
30	   working documents as Internet-Drafts.  The list of current Internet-
31	   Drafts is at http://datatracker.ietf.org/drafts/current/.

33	   Internet-Drafts are draft documents valid for a maximum of six months
34	   and may be updated, replaced, or obsoleted by other documents at any
35	   time.  It is inappropriate to use Internet-Drafts as reference
36	   material or to cite them other than as "work in progress."

38	   This Internet-Draft will expire on March 17, 2014.

40	Copyright Notice

42	   Copyright (c) 2013 IETF Trust and the persons identified as the
43	   document authors.  All rights reserved.

45	   This document is subject to BCP 78 and the IETF Trust's Legal
46	   Provisions Relating to IETF Documents
47	   (http://trustee.ietf.org/license-info) in effect on the date of
48	   publication of this document.  Please review these documents
49	   carefully, as they describe your rights and restrictions with respect
50	   to this document.  Code Components extracted from this document must
51	   include Simplified BSD License text as described in Section 4.e of
52	   the Trust Legal Provisions and are provided without warranty as
53	   described in the Simplified BSD License.

55	Table of Contents

57	   1.  Requirements Language . . . . . . . . . . . . . . . . . . . .   3
58	   2.  Glossary  . . . . . . . . . . . . . . . . . . . . . . . . . .   4
59	   3.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   5
60	     3.1.  Design Requirements . . . . . . . . . . . . . . . . . . .   6
61	     3.2.  Features out of Scope: Load Balancing . . . . . . . . . .   6
62	   4.  Protocol Overview . . . . . . . . . . . . . . . . . . . . . .   7
63	     4.1.  Failover State Machine Overview . . . . . . . . . . . . .   8
64	     4.2.  Messages  . . . . . . . . . . . . . . . . . . . . . . . .  10
65	   5.  Connection Management . . . . . . . . . . . . . . . . . . . .  12
66	     5.1.  Creating Connections  . . . . . . . . . . . . . . . . . .  12
67	     5.2.  Endpoint Identification . . . . . . . . . . . . . . . . .  13
68	   6.  Resource Allocation . . . . . . . . . . . . . . . . . . . . .  14
69	     6.1.  Proportional Allocation . . . . . . . . . . . . . . . . .  14
70	     6.2.  Independent Allocation  . . . . . . . . . . . . . . . . .  17
71	     6.3.  Choosing Allocation Algorithm . . . . . . . . . . . . . .  17
72	   7.  Information model . . . . . . . . . . . . . . . . . . . . . .  18
73	   8.  Failover Mechanisms . . . . . . . . . . . . . . . . . . . . .  23
74	     8.1.  Time Skew . . . . . . . . . . . . . . . . . . . . . . . .  23
75	     8.2.  Lazy updates  . . . . . . . . . . . . . . . . . . . . . .  23
76	     8.3.  MCLT concept  . . . . . . . . . . . . . . . . . . . . . .  24
77	       8.3.1.  MCLT example  . . . . . . . . . . . . . . . . . . . .  25
78	     8.4.  Unreachability detection  . . . . . . . . . . . . . . . .  26
79	     8.5.  Re-allocating Leases  . . . . . . . . . . . . . . . . . .  27
80	     8.6.  Sending Binding Update  . . . . . . . . . . . . . . . . .  28
81	     8.7.  Receiving Binding Update  . . . . . . . . . . . . . . . .  29
82	     8.8.  Conflict Resolution . . . . . . . . . . . . . . . . . . .  30
83	     8.9.  Acknowledging Reception . . . . . . . . . . . . . . . . .  32
84	   9.  Endpoint States . . . . . . . . . . . . . . . . . . . . . . .  32
85	     9.1.  State Machine Operation . . . . . . . . . . . . . . . . .  32
86	     9.2.  State Machine Initialization  . . . . . . . . . . . . . .  35
87	     9.3.  STARTUP State . . . . . . . . . . . . . . . . . . . . . .  35
88	       9.3.1.  Operation in STARTUP State  . . . . . . . . . . . . .  36
89	       9.3.2.  Transition Out of STARTUP State . . . . . . . . . . .  36
90	     9.4.  PARTNER-DOWN State  . . . . . . . . . . . . . . . . . . .  38
91	       9.4.1.  Operation in PARTNER-DOWN State . . . . . . . . . . .  38
92	       9.4.2.  Transition Out of PARTNER-DOWN State  . . . . . . . .  39
93	     9.5.  RECOVER State . . . . . . . . . . . . . . . . . . . . . .  39
94	       9.5.1.  Operation in RECOVER State  . . . . . . . . . . . . .  39
95	       9.5.2.  Transition Out of RECOVER State . . . . . . . . . . .  40
96	     9.6.  RECOVER-WAIT State  . . . . . . . . . . . . . . . . . . .  41
97	       9.6.1.  Operation in RECOVER-WAIT State . . . . . . . . . . .  41
98	       9.6.2.  Transition Out of RECOVER-WAIT State  . . . . . . . .  41
99	     9.7.  RECOVER-DONE State  . . . . . . . . . . . . . . . . . . .  42
100	       9.7.1.  Operation in RECOVER-DONE State . . . . . . . . . . .  42
101	       9.7.2.  Transition Out of RECOVER-DONE State  . . . . . . . .  42
102	     9.8.  NORMAL State  . . . . . . . . . . . . . . . . . . . . . .  43
103	       9.8.1.  Operation in NORMAL State . . . . . . . . . . . . . .  43
104	       9.8.2.  Transition Out of NORMAL State  . . . . . . . . . . .  44
105	     9.9.  COMMUNICATIONS-INTERRUPTED State  . . . . . . . . . . . .  44
106	       9.9.1.  Operation in COMMUNICATIONS-INTERRUPTED State . . . .  45
107	       9.9.2.  Transition Out of COMMUNICATIONS-INTERRUPTED State  .  45
108	     9.10. POTENTIAL-CONFLICT State  . . . . . . . . . . . . . . . .  47
109	       9.10.1.  Operation in POTENTIAL-CONFLICT State  . . . . . . .  47
110	       9.10.2.  Transition Out of POTENTIAL-CONFLICT State . . . . .  47
111	     9.11. RESOLUTION-INTERRUPTED State  . . . . . . . . . . . . . .  48
112	       9.11.1.  Operation in RESOLUTION-INTERRUPTED State  . . . . .  49
113	       9.11.2.  Transition Out of RESOLUTION-INTERRUPTED State . . .  49
114	     9.12. CONFLICT-DONE State . . . . . . . . . . . . . . . . . . .  49
115	       9.12.1.  Operation in CONFLICT-DONE State . . . . . . . . . .  49
116	       9.12.2.  Transition Out of CONFLICT-DONE State  . . . . . . .  50
117	   10. Proposed extensions . . . . . . . . . . . . . . . . . . . . .  50
118	     10.1.  Active-active mode . . . . . . . . . . . . . . . . . . .  50
119	   11. Dynamic DNS Considerations  . . . . . . . . . . . . . . . . .  50
120	     11.1.  Relationship between failover and dynamic DNS update . .  51
121	     11.2.  Exchanging DDNS Information  . . . . . . . . . . . . . .  52
122	     11.3.  Adding RRs to the DNS  . . . . . . . . . . . . . . . . .  54
123	     11.4.  Deleting RRs from the DNS  . . . . . . . . . . . . . . .  54
124	     11.5.  Name Assignment with No Update of DNS  . . . . . . . . .  55
125	   12. Reservations and failover . . . . . . . . . . . . . . . . . .  55
126	   13. Security Considerations . . . . . . . . . . . . . . . . . . .  57
127	   14. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  57
128	   15. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  57
129	   16. References  . . . . . . . . . . . . . . . . . . . . . . . . .  58
130	     16.1.  Normative References . . . . . . . . . . . . . . . . . .  58
131	     16.2.  Informative References . . . . . . . . . . . . . . . . .  58
132	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  59

134	1.  Requirements Language

136	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
137	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
138	   document are to be interpreted as described in RFC 2119 [RFC2119].

140	2.  Glossary

142	   This is a supplemental glossary that should be combined with
143	   definitions in Section 3 of
144	   [I-D.ietf-dhc-dhcpv6-failover-requirements].

146	   o  auto-partner-down - a capability where a failover server will move
147	      from COMMUNICATIONS-INTERRUPTED state to PARTNER-DOWN state
148	      automatically, without operator intervention.

150	   o  DDNS - Dynamic DNS.  Typically used as an acronym referring to
151	      dynamic update of the DNS.

153	   o  Failover endpoint - The failover protocol allows for there to be a
154	      unique failover 'endpoint' for each failover relationship in which
155	      a failover server participates.  The failover relationship is
156	      defined by a relationship name, and includes the failover partner
157	      IP address, the role this server takes with respect to that
158	      partner (primary or secondary), and the prefixes associated with
159	      that relationship.  Note that a single prefix can only be
160	      associated with a single failover relationship.  This failover
161	      endpoint can take actions and hold unique states.  Typically,
162	      there is one failover endpoint per partner (server), although
163	      there may be more.  'Server' and 'failover endpoint' are
164	      synonymous only if the server participates in only one failover
165	      relationship.  However, for the sake of simplicity 'Server' is
166	      used throughout the document to refer to a failover endpoint
167	      unless to do so would be confusing.

169	   o  Failover communication - all messages exchanged between partners.

171	   o  Independent Allocation - an allocation algorithm that splits the
172	      available pool of resources between the primary and secondary
173	      servers that is particularly well suited for vast pools (i.e. when
174	      available resources are not expected to deplete).  See Section 6.2
175	      for details.

177	   o  Lease - an association of a DHCPv6 client with an IPv6 address or
178	      delegated prefix.

180	   o  Partner - name of the other DHCPv6 server that participates in a
181	      failover relationship.  When the role (primary or secondary) is
182	      not important, the other server is referred to as a "failover
183	      partner" or simply partner.

185	   o  Primary Server - First out of two DHCPv6 servers that participate
186	      in a failover relationship.  In active-passive mode this is the
187	      server that handles most of the client traffic.  Its failover
188	      partner is referred to as secondary server.

190	   o  Proportional Allocation - an allocation algorithm that splits the
191	      available resources between the primary and secondary servers and
192	      maintains proportions between available resources on both.  It is
193	      particularly well suited for more limited resources.  See
194	      Section 6.1 for details.

196	   o  Resource - Any type of resource that is managed by DHCPv6.
197	      Currently there are three types of such resources defined: a non-
198	      temporary IPv6 address, a temporary IPv6 address, and an IPv6
199	      prefix.  Other resource types may be defined in the future.

201	   o  Responsive - A server that is responsive will respond to DHCPv6
202	      client requests.

204	   o  Secondary Server - Second of two DHCPv6 servers that participate
205	      in a failover relationship.  Its failover partner is referred to
206	      as the primary server.  In active-passive mode this server (the
207	      secondary) typically does not handle client traffic and acts as a
208	      backup.

210	   o  Server - A DHCPv6 server that implements DHCPv6 failover.
211	      'Server' and 'failover endpoint' are synonymous only if the server
212	      participates in only one failover relationship.

214	   o  Unresponsive - A server that is unresponsive will not respond to
215	      DHCPv6 client requests.

217	3.  Introduction

219	   The failover protocol design provides a means for cooperating DHCPv6
220	   servers to work together to provide a DHCPv6 service with
221	   availability that is increased beyond that which could be provided by
222	   a single DHCPv6 server operating alone.  It is designed to protect
223	   DHCPv6 clients against server unreachability, including server
224	   failure and network partition.  It is possible to deploy exactly two
225	   servers that are able to continue providing a lease on an IPv6
226	   address [RFC3315] or on an IPv6 prefix [RFC3633] without the DHCPv6
227	   client experiencing lease expiration or a reassignment of a lease to
228	   a different IPv6 address (or prefix) in the event of failure by one
229	   or the other of the two servers.

231	   This protocol defines active-passive mode, sometimes also called a
232	   hot standby model.  This means that during normal operation one
233	   server is active (i.e. actively responds to clients' requests) while
234	   the second is passive (i.e. it does receive clients' requests, but
235	   does not respond to them and only maintains a copy of the lease
236	   database and is ready to take over incoming queries in case of
237	   primary server failure).  Active-active mode (i.e. both servers
238	   actively handling clients' requests) is currently not supported for
239	   the sake of simplicity.  Such a mode is likely to be defined as an
240	   extension at a later time and will probably be based on
241	   [I-D.ietf-dhc-dhcpv6-load-balancing].

243	   The failover protocol is designed to provide lease stability for
244	   leases with lease times beyond a short period.  Due in part to the
245	   additional overhead required as well as requirements to handle time
246	   skew between failover partners (See Section 8.1), failover is not
247	   suitable for leases shorter than 30 seconds.  The DHCPv6 Failover
248	   protocol MUST NOT be used for leases shorter than 30 seconds.

250	   This design attempts to fulfill all DHCPv6 failover requirements
251	   defined in [I-D.ietf-dhc-dhcpv6-failover-requirements].

253	3.1.  Design Requirements

255	   The following requirements are not related to the failover mechanism
256	   in general, but rather to this particular design.

258	   1.  Minimize Asymmetry - while there are two distinct roles in
259	       failover (primary and secondary server), the differences between
260	       those two roles should be as small as possible.  This will yield
261	       a simpler design as well as a simpler implementation of that
262	       design.

264	3.2.  Features out of Scope: Load Balancing

266	   While it is tempting to extend the DHCPv6 failover mechanism to also
267	   offer load balancing, as DHCPv4 failover did, this design does not do
268	   that, for the following reason.  In the general case (not related to
269	   failover), load balancing is used when a single server is not able
270	   to handle the total incoming traffic.  However, by its very
271	   definition, DHCPv6 failover is supposed to ensure service
272	   availability despite the failure of one server.  That leads to the
273	   conclusion that each server must be able to handle all of the
274	   traffic.  Therefore, in a properly provisioned setup, load balancing
275	   is not needed.

277	   It is likely that active-active mode, which is essentially load
278	   balancing, will be defined as an extension in the near future.

280	4.  Protocol Overview

282	   The DHCPv6 Failover Protocol is defined as a communication between
283	   failover partners with all associated algorithms and mechanisms.
284	   Failover communication is conducted over a TCP connection established
285	   between the partners.  The protocol reuses the framing format
286	   specified in Section 5.1 of DHCPv6 Bulk Leasequery [RFC5460], but
287	   uses different message types.  New failover-specific message types
288	   are listed in Section 4.2.  All information is sent over the
289	   connection as typical DHCPv6 messages that convey DHCPv6 options,
290	   following the format defined in Section 22.1 of [RFC3315].

292	   After initialization, the primary server establishes a TCP connection
293	   with its partner.  The primary server sends a CONNECT message with
294	   initial parameters.  The secondary server responds with CONNECTACK.

296	   If the primary server cannot immediately establish a connection with
297	   its partner, it will continue to attempt to establish a connection.
298	   See Section 5.1 for details.

300	   Depending on the failover state of each partner, they MUST initiate
301	   one of the binding update procedures.  Each server MAY send an UPDREQ
302	   message to request its partner to send all updates that have not been
303	   sent yet (this case applies when the partner has an existing database
304	   and wants to update it).  Alternatively, a server MAY choose to send
305	   an UPDREQALL message to request a full lease database transmission
306	   including all leases (this case applies in case of booting up a new
307	   server after installation, corruption or complete loss of database,
308	   or other catastrophic failure).
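
   The choice between the two update requests can be sketched as
   follows.  This is only an illustration: the helper name
   choose_update_request and its boolean argument are hypothetical and
   are not defined by this document.

```python
def choose_update_request(has_usable_database: bool) -> str:
    """Pick the message a server could send to resynchronize with its partner.

    Hypothetical helper, not part of the failover design itself.
    """
    # UPDREQ: the partner sends only the updates it has not sent yet
    # (the requester still has an existing lease database).
    # UPDREQALL: the partner sends the full lease database (new server,
    # corrupted or lost database, or other catastrophic failure).
    return "UPDREQ" if has_usable_database else "UPDREQALL"
```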

310	   Servers exchange lease information by using BNDUPD messages.
311	   Depending on the local and remote state of a lease, a server may
312	   either accept or reject the update.  Reception of lease update
313	   information is confirmed by responding with a BNDACK message with
314	   appropriate status.  The majority of the messages sent over a
315	   failover TCP connection consists of BNDUPD and BNDACK messages.

317	   A subset of available resources (addresses or prefixes) is reserved
318	   for secondary server use.  This is required for handling a case where
319	   both servers are able to communicate with clients, but unable to
320	   communicate with each other.  After the initial connection is
321	   established, the secondary server requests a pool of available
322	   addresses or prefixes by sending a POOLREQ message.  The primary
323	   server assigns addresses or prefixes to the secondary by sending a
324	   series of BNDUPD messages.  When this process is complete, the
325	   primary server sends a POOLRESP message to the secondary server.  The
326	   secondary server may initiate such a pool request at any time when in
327	   communication with the primary server.

329	   Failover servers use a lazy update mechanism to update their failover
330	   partner about changes to their lease state database.  After a server
331	   performs any modifications to its lease state database (assign a new
332	   lease, extend, release or expire existing lease), it sends its
333	   response to the client's request first (performing the "regular"
334	   DHCPv6 operation) and then informs its failover partner using a
335	   BNDUPD message.  This BNDUPD message SHOULD be sent soon after the
336	   response is sent to the DHCPv6 client, but there is no specific
337	   requirement of a minimum time in which to do so.
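
   The ordering imposed by lazy updates can be sketched as follows.
   This is a minimal illustration under stated assumptions: the class
   LazyUpdater, its attributes and the tuple layout of a queued update
   are hypothetical, and real message encoding is left to the protocol
   specification.

```python
from collections import deque


class LazyUpdater:
    """Hypothetical helper: answer the client first, update the partner later."""

    def __init__(self):
        self.pending = deque()     # BNDUPD payloads not yet sent to the partner
        self.sent_to_client = []   # stand-in for the client-facing socket

    def handle_lease_change(self, client_id, lease):
        # 1. Perform the "regular" DHCPv6 operation: reply to the client.
        self.sent_to_client.append((client_id, lease))
        # 2. Only afterwards queue the change for the failover partner.
        self.pending.append(("BNDUPD", client_id, lease))

    def flush(self, send):
        # Drain queued updates in order; `send` is a caller-supplied callable.
        while self.pending:
            send(self.pending.popleft())
```

   The queue may be drained at any later point, which matches the
   absence of a specific time requirement for sending the BNDUPD.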

339	   The major problem with a lazy update mechanism arises when a server
340	   crashes after sending a response to the client, but before sending
341	   the lazy update to its partner (or when communication between
342	   partners is interrupted).  To solve this problem, the concept known
343	   as the Maximum Client Lead Time (MCLT), initially designed for DHCPv4
344	   failover, is used.  The MCLT is the maximum amount of time that one
345	   server can extend a lease for a client's binding beyond the time
346	   known by its failover partner.  See Section 8.3 for a detailed
347	   description of how the MCLT affects assigned lifetimes.
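
   The MCLT bound can be illustrated with a small calculation.  This is
   a sketch under assumptions, not the normative rule from Section 8.3:
   the function name and parameters are hypothetical, times are in
   seconds, and a lease that the partner has never heard of (or that it
   believes has already expired) is assumed to be known by the partner
   only up to the current time.

```python
def max_lease_lifetime(now, desired_lifetime, partner_known_expiry, mclt):
    """Longest lifetime a server may grant without exceeding the MCLT bound.

    partner_known_expiry is None when the partner knows nothing about the lease.
    """
    # Assumption: unknown or already-expired knowledge counts as "now".
    if partner_known_expiry is None:
        base = now
    else:
        base = max(now, partner_known_expiry)
    # The lease may extend at most MCLT beyond what the partner knows.
    limit = (base - now) + mclt
    return min(desired_lifetime, limit)
```

   For example, with an MCLT of 3600 seconds and a desired lifetime of
   one day, a lease unknown to the partner would be limited to 3600
   seconds under these assumptions.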

349	   Servers verify each other's availability by periodically exchanging
350	   CONTACT messages.  See Section 8.4 for discussion about detecting a
351	   partner's unreachability.

353	   A server that is being shut down transmits a DISCONNECT message,
354	   closes the connection with its failover partner and stops operation.
355	   A server SHOULD transmit any pending lease updates before
356	   transmitting a DISCONNECT message.

358	4.1.  Failover State Machine Overview

360	   The following section provides a simplified description of all
361	   states.  For the sake of clarity and simplicity, it omits important
362	   details.  For a complete description, see Section 9.  In case of a
363	   disagreement between the simplified and complete description, please
364	   follow Section 9.

366	   Each server MUST be in one of the well-defined states.  Depending on
367	   its current state a server may be either responsive (responds to
368	   clients' queries) or unresponsive (clients' queries are ignored).

370	   A server starts its operation in the short-lived STARTUP state.  A
371	   server determines its partner reachability and state and sets its own
372	   state based on that determination.  It typically returns back to the
373	   state it was in before shutdown, though the details can be
374	   complicated.  See Section 9.3.2.

376	   During typical operation when servers maintain communication, both
377	   are in NORMAL state.  In that state only the primary responds to
378	   clients' requests.  The secondary server is unresponsive.

380	   If a server discovers that its partner is no longer reachable, it
381	   goes to COMMUNICATIONS-INTERRUPTED state.  A server must be extra
382	   cautious as it can't distinguish if its partner is down or just
383	   communication between servers is interrupted.  Since communication
384	   between partners is not possible, a server must act on the assumption
385	   that its partner is up.  A failover server must follow a defined
386	   procedure, in particular, it MUST NOT extend any lease more than the
387	   MCLT beyond its partner's knowledge of the lease expiration time.
388	   This imposes an additional burden on the server, in that clients will
389	   return to the server for lease renewals more frequently than they
390	   would otherwise.  Therefore it is not recommended to operate for
391	   prolonged periods in this state.  Once communication is
392	   reestablished, a server may go into NORMAL, POTENTIAL-CONFLICT or
393	   PARTNER-DOWN state.  It may also stay in COMMUNICATIONS-INTERRUPTED
394	   state if certain conditions are met.

396	   Once a server is switched into PARTNER-DOWN (when auto-partner-down
397	   is used or as a result of administrative action), it can extend
398	   leases, regardless of the original server that initially granted the
399	   lease.  In that state a server handles leases from its own pool, but
400	   once that pool is depleted it can also serve its downed partner's
401	   pool.  Some MCLT restrictions no longer apply, but the MCLT
402	   still affects whether or not a particular lease can be given to a
403	   different client.  See Section 9.4.1 for details.  Operation in this
404	   mode is less demanding for the server that remains operational than
405	   in COMMUNICATIONS-INTERRUPTED state, but PARTNER-DOWN does not offer
406	   any kind of redundancy.  Even when in PARTNER-DOWN state, a failover
407	   server continues to attempt to connect with its failover partner.

409	   A server switches into RECOVER state when any of a variety of
410	   conditions are encountered:

412	   o  When a backup server contacts its failover partner for the first
413	      time.

415	   o  When either server discovers that its failover partner has
416	      contacted it before but it has no local record of this contact.
417	      If the record of previous contact is held in the lease-state
418	      database, then this situation implies that the server has lost its
419	      lease state database.

421	   o  When its failover partner is in PARTNER-DOWN state.

423	   Any of these conditions signals that the server needs to refresh its
424	   lease-state database from its partner.  Once this operation is
425	   complete, it switches to RECOVER-WAIT and later to RECOVER-DONE.  See
426	   Section 9.6.2.

428	   Once servers reestablish connection, they discover each other's
429	   state.  Depending on the conditions, they may return to NORMAL or
430	   move to POTENTIAL-CONFLICT if the partner is in a state that doesn't
431	   allow a simple re-integration of the servers' lease state databases.
432	   It is a goal of this protocol to minimize the possibility that
433	   POTENTIAL-CONFLICT state is ever entered.  Servers running in
434	   POTENTIAL-CONFLICT do not respond to clients' requests and work only
435	   on resolving potential conflicts.  Once outstanding lease updates are
436	   exchanged, servers move to CONFLICT-DONE or NORMAL states.

438	   Servers that are recovering from potential conflicts and lose
439	   communication switch to RESOLUTION-INTERRUPTED.

441	   A server that is being shut down sends a DISCONNECT message.  See
442	   Section 4.2.  A server that receives a DISCONNECT message moves into
443	   COMMUNICATIONS-INTERRUPTED state.
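
   The states named in this overview, and the reaction to a loss of
   communication, can be sketched as follows.  This is a simplified
   illustration only: the names FailoverState and on_communication_lost
   are hypothetical, only the two transitions described above are
   modeled, and every other state is left unchanged here because its
   behavior is defined in Section 9.

```python
from enum import Enum


class FailoverState(Enum):
    STARTUP = "STARTUP"
    NORMAL = "NORMAL"
    COMMUNICATIONS_INTERRUPTED = "COMMUNICATIONS-INTERRUPTED"
    PARTNER_DOWN = "PARTNER-DOWN"
    RECOVER = "RECOVER"
    RECOVER_WAIT = "RECOVER-WAIT"
    RECOVER_DONE = "RECOVER-DONE"
    POTENTIAL_CONFLICT = "POTENTIAL-CONFLICT"
    CONFLICT_DONE = "CONFLICT-DONE"
    RESOLUTION_INTERRUPTED = "RESOLUTION-INTERRUPTED"


def on_communication_lost(state):
    """Return the next state when the partner becomes unreachable."""
    if state is FailoverState.POTENTIAL_CONFLICT:
        # Conflict resolution was in progress when contact was lost.
        return FailoverState.RESOLUTION_INTERRUPTED
    if state is FailoverState.NORMAL:
        return FailoverState.COMMUNICATIONS_INTERRUPTED
    # Other states are not modeled in this sketch; see Section 9.
    return state
```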

445	4.2.  Messages

447	   The failover protocol is centered around the message exchanges used
448	   by one server to update its partner and respond to received updates.
449	   It should be noted that no specific formats or message type values
450	   are assigned in this document.  Appropriate implementation details
451	   will be specified in a separate protocol specification document.  The
452	   following list enumerates these messages:

454	   o  BNDUPD - The binding update message is used to send the binding
455	      lease changes to the partner.  One message may contain one or more
456	      lease updates.  The partner is expected to respond with a BNDACK
457	      message.

459	   o  BNDACK - The binding acknowledgement is used for confirmation of
460	      the received BNDUPD message.  It may contain a positive or
461	      negative response (e.g. due to detected lease conflict).

463	   o  POOLREQ - The Pool Request message is used by one server
464	      (typically secondary) to request allocation of resources
465	      (addresses or prefixes) from its partner.  The partner responds
466	      with POOLRESP.

468	   o  POOLRESP - The Pool Response message is used by one server
469	      (typically primary) to indicate that it has responded to its
470	      partner's request for resource allocation.

472	   o  UPDREQ - The update request message is used by one server to
473	      request that its partner send all binding database changes that
474	      have not been sent and confirmed already.  The requested partner
475	      is expected to respond with zero or more BNDUPD messages, followed
476	      by UPDDONE, which signals the end of updates.

478	   o  UPDREQALL - The update request all message is used by one server
479	      to request that all binding database information be sent in order
480	      to recover from a total loss of its binding database.  The
481	      requested server responds with zero or more BNDUPD messages,
482	      followed by UPDDONE, which signals the end of updates.

484	   o  UPDDONE - The update done message is used by the server responding
485	      to an UPDREQ or UPDREQALL to indicate that all requested updates
486	      have been sent by the responding server and acked by the
487	      requesting server.

489	   o  CONNECT - The connect message is used by the primary server to
490	      establish a high level connection with the other server, and to
491	      transmit several important configuration data items between the
492	      servers.  The partner is expected to confirm by responding with a
493	      CONNECTACK message.

495	   o  CONNECTACK - The connect acknowledgement message is used by the
496	      secondary server to respond to a CONNECT message from the primary
497	      server.

499	   o  DISCONNECT - The disconnect message is used by either server when
500	      closing a connection and shutting down.  No response is required
501	      for this message.

503	   o  STATE - The state message is used by either server to inform its
504	      partner about a change of failover state.  In some cases it may be
505	      used to also inform the partner about current state, e.g. after
506	      connection is established in COMMUNICATIONS-INTERRUPTED or
507	      PARTNER-DOWN states.

509	   o  CONTACT - The contact message is used by either server to ensure
510	      that the other server continues to see the connection as
511	      operational.  It MUST be transmitted periodically over every
512	      established connection if other message traffic is not flowing,
513	      and it MAY be sent at any time.
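
   The request/response pairing in the list above can be summarized in
   a small lookup table.  This is an illustrative sketch: the names
   EXPECTED_REPLY and expected_reply are hypothetical, and message names
   are plain strings because this document assigns no message type
   values.

```python
# Final message expected in reply to each request, per the list above.
EXPECTED_REPLY = {
    "BNDUPD": "BNDACK",
    "POOLREQ": "POOLRESP",
    "UPDREQ": "UPDDONE",      # preceded by zero or more BNDUPD messages
    "UPDREQALL": "UPDDONE",   # preceded by zero or more BNDUPD messages
    "CONNECT": "CONNECTACK",
    "DISCONNECT": None,       # no response is required
}


def expected_reply(message_type):
    """Return the reply named in the list above, or None if none is named.

    The list names no reply for STATE, CONTACT, or the acknowledgement
    messages themselves, so those also map to None here.
    """
    return EXPECTED_REPLY.get(message_type)
```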

515	5.  Connection Management

517	5.1.  Creating Connections

519	   Every primary server implementing the failover protocol MUST attempt
520	   to connect to all of its partners periodically, where the period is
521	   implementation dependent and SHOULD be configurable.  In the event
522	   that a connection has been rejected by a CONNECTACK message with a
523	   reject-reason option contained in it or a DISCONNECT message, a
524	   server SHOULD reduce the frequency with which it attempts to connect
525	   to that server but it MUST continue to attempt to connect
526	   periodically.
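
   The retry behavior described above can be sketched as follows.  The
   function name, the parameter names and the slowdown factor of 4 are
   assumptions chosen for illustration; this document only requires
   that attempts become less frequent after a rejection and that they
   never stop.

```python
def next_connect_interval(base_interval, was_rejected, slowdown_factor=4.0):
    """Seconds to wait before the next connection attempt (hypothetical helper)."""
    if base_interval <= 0 or slowdown_factor < 1:
        raise ValueError("interval must be positive and factor at least 1")
    # After a rejection (CONNECTACK with reject-reason, or DISCONNECT),
    # retry less often; the result is always finite, so the server
    # still continues to attempt to connect periodically.
    return base_interval * slowdown_factor if was_rejected else base_interval
```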

528	   Every secondary server implementing the failover protocol MUST listen
529	   for connection attempts from the primary server.

531	   When a connection attempt succeeds, the primary server which has
532	   initiated the connection attempt MUST send a CONNECT message down the
533	   connection.

535	   When a connection attempt is received, the only information that the
536	   receiving server has is the IP address of the partner initiating a
537	   connection.  If it has any relationships with the connecting server
538	   for which it is a secondary server, it should just await the CONNECT
539	   message to determine which relationship this connection is to serve.

541	   If it has no secondary relationships with the connecting server, it
542	   MUST drop the connection.  The goal is to limit the resources
543	   expended dealing with attempts to create a spurious failover
544	   connection.

546	   To summarize -- a primary server MUST use a connection that it has
547	   initiated in order to send a CONNECT message.  Every server that is a
548	   secondary server in a relationship simply listens for connection
549	   attempts from the primary server.

551	   Once a connection is established, the primary server MUST send a
552	   CONNECT message across the connection.  A secondary server MUST wait
553	   for the CONNECT message from a primary server.  If the secondary
554	   server doesn't receive a CONNECT message from the primary server in
555	   an installation dependent amount of time, it MAY drop the connection.

557	   Every CONNECT message includes a TLS-request option, and if the
558	   CONNECTACK message does not reject the CONNECT message and the TLS-
559	   reply option says TLS MUST be used, then the servers will immediately
560	   enter into TLS negotiation.

562	   Once TLS negotiation is complete, the primary server MUST resend the
563	   CONNECT message on the newly secured TLS connection and then wait for
564	   the CONNECTACK message in response.  The TLS-request and TLS-reply
565	   options MUST NOT appear in either this second CONNECT or its
566	   associated CONNECTACK message as they had in the first messages.

568	   The second message sent over a new connection (either a bare TCP
569	   connection or a connection utilizing TLS) is a STATE message.  Upon
570	   the receipt of this message, the receiver can consider communications
571	   up.
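
   The message order described above can be sketched as follows.  This
   is only an illustration, assuming the CONNECTACK does not reject the
   CONNECT; the function name and the step strings are hypothetical and
   are not part of the protocol.

```python
from typing import List


def connection_steps(tls_required: bool) -> List[str]:
    """Ordered steps on a new failover connection (illustrative only)."""
    steps = [
        "primary -> secondary: CONNECT (with TLS-request option)",
        "secondary -> primary: CONNECTACK (with TLS-reply option)",
    ]
    if tls_required:
        # The CONNECT/CONNECTACK exchange is repeated on the secured
        # connection, this time without the TLS options.
        steps += [
            "both: TLS negotiation",
            "primary -> secondary: CONNECT (no TLS-request option)",
            "secondary -> primary: CONNECTACK (no TLS-reply option)",
        ]
    # Receipt of STATE lets the receiver consider communications up.
    steps.append("STATE (receiver may now consider communications up)")
    return steps
```

   Calling connection_steps(False) yields three steps and
   connection_steps(True) yields six, with STATE last in both cases.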

573	5.2.  Endpoint Identification

575	   The proper operation of the failover protocol requires more than the
576	   transmission of messages between one server and the other.  Each
577	   endpoint might seem to be a single DHCPv6 server, but in fact there
578	   are situations where additional flexibility in configuration is
579	   useful.  A failover endpoint is always associated with a set of
580	   DHCPv6 prefixes that are configured on the DHCPv6 server where the
581	   endpoint appears.  A DHCPv6 prefix MUST NOT be associated with more
582	   than one failover endpoint.

584	   The failover protocol SHOULD be configured with one failover
585	   relationship between each pair of failover servers.  In this case
586	   there is one failover endpoint for that relationship on each failover
587	   partner.  This failover relationship MUST have a unique name.

589	   There is typically little need for additional relationships between
590	   any two servers but there MAY be more than one failover relationship
591	   between two servers -- however each MUST have a unique relationship
592	   name.

594	   Any failover endpoint can take actions and hold unique states.

596	   This document frequently describes the behavior of the protocol in
597	   terms of primary and secondary servers, not primary and secondary
598	   failover endpoints.  However, it is important to remember that every
599	   'server' described in this document is in reality a failover endpoint
600	   that resides in a particular process, and that several failover
601	   endpoints may reside in the same server process.

603	   It is not the case that there is a unique failover endpoint for each
604	   prefix that participates in a failover relationship.  On one server,
605	   there is (typically) one failover endpoint per partner, regardless of
606	   how many prefixes are managed by that combination of partner and
607	   role.  Conversely, on a particular server, any given prefix will be
608	   associated with exactly one failover endpoint.

610	   When a connection is received from the partner, the unique failover
611	   endpoint to which the message is directed is determined solely by the
612	   IP address of the partner, the relationship-name, and the role of the
613	   receiving server.
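
   A minimal sketch of this lookup, together with the rule from
   Section 5.1 that a connection from an address with no secondary
   relationship is dropped, might look as follows.  The addresses,
   relationship names, and endpoint labels are hypothetical
   configuration values chosen for illustration.

```python
from typing import Dict, NamedTuple, Optional


class EndpointKey(NamedTuple):
    partner_ip: str
    relationship_name: str
    local_role: str  # role of the receiving server: "primary" or "secondary"


# Hypothetical configuration: one endpoint per relationship.
ENDPOINTS: Dict[EndpointKey, str] = {
    EndpointKey("2001:db8::1", "rel-a", "secondary"): "endpoint-A",
    EndpointKey("2001:db8::2", "rel-b", "primary"): "endpoint-B",
}


def keep_incoming_connection(peer_ip: str) -> bool:
    """Keep the connection only if we are secondary for this peer."""
    return any(
        key.partner_ip == peer_ip and key.local_role == "secondary"
        for key in ENDPOINTS
    )


def find_endpoint(peer_ip: str, relationship_name: str,
                  local_role: str) -> Optional[str]:
    """Select the endpoint from partner address, name, and local role."""
    return ENDPOINTS.get(EndpointKey(peer_ip, relationship_name, local_role))
```

   Before the CONNECT message arrives only the peer address is known,
   which is why keep_incoming_connection() takes a single argument; the
   relationship name becomes available once CONNECT has been received.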

615	6.  Resource Allocation

617	   Currently there are two allocation algorithms defined for resources
618	   (addresses or prefixes).  Additional allocation schemes may be
619	   defined as future extensions.

621	   1.  Proportional Allocation - This allocation algorithm is a direct
622	       application of the algorithm defined in [dhcpv4-failover] to
623	       DHCPv6.  Remaining available resources are split between the
624	       primary and secondary servers in a configured proportion.
625	       Released resources are always returned to the primary server.
626	       Primary and secondary servers may initiate a rebalancing
627	       procedure when disparity between resources available to each
628	       server reaches a preconfigured threshold.  Only resources that
629	       are not leased to any clients are "owned" by one of the servers.
630	       This algorithm is particularly well suited for scenarios where
631	       the amount of available resources is limited, as may be the case
632	       with prefix delegation.  See Section 6.1 for details.

634	   2.  Independent Allocation - This allocation algorithm also assumes
635	       that available resources are split between primary and secondary
636	       servers.  In this case, however, resources are assigned to a
637	       specific server for all time, regardless of whether they are
638	       available or currently used.  This algorithm is much simpler
639	       than proportional allocation, because resource imbalance doesn't
640	       have to be checked and there is no rebalancing for independent
641	       allocation.  This algorithm is particularly well suited for
642	       scenarios where there is an abundance of available resources,
643	       which is typically the case for DHCPv6 address allocation.  See
644	       Section 6.2 for details.

646	6.1.  Proportional Allocation

648	   In this allocation scheme, each server has its own pool of available
649	   resources.  Remaining available resources are split between the
650	   primary and secondary servers in a configured proportion.  Note that
651	   a resource is not "owned" by a particular server throughout its
652	   entire lifetime.  Only a resource which is available is "owned" by a
653	   particular server -- once it has been leased to a client, it is not
654	   owned by either failover partner.  When it finally becomes available
655	   again, it will be owned initially by the primary server, and it may
656	   or may not be allocated to the secondary server by the primary
657	   server.

659	   The flow of a resource is as follows: initially a resource is owned
660	   by the primary server.  It may be allocated to the secondary server
661	   if it is available, and then it is owned by the secondary server.
662	   Either server can allocate available resources which they own to
663	   clients, in which case they cease to own them.  When the client
664	   releases the resource or the lease on it expires, it will again
665	   become available and will be owned by the primary.

667	   A resource will not become owned by the server which allocated it
668	   initially when it is released or the lease expires because, in
669	   general, that server will have had to replenish its pool of available
670	   resources well in advance of any likely lease expirations.  Thus,
671	   having a particular resource cycle back to the secondary might well
672	   put the secondary more out of balance with respect to the primary
673	   instead of enhancing the balance of available addresses or prefixes
674	   between them.

676	   Pools governed by proportional allocation are used for allocation
677	   when the server is in any state except PARTNER-DOWN.  In PARTNER-
678	   DOWN state the healthy partner can allocate from either pool (both
679	   its own, and its partner's after some time constraints have elapsed).
680	   The allocation and maintenance of these address pools is an area of
681	   some sensitivity, since the goal is to maintain a more or less
682	   constant ratio of available addresses between the two servers.

684	   The initial allocation when the servers first integrate is triggered
685	   by the POOLREQ message from the secondary to the primary.  This is
686	   followed (at some point) by the POOLRESP message where the primary
687	   tells the secondary that it received and processed the POOLREQ
688	   message.  The primary sends the allocated resources to the secondary
689	   via BNDUPD messages.  The POOLRESP message may be sent before,
690	   during, or at the completion of the BNDUPD message exchanges that
691	   were triggered by the POOLREQ message.  The POOLREQ/POOLRESP message
692	   exchange is a trigger to the primary to perform a scan of its
693	   database and to ensure that the secondary has enough resources (based
694	   on some configured ratio).

696	   The primary server SHOULD examine some or all of its database from
697	   time to time to determine if resources should be shifted between the
698	   primary and secondary (in either direction).  The POOLREQ/POOLRESP
699	   message exchange allows the secondary server to explicitly request
700	   that the primary server examine the entirety of its database to
701	   ensure that the secondary has the appropriate resources available.

703	   Servers frequently have several kinds of resources available on a
704	   particular network segment.  The failover protocol assumes that both
705	   primary and secondary servers are configured in such a way that each
706	   knows the type and number of resources on every network segment
707	   participating in the failover protocol.  The primary server is
708	   responsible for allocating the secondary server the correct
709	   proportion of available resources of each kind.

711	   The resources are delegated to the secondary using the BNDUPD message
712	   with a state of FREE_BACKUP, which indicates the resource is now
713	   available for allocation by the secondary.  Once the message is sent,
714	   the primary MUST NOT use these resources for allocation to DHCPv6
715	   clients.

717	   Available resources can be delegated back to the primary server in
718	   certain cases.  BNDUPD will contain state FREE for leases that were
719	   previously in FREE_BACKUP state.

721	   The POOLREQ/POOLRESP message exchange initiated by the secondary is
722	   valid at any time both partners remain in contact, and the primary
723	   server SHOULD, whenever it receives the POOLREQ message, scan its
724	   database of prefixes and determine if the secondary needs more
725	   resources from any of the prefixes.

727	   In order to support a reasonably dynamic balance of the resources
728	   between the failover partners, the primary server needs to do
729	   additional work to ensure that the secondary server has as many
730	   resources as it needs (but that it doesn't have more than it needs).

732	   The primary server SHOULD examine the balance of available resources
733	   between the primary and secondary for a particular prefix whenever
734	   the number of available resources for either the primary or secondary
735	   changes by more than a configured limit.  The primary server SHOULD
736	   adjust the available resource balance as required to ensure the
737	   configured resource balance, except that the primary server SHOULD
738	   apply some threshold mechanism to such balance adjustments in order
739	   to minimize the overhead of maintaining this balance.

741	   An example of a threshold approach is: do not attempt to re-balance
742	   the prefixes on the primary and secondary until the out of balance
743	   value exceeds a configured value.

745	   The primary server can, at any time, send an available resource to
746	   the secondary using a BNDUPD with the state FREE_BACKUP.  The primary
747	   server can attempt to take an available resource away from the
748	   secondary by sending a BNDUPD with the state FREE.  If the secondary
749	   accepts the BNDUPD, then the resource is now available to the primary
750	   and not available to the secondary.  Of course, the secondary MUST
751	   reject that BNDUPD if it has already used that resource for a DHCP
752	   client.
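
   One possible form of the threshold check described in this section
   is sketched below.  The function name, the use of a fractional
   share for the configured proportion, and the rounding are
   assumptions made for illustration; this document does not mandate
   any particular calculation.

```python
def resources_to_move(primary_free: int, secondary_free: int,
                      secondary_share: float, threshold: int) -> int:
    """Decide how many available resources the primary should shift.

    Positive result: send that many BNDUPDs with state FREE_BACKUP.
    Negative result: try to reclaim that many with state FREE.
    Zero: the imbalance does not exceed the configured threshold.
    """
    total = primary_free + secondary_free
    target_secondary = round(total * secondary_share)
    delta = target_secondary - secondary_free
    return delta if abs(delta) > threshold else 0
```

   With 90 available resources on the primary, 10 on the secondary, an
   even share, and a threshold of 5, the sketch suggests moving 40
   resources to the secondary.  With 52 and 48 it suggests no change.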

754	6.2.  Independent Allocation

756	   In this allocation scheme, available resources are permanently (until
757	   server configuration changes) split between servers.  Available
758	   resources are split between the primary and secondary servers as part
759	   of initial connection establishment.  Once resources are allocated to
760	   each server, there is no need to reassign them.  The resource
761	   allocation is algorithmic in nature, and does not require a message
762	   exchange for each resource allocated.  This algorithm is simpler than
763	   proportional allocation since it does not require a rebalancing
764	   mechanism.  It assumes that the pool assigned to each server will
765	   never deplete.  That is often a reasonable assumption for IPv6
766	   addresses (e.g. servers are often assigned a /64 pool that contains
767	   many more addresses than existing electronic devices on Earth).  This
768	   allocation mechanism SHOULD be used for IPv6 addresses, unless the
769	   configured address pool is small or is otherwise administratively
770	   limited.

772	   Once each server is assigned a resource pool during initial
773	   connection establishment, it may allocate assigned resources to
774	   clients.  Once a client releases a resource or its lease expires,
775	   the resource returns to the pool of the server that leased it.
776	   Resources never change servers.

778	   Resources using the independent allocation approach are ignored when
779	   a server processes a POOLREQ message.

781	   During COMMUNICATIONS-INTERRUPTED events, a partner MAY continue
782	   extending existing leases when requested by clients.  A healthy
783	   partner MUST NOT lease resources that were assigned to its downed
784	   partner and later released by a client unless it is in PARTNER-DOWN
785	   state.  When it is in PARTNER-DOWN state, a server SHOULD use its own
786	   pool first and then it MAY start making new assignments from its
787	   downed partner's pool.  As the assumption is that independent
788	   allocation should be used only when available resources are vast and
789	   not expected to be fully used at any given time, it is very unlikely
790	   that the server will ever need to use its downed partner's pool.
791	   This makes recovery, even after prolonged downtime, much easier.
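
   The rule for new assignments under independent allocation can be
   sketched as a single predicate.  This is a simplification: it
   assumes the server is in a state in which it serves clients at all,
   it reads "use its own pool first" as "own pool exhausted", and it
   ignores any MCLT-related waiting.  The function and parameter names
   are hypothetical.

```python
def may_assign_new_lease(pool_owner: str, me: str, my_state: str,
                         own_pool_exhausted: bool) -> bool:
    """May this server make a new assignment from the given pool?"""
    if pool_owner == me:
        # A server may always allocate from its own pool.
        return True
    # The partner's pool is off limits unless the partner is known to
    # be down, and even then the server's own pool is used first.
    return my_state == "PARTNER-DOWN" and own_pool_exhausted
```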

793	6.3.  Choosing Allocation Algorithm

795	   All implementations SHOULD support both the proportional allocation
796	   algorithm and the independent allocation algorithm.  The specific
797	   requirements for support (i.e., which algorithm(s) MUST be
798	   supported), and the assignment of a specific algorithm to a specific
799	   allocation domain, would be documented in any protocol specifications
800	   that follow from this document.

802	   The proportional allocation mechanism is more flexible as it can
803	   dynamically rebalance available resources between servers.  That
804	   rebalancing creates an additional burden for the servers and
805	   generates more traffic between servers.  The proportional algorithm
806	   can be considered more efficient at managing available resources
807	   than the independent algorithm.  That is an important aspect when
808	   working in a network that is nearing address and/or prefix depletion.

810	   Independent allocation can be used when the number of available
811	   resources is large and there is no realistic danger of running out
812	   of resources.  Use of the independent allocation makes communication
813	   between partners simpler.  It also makes recovery easier and
814	   potential conflict less likely to appear.

816	   Typically independent allocation is used for IPv6 addresses, because
817	   even for /64 pools a server will never run out of addresses to
818	   assign, so there is no need to rebalance.  For the prefix delegation
819	   mechanism, available resources are typically much smaller, so there
820	   is a danger of running out of prefixes.  Therefore typically
821	   proportional allocation will be used for prefix delegations.
822	   Independent allocation still may be used, but the implication must be
823	   well understood.  For example in a network that delegates /64
824	   prefixes out of a /48 prefix (so there can be up to 65536 prefixes
825	   delegated) and 1000 requesting routers, it is safe to use
826	   independent allocation.
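
   The arithmetic behind this example can be checked with a short
   calculation.  The even split between the two servers is an
   assumption made here for illustration; the text above does not
   state how the /48 would be divided.

```python
def delegable_prefixes(pool_len: int, delegated_len: int) -> int:
    """Number of delegated_len prefixes contained in one pool_len prefix."""
    return 2 ** (delegated_len - pool_len)


total = delegable_prefixes(48, 64)   # 65536 /64 prefixes in a /48
per_server = total // 2              # 32768 each, assuming an even split
routers = 1000

# Each server alone could serve every requesting router many times over.
headroom = per_server // routers
```

   Under these assumptions each server holds roughly 32 times as many
   prefixes as there are requesting routers, so depletion of either
   pool is not a realistic concern.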

828	   It should be stressed that the independent allocation algorithm
829	   SHOULD NOT be used when the number of resources is limited and there
830	   is a realistic danger of depleting resources.  If this recommendation
831	   is violated, it may lead to a case when one server denies clients due
832	   to pool depletion despite the fact that the other partner still has
833	   many resources available.

835	   With independent allocation it is very unlikely for a remaining
836	   healthy server to allocate resources from its unavailable partner's
837	   pool.  That makes recovery easier and any potential conflicts are
838	   less likely to appear.

840	7.  Information model

842	   In most DHCP servers a resource (an IP address or a prefix) can take
843	   on several different binding-status values, sometimes also called
844	   lease states.  While no two DHCP server implementations probably have
845	   exactly the same possible binding-status values, [RFC3315] enforces
846	   some commonality among the general semantics of the binding-status
847	   values used by various DHCP server implementations.

849	   In order to transmit binding database updates between one server and
850	   another using the failover protocol, some common denominator binding-
851	   status values must be defined.  It is not expected that these values
852	   correspond with any actual implementation of the DHCP protocol in a
853	   DHCP server, but rather that the binding-status values defined in
854	   this document should be a common denominator of those in use by many
855	   DHCP server implementations.

857	   The lease binding-status values defined for the failover protocol are
858	   listed below.  Unless otherwise noted below, there MAY be client
859	   information associated with each of these binding-status value.

861	   ACTIVE  -- The lease is assigned to a client.  Client identification
862	      data MUST appear.

864	   EXPIRED  -- indicates that a client's binding on a given lease has
865	      expired.  When the partner acks the BNDUPD of an expired lease,
866	      the server sets its internal state to FREE*. Client identification
867	      SHOULD appear.

869	   RELEASED  -- indicates that a client sent in RELEASE message.  When
870	      the partner acks the BNDUPD of a released lease, the server sets
871	      its internal state to FREE*. Client identification SHOULD appear.

873	   FREE*  -- Once a lease is expired or released, its state becomes
874	      FREE*. Depending on which algorithm and which pool was used to
875	      allocate a given lease, FREE* may either mean FREE or FREE_BACKUP.
876	      Implementations do not have to implement this FREE* state, but may
877	      choose to switch to the destination state directly.  For a clarity
878	      of representation, this transitional FREE* state is treated as a
879	      separate state.

881	   FREE  -- Is used when a DHCP server needs to communicate that a
882	      resource is unused by any client, but it was not just released,
883	      expired or reset by a network administrator.  When the partner
884	      acks the BNDUPD of a FREE lease, the server marks the lease as
885	      available for assignment by the primary server.  Note that on a
886	      secondary server running in PARTNER-DOWN state, after waiting the
887	      MCLT, the resource MAY be allocated to a client by the secondary
888	      server.  Client identification MAY appear and indicates the last
889	      client to have used this resource as a hint.

891	   FREE_BACKUP  -- indicates that this resource can be allocated by the
892	      secondary server to a client at any time.  Note that the primary
893	      server running in PARTNER-DOWN state, after waiting the MCLT, the
894	      resource MAY be allocated to a client by the primary server if
895	      proportional algorithm was used.  Client identification MAY appear
896	      and indicates the last client to have used this resource as a
897	      hint.

899	   ABANDONED  -- indicates that a lease is considered unusable by the
900	      DHCP system.  The primary reason for entering such state is
901	      reception of DECLINE message for said lease.  Client
902	      identification MAY appear.

904	   RESET  -- indicates that this resource was made available by operator
905	      command.  This is a distinct state so that the reason that the
906	      resource became FREE can be determined.  Client identification MAY
907	      appear.

909	   The lease state machine has been presented in Figure 1.  Most states
910	   are stationary, i.e. the lease stays in a given state until external
911	   event triggers transition to another state.  The only transitive
912	   state is FREE*. Once it is reached, the state machine immediately
913	   transitions to either FREE or FREE_BACKUP state.

915	                               +---------+
916	                /------------->|  ACTIVE |<--------------\
917	                |              +---------+               |
918	                |                |  |  |                 |
919	                |       /--(8)--/  (3)  \--(9)-\         |
920	                |      |            |           |        |
921	                |      V            V           V        |
922	                |  +-------+   +--------+   +---------+  |
923	                |  |EXPIRED|   |RELEASED|   |ABANDONED|  |
924	                |  +-------+   +--------+   +---------+  |
925	                |      |            |            |       |
926	                |      |            |           (10)     |
927	                |      |            |            V       |
928	                |      |            |       +---------+  |
929	                |      |            |       |  RESET  |  |
930	                |      |            |       +---------+  |
931	                |      |            |            |       |
932	                |       \--(4)--\  (4)  /--(4)--/        |
933	                |                |  |  |                 |
934	               (1)               V  V  V                (2)
935	                |              /---------\               |
936	                |              |  FREE*  |               |
937	                |              \---------/               |
938	                |                 |   |                  |
939	                |         /-(5)--/     \-(6)-\           |
940	                |        |                    |          |
941	                |        V                    V          |
942	                |    +-------+         +-----------+     |
943	                \----|  FREE |<--(7)-->|FREE_BACKUP|-----/
944	                     +-------+         +-----------+

946	                             FREE* transition

948	                       Figure 1: Lease State Machine

950	   Transitions between states are results of the following events:

952	      1.  Primary server allocates a lease.

954	      2.  Secondary server allocates a lease.

956	      3.  Client sends RELEASE and the lease is released.

958	      4.  Partner acknowledges state change.  This transition MAY also
959	      occur if the server is in PARTNER-DOWN state and the MCLT has
960	      passed since entry into the RELEASED, EXPIRED, or RESET state.

962	      5.  The lease belongs to a pool that is governed by the
963	      proportional allocation, or independent allocation is used and
964	      this lease belongs to primary server pool.

966	      6.  The lease belongs to a pool that is governed by the
967	      independent allocation and the lease belongs to the secondary
968	      server.

970	      7.  Pool rebalance event occurs (POOLREQ/POOLRESP messages are
971	      exchanged).  Addresses (or prefixes) belonging to the primary
972	      server can be assigned to the secondary server pool (transition
973	      from FREE to FREE_BACKUP) or vice versa.

975	      8.  The lease has expired.

977	      9.  DECLINE message is received or a lease is deemed unusable for
978	      other reasons.

980	      10.  An administrative action is taken to recover an abandoned
981	      lease back to a usable state.  This transition MAY occur due to
982	      implementation-specific handling of an ABANDONED resource.  One
983	      possible example of such use is a Neighbor Discovery or ICMPv6
984	      Echo check of whether the address is still in use.
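
   The transitions of Figure 1 can be written down as a lookup table
   keyed by the current state and the event number from the list
   above.  This is only a sketch; the table layout and the step()
   helper are hypothetical and not part of the protocol.

```python
# (current state, event number) -> next state, following Figure 1.
TRANSITIONS = {
    ("FREE", 1): "ACTIVE",
    ("FREE_BACKUP", 2): "ACTIVE",
    ("ACTIVE", 3): "RELEASED",
    ("EXPIRED", 4): "FREE*",
    ("RELEASED", 4): "FREE*",
    ("RESET", 4): "FREE*",
    ("FREE*", 5): "FREE",
    ("FREE*", 6): "FREE_BACKUP",
    ("FREE", 7): "FREE_BACKUP",
    ("FREE_BACKUP", 7): "FREE",
    ("ACTIVE", 8): "EXPIRED",
    ("ACTIVE", 9): "ABANDONED",
    ("ABANDONED", 10): "RESET",
}


def step(state: str, event: int) -> str:
    """Return the next lease state, or raise if Figure 1 has no such edge."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state} on event {event}")
```

   For example, a lease released by a client and later handed out
   again by the primary follows events 3, 4, 5, and 1 in turn.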

986	   A resource that is no longer in use (due to expiration or release)
987	   becomes FREE*.  Depending on which allocation algorithm is used, the
988	   resource then returns to the primary (FREE) or secondary
989	   (FREE_BACKUP) pool.  The conditions for specific transitions are
990	   depicted in Figure 2.

992	                 +----------------+---------+-----------+
993	                 | \Resource owner|         |           |
994	                 |  \----------\  | Primary | Secondary |
995	                 |Algorithm     \ |         |           |
996	                 +----------------+---------+-----------+
997	                 | Proportional   | FREE    |  FREE     |
998	                 | Independent    | FREE    |FREE_BACKUP|
999	                 +----------------+---------+-----------+

1001	                     Figure 2: FREE* State Transitions
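
   Figure 2 reduces to a small function.  The sketch below uses
   hypothetical lowercase string labels for the algorithm and the
   resource owner.

```python
def resolve_free_star(algorithm: str, owner: str) -> str:
    """Map the transitional FREE* state to FREE or FREE_BACKUP (Figure 2)."""
    if algorithm not in ("proportional", "independent"):
        raise ValueError(f"unknown algorithm: {algorithm}")
    if owner not in ("primary", "secondary"):
        raise ValueError(f"unknown owner: {owner}")
    # Only resources of the secondary under independent allocation
    # return to the secondary pool; everything else goes to the primary.
    if algorithm == "independent" and owner == "secondary":
        return "FREE_BACKUP"
    return "FREE"
```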

1003	   In the case of servers operating in active-passive mode, while a
1004	   majority of the resources are owned by the primary server, the
1005	   secondary server will need a portion of the resources to serve new
1006	   clients while operating in COMMUNICATIONS-INTERRUPTED state and also
1007	   in PARTNER-DOWN state before it can take over the entire address pool
1008	   (after the expiry of the MCLT).

1010	   The secondary server cannot simply take over the entire resource pool
1011	   immediately, since it could also be that both servers are able to
1012	   communicate with DHCP clients, but unable to communicate with each
1013	   other.

1015	   The size of the resource pool allocated to the secondary is specified
1016	   as a percentage of the currently available resources.  Thus, as the
1017	   number of available resources changes on the primary server, the
1018	   number of resources available to the secondary server MUST also
1019	   change, although the frequency of the changes made to the secondary
1020	   server's pool of address resources SHOULD be low enough to not use
1021	   significant processing power or network bandwidth.

1023	   The required size of this private pool allocated to the secondary
1024	   server is based only on the arrival rate of new DHCP clients and the
1025	   length of expected downtime of the primary server, and is not
1026	   directly influenced by the total number of DHCP clients supported by
1027	   the server pair.
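
   As a rough illustration of this sizing rule, the two inputs can be
   multiplied.  The simple product, the hourly units, and the function
   name are assumptions of this sketch; the document only states which
   factors the size depends on.

```python
import math


def secondary_pool_size(new_clients_per_hour: float,
                        expected_downtime_hours: float) -> int:
    """Estimate resources the secondary needs to cover a primary outage."""
    return math.ceil(new_clients_per_hour * expected_downtime_hours)
```

   With 20 new clients arriving per hour and an expected outage of 6
   hours, the estimate is 120 resources, regardless of how many
   clients the pair serves in total.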

1029	8.  Failover Mechanisms

1031	   This section lays out an overview of the communication between
1032	   partners and other mechanisms required for failover operation.  As
1033	   this is a design document, not a protocol specification, high level
1034	   ideas are presented without implementation specific details (e.g. on-
1035	   wire protocol formats).

1037	8.1.  Time Skew

1039	   Partners exchange information about known lease states.  To reliably
1040	   compare a known lease state with an update received from a partner,
1041	   servers must be able to reliably compare the times stored in the
1042	   known lease state with the times received in the update.  Although a
1043	   simple approach would be to require both partners to use synchronized
1044	   time, e.g. by using NTP, such a service may not always be available
1045	   in some scenarios that failover expects to cover.  Therefore a
1046	   mechanism to measure and track relative time differences between
1047	   servers is necessary.  To do so, each message MUST contain
1048	   information about the time of the transmission in the time context of
1049	   the transmitter.  The transmitting server MUST set this as close to
1050	   the actual transmission as possible.  Transmission here is when data
1051	   is added to the send queue of the socket (or the equivalent), as the
1052	   application may not know about the time of the actual transmission on
1053	   the "wire".  The receiving partner MUST store its own timestamp of
1054	   reception as close to the actual reception as possible.  The received
1055	   timestamp information is then compared with the local timestamp.

1057	   To account for packet delay variation (jitter), the measured
1058	   difference is not used directly; rather, a moving average of the
1059	   time differences of the last TIME_SKEW_PKTS_AVG packets is
1060	   calculated.  This averaged value is referred to as the time skew.
1061	   Note that the time skew algorithm allows cooperation between servers
1062	   with completely desynchronized clocks as well as those whose
1063	   desynchronization itself is not constant.
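
   A moving average over a fixed window can be sketched as follows.
   The value given to TIME_SKEW_PKTS_AVG, the sign convention (local
   receive time minus the partner's transmit time), and the class name
   are assumptions of this sketch; this document does not fix them.

```python
from collections import deque

TIME_SKEW_PKTS_AVG = 10  # assumed window size; not specified in this document


class TimeSkewTracker:
    """Tracks the average clock difference over the most recent messages."""

    def __init__(self, window: int = TIME_SKEW_PKTS_AVG) -> None:
        # A bounded deque discards the oldest sample automatically.
        self._diffs = deque(maxlen=window)

    def record(self, sent_ts: float, received_ts: float) -> float:
        """Add one message's time difference and return the updated skew."""
        self._diffs.append(received_ts - sent_ts)
        return self.skew

    @property
    def skew(self) -> float:
        if not self._diffs:
            return 0.0
        return sum(self._diffs) / len(self._diffs)
```

   Because only the most recent samples are retained, the tracked skew
   follows a clock offset that drifts over time instead of averaging
   over the whole history of the connection.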

1065	8.2.  Lazy updates

1067	   Lazy update refers to the requirement placed on a server implementing
1068	   a failover protocol to update its failover partner whenever the
1069	   binding database changes.  A failover protocol which didn't support
1070	   lazy update would require the failover partner update to complete
1071	   before a DHCPv6 server could respond to a DHCPv6 client request.
1072	   Such an approach is often referred to as 'lockstep' and is the
1073	   opposite of lazy updates.  The lazy update mechanism allows a server
1074	   to allocate a new lease or extend an existing one and then update
1075	   its failover partner as time permits.

1077	   Although the lazy update mechanism does not introduce additional
1078	   delays in server response times, it introduces other difficulties.
1079	   The key problem with lazy update is that when a server fails after
1080	   updating a client with a particular lease time and before updating
1081	   its partner, the partner will believe that a lease has expired even
1082	   though the client still retains a valid lease on that address or
1083	   prefix.  It is also possible that the partner will have no record at
1084	   all of the lease of the resource to the client.

1086	8.3.  MCLT concept

1088	   In order to handle the problem introduced by lazy updates (see
1089	   Section 8.2), a period of time known as the "Maximum Client Lead
1090	   Time" (MCLT) is defined and must be known to both the primary and
1091	   secondary servers.  Proper use of this time interval places an upper
1092	   bound on the difference allowed between the lease time provided to a
1093	   DHCPv6 client by a server and the lease time known by that server's
1094	   failover partner.

1096	   The MCLT is typically much less than the lease time that a server has
1097	   been configured to offer a client, and so some strategy must exist to
1098	   allow a server to offer the configured lease time to a client.
1099	   During a lazy update the updating server typically updates its
1100	   partner with a potential expiration time which is longer than the
1101	   lease time previously given to the client and which is longer than
1102	   the lease time that the server has been configured to give a client.
1103	   This allows that server to give a longer lease time to the client the
1104	   next time the client renews its lease, since the time that it will
1105	   give to the client will not exceed the MCLT beyond the potential
1106	   expiration time acknowledged by its partner.

1108	   The fundamental relationship on which much of the correctness of this
1109	   protocol depends is that the lease expiration time known to a DHCPv6
1110	   client MUST NOT be greater by more than the MCLT beyond the potential
1111	   expiration time known to that server's failover partner.

1113	   The remainder of this section makes the above fundamental
1114	   relationship more explicit.

1116	   This protocol requires a DHCPv6 server to deal with several different
1117	   lease intervals and places specific restrictions on their
1118	   relationships.  The purpose of these restrictions is to allow the
1119	   other server in the pair to be able to make certain assumptions in
1120	   the absence of an ability to communicate between servers.

1122	   The different times are:

1124	   desired valid lifetime:

1126	      The desired valid lifetime is the lease interval that a DHCPv6
1127	      server would like to give to a DHCPv6 client in the absence of any
1128	      restrictions imposed by the failover protocol.  Its determination
1129	      is outside of the scope of this protocol.  Typically this is the
1130	      result of external configuration of a DHCPv6 server.

1132	   actual valid lifetime:
1133	      The actual valid lifetime is the lease interval that a DHCPv6
1134	      server gives out to a DHCPv6 client.  It may be shorter than the
1135	      desired valid lifetime (as explained below).

1137	   potential valid lifetime:
1138	      The potential valid lifetime is the potential lease expiration
1139	      interval the local server sends to its partner in a BNDUPD
1140	      message.

1142	   acknowledged potential valid lifetime:
1143	      The acknowledged potential valid lifetime is the potential lease
1144	      interval the partner server has most recently acknowledged in a
1145	      BNDACK message.

1147	8.3.1.  MCLT example

1149	   The following example demonstrates the MCLT concept in practice.  The
1150	   values used are arbitrarily chosen and are not a recommendation for
1151	   actual values.  The MCLT in this case is 1 hour.  The desired valid
1152	   lifetime is 3 days, and its renewal time is half the valid lifetime.

1154	   When a server makes an offer for a new lease on an IP address to a
1155	   DHCPv6 client, it determines the desired valid lifetime (in this
1156	   case, 3 days).  It then examines the acknowledged potential valid
1157	   lifetime (which in this case is zero) and determines the remainder of
1158	   the time left to run, which is also zero.  It adds the MCLT to this
1159	   value.  Since the actual valid lifetime cannot be allowed to exceed
1160	   the remainder of the current acknowledged potential valid lifetime
1161	   plus the MCLT, the offer made to the client is for the remainder of
1162	   the current acknowledged potential valid lifetime (i.e. zero) plus
1163	   the MCLT.  Thus, the actual valid lifetime is 1 hour (the MCLT).

1165	   Once the server has sent the REPLY to the DHCPv6 client, it will
1166	   update its failover partner with the lease information.  However, the
1167	   desired potential valid lifetime will be composed of one half of the
1168	   current actual valid lifetime added to the desired valid lifetime.
1169	   Thus, the failover partner is updated with a BNDUPD with a potential
1170	   valid lifetime of 1/2 hour + 3 days.

1172	   When the primary server receives a BNDACK to its update of the
1173	   secondary server's (partner's) potential valid lifetime, it records
1174	   that as the acknowledged potential valid lifetime.  A server MUST NOT
1175	   send a BNDACK in response to a BNDUPD message until it is sure that
1176	   the information in the BNDUPD message has been updated in its lease
1177	   database.  See Section 8.9.  Thus, the primary server in this case
1178	   can be sure that the secondary server has recorded the potential
1179	   lease interval in its stable storage when the primary server receives
1180	   a BNDACK message from the secondary server.

1182	   When the DHCPv6 client attempts to renew at T1 (approximately one
1183	   half an hour from the start of the lease), the primary server again
1184	   determines the desired valid lifetime, which is still 3 days.  It
1185	   then compares this with the original acknowledged potential valid
1186	   lifetime (1/2 hour + 3 days) and adjusts for the time passed since
1187	   the secondary was last updated (1/2 hour).  Thus the time remaining
1188	   of the acknowledged potential valid interval is 3 days.  Adding the
1189	   MCLT to this yields 3 days plus 1 hour, which is more than the
1190	   desired valid lifetime of 3 days.  So the client is renewed for the
1191	   desired valid lifetime -- 3 days.

1193	   When the primary DHCPv6 server updates the secondary DHCPv6 server
1194	   after the DHCPv6 client's renewal REPLY is complete, it will
1195	   calculate the desired potential valid lifetime as the T1 fraction of
1196	   the actual client valid lifetime (1/2 of 3 days this time = 1.5
1197	   days).  To this it will add the desired client valid lifetime of 3
1198	   days, yielding a total desired potential valid lifetime of 4.5 days.
1199	   In this way, the primary attempts to have the secondary always "lead"
1200	   the client in its understanding of the client's valid lifetime so as
1201	   to be able to always offer the client the desired client valid
1202	   lifetime.

1204	   Once the initial actual client valid lifetime of the MCLT is past,
1205	   the protocol operates effectively like the DHCPv6 protocol does today
1206	   in its behavior concerning valid lifetimes.  However, the guarantee
1207	   that the actual client valid lifetime will never exceed the remaining
1208	   acknowledged partner server potential valid lifetime by more than the
1209	   MCLT allows full recovery from a variety of failures.
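
   The arithmetic of the example above can be reproduced with a short
   sketch.  This is an illustration only, not part of the protocol: the
   function and variable names are hypothetical, and the T1 fraction of
   one half is the value assumed by the example.

```python
from datetime import timedelta


def actual_valid_lifetime(desired, acked_potential_remaining, mclt):
    """Lifetime the server may give the client right now.

    It never exceeds the remaining acknowledged potential valid
    lifetime plus the MCLT.
    """
    remaining = max(acked_potential_remaining, timedelta(0))
    return min(desired, remaining + mclt)


def potential_valid_lifetime(actual, desired, t1_fraction=0.5):
    """Potential valid lifetime to send to the partner in a BNDUPD."""
    return actual * t1_fraction + desired


MCLT = timedelta(hours=1)
DESIRED = timedelta(days=3)

# Initial allocation: the partner has acknowledged nothing yet.
first_actual = actual_valid_lifetime(DESIRED, timedelta(0), MCLT)
first_potential = potential_valid_lifetime(first_actual, DESIRED)

# Renewal at T1, half an hour into the one-hour lease.
elapsed = first_actual * 0.5
renew_actual = actual_valid_lifetime(DESIRED, first_potential - elapsed, MCLT)
renew_potential = potential_valid_lifetime(renew_actual, DESIRED)

print(first_actual, first_potential, renew_actual, renew_potential)
```

   Under these assumptions the first lease is limited to the MCLT, while
   the renewal receives the full desired valid lifetime.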

1211	8.4.  Unreachability detection

1213	   Each partner MUST maintain a FO_SEND timer for each failover
1214	   connection.  The FO_SEND timer is reset every time any message is
1215	   transmitted.  If the timer reaches the FO_SEND_MAX value, a CONTACT
1216	   message is transmitted and the timer is reset.  The CONTACT message
1217	   may be transmitted at any time.  An implementation MAY use additional
1218	   mechanisms to detect partner unreachability.

1220	   Implementers are advised to keep in mind that the timer based CONTACT
1221	   message mechanism is not perfect and may not detect some failures.

1223	   In particular, if the partner is using one interface to reach clients
1224	   ("downlink") and another to reach its partner ("uplink"), it is
1225	   possible that communication with the clients will break, yet the
1226	   mechanism will still claim full reachability.  For that reason it is
1227	   beneficial to share the same interface for client traffic and
1228	   communication with the failover partner.  That approach may have
1229	   drawbacks in some network topologies.
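
   The FO_SEND timer behavior described in this section could be
   sketched as follows.  The class name, the callback, and the use of
   plain numeric timestamps are assumptions made for illustration; a
   real implementation would hook this into its event loop.

```python
class FoSendTimer:
    """Tracks the time since the last message sent on one connection."""

    def __init__(self, fo_send_max, send_contact):
        self.fo_send_max = fo_send_max    # FO_SEND_MAX, in seconds
        self.send_contact = send_contact  # callback that sends CONTACT
        self.last_sent = 0.0

    def message_sent(self, now):
        # Any transmitted message resets the timer.
        self.last_sent = now

    def tick(self, now):
        # Send CONTACT and reset once FO_SEND_MAX has elapsed.
        if now - self.last_sent >= self.fo_send_max:
            self.send_contact()
            self.last_sent = now
            return True
        return False
```

   Calling tick() periodically keeps the connection from staying silent
   for longer than FO_SEND_MAX.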

1231	8.5.  Re-allocating Leases

1233	   When in PARTNER-DOWN state there is a waiting period after which a
1234	   resource can be re-allocated to another client.  For resources which
1235	   are available when the server enters PARTNER-DOWN state, the period
1236	   is the MCLT from the entry into PARTNER-DOWN state.  For resources
1237	   which are not available when the server enters PARTNER-DOWN state,
1238	   the period is the MCLT after the later of the following times: the
1239	   potential valid lifetime, the most recently transmitted potential
1240	   valid lifetime, the most recently received acknowledged potential
1241	   valid lifetime, and the most recently transmitted acknowledged
1242	   potential valid lifetime.  If this time would be earlier than the
1243	   current time plus the MCLT, then the time the server entered PARTNER-
1244	   DOWN state plus the maximum-client-lead-time is used.

1246	   In any other state, a server cannot reallocate a resource from one
1247	   client to another without first notifying its partner (through a
1248	   BNDUPD message) and receiving acknowledgement (through a BNDACK
1249	   message) that its partner is aware that that first client is not
1250	   using the resource.

1252	   This could be modeled in the following way.  Though this specific
1253	   implementation is in no way required, it may serve to better
1254	   illustrate the concept.

1256	   An "available" resource on a server may be allocated to any client.
1257	   A resource which was leased to a client and which expired or was
1258	   released by that client would take on a new state, EXPIRED or
1259	   RELEASED respectively.  The partner server would then be notified
1260	   that this resource was EXPIRED or RELEASED through a BNDUPD.  When
1261	   the sending server received the BNDACK for that resource showing it
1262	   was FREE, it would move the resource from EXPIRED or RELEASED to
1263	   FREE, and it would be available for allocation by the primary server
1264	   to any clients.

1266	   A server MAY reallocate a resource in the EXPIRED or RELEASED state
1267	   to the same client with no restrictions provided it has not sent a
1268	   BNDUPD message to its partner.  This situation would exist if the
1269	   lease expired or was released after the transition into PARTNER-DOWN
1270	   state, for instance.
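
   The model described above might be sketched as follows.  This is
   only one possible reading: the class and method names are
   hypothetical, "available" is represented by the FREE state, and the
   sketch covers states other than PARTNER-DOWN.

```python
class Resource:
    """One address or prefix as seen by the local server."""

    def __init__(self):
        self.state = "FREE"
        self.client = None
        self.update_sent = False  # BNDUPD for the current state sent?

    def can_allocate_to(self, client):
        if self.state == "FREE":
            return True
        if self.state in ("EXPIRED", "RELEASED"):
            # Same client only, and only before the partner was told.
            return client == self.client and not self.update_sent
        return False

    def lease(self, client):
        if not self.can_allocate_to(client):
            raise ValueError("resource not available to this client")
        self.state, self.client, self.update_sent = "ACTIVE", client, False

    def end_lease(self, released):
        self.state = "RELEASED" if released else "EXPIRED"

    def bndupd_sent(self):
        self.update_sent = True

    def bndack_received(self):
        # The partner now knows the first client no longer uses it.
        if self.state in ("EXPIRED", "RELEASED") and self.update_sent:
            self.state, self.client, self.update_sent = "FREE", None, False
```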

1272	8.6.  Sending Binding Update

1274	   This and the following section are written as though every BNDUPD
1275	   message contains only a single binding update transaction in order to
1276	   reduce the complexity of the discussion.  Servers MAY generate
1277	   messages with multiple binding update transactions in them, and their
1278	   partner servers MAY process these messages.  Before multiple binding
1279	   update transactions are to be sent and processed over a failover
1280	   connection, their use MUST be negotiated during the CONNECT and
1281	   CONNECTACK connection establishment processing.

1283	   Each server updates its failover partner about recent changes in
1284	   lease states.  Each update MUST include at least the following
1285	   information:

1287	   1.   resource type - non-temporary address or a prefix.  Resource
1288	        type can be indicated by the container that conveys the actual
1289	        resource (e.g. an IA_NA option indicates non-temporary IPv6
1290	        address);

1292	   2.   resource information - the actual address or prefix.  That is
1293	        conveyed using the appropriate option, e.g. an IAADDR for an
1294	        address or an IAPREFIX for a prefix;

1296	   3.   valid life time sent to client*;

1298	   4.   IAID - Identity Association used by the client, while obtaining
1299	        a given lease.  (Note1: one client may use many IAIDs
1300	        simultaneously.  Note2: IAID for IA, TA and PD are orthogonal
1301	        number spaces.)*;

1303	   5.   Next Expected Client Transmission (renewal time) - time interval
1304	        since Client Last Transmission Time, when a response from a
1305	        client is expected*;

1307	   6.   potential valid life time - a lifetime that the server is
1308	        willing to set if there were no MCLT/failover restrictions
1309	        imposed*;

1311	   7.   preferred life time sent to client - the actual value sent back
1312	        to the client*;

1314	   8.   CLTT - Client Last Transaction Time, a timestamp of the last
1315	        received transmission from a client*;

1317	   9.   Client DUID*.

1319	   10.  Resource state.

1321	   11.  start time of state (especially for non-client updates).

1323	   Items marked with an asterisk MUST appear only if the lease is/was
1324	   associated with a client.  Otherwise they MUST NOT appear.
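
   As a rough illustration of the list above, a binding update could be
   held in a record whose client-related fields are optional.  The field
   names and the check function are hypothetical and do not reflect any
   wire format.

```python
from dataclasses import dataclass
from typing import List, Optional

# Items marked with an asterisk in the list above.
CLIENT_ONLY_FIELDS = (
    "valid_lifetime", "iaid", "next_expected_client_tx",
    "potential_valid_lifetime", "preferred_lifetime", "cltt",
    "client_duid",
)


@dataclass
class BindingUpdate:
    resource_type: str            # e.g. "IA_NA" or "IA_PD"
    resource: str                 # address or prefix
    state: str                    # resource state
    start_time_of_state: int
    valid_lifetime: Optional[int] = None
    iaid: Optional[int] = None
    next_expected_client_tx: Optional[int] = None
    potential_valid_lifetime: Optional[int] = None
    preferred_lifetime: Optional[int] = None
    cltt: Optional[int] = None
    client_duid: Optional[bytes] = None


def check_client_fields(update: BindingUpdate, has_client: bool) -> List[str]:
    """Return a list of problems with the asterisk-marked items."""
    problems = []
    for name in CLIENT_ONLY_FIELDS:
        present = getattr(update, name) is not None
        if has_client and not present:
            problems.append("missing " + name)
        elif not has_client and present:
            problems.append("unexpected " + name)
    return problems
```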

1326	   The BNDUPD message MAY contain additional information related to the
1327	   updated lease.  The additional information MAY include, but is not
1328	   limited to:

1330	   1.  assigned FQDN name, defined in [RFC4704];

1332	   2.  Options Requested by the client, i.e. content of the ORO;

1334	   3.  Relay Data option from DHCPv6 Leasequery, see [RFC5007]
1335	       Section 4.1.2.4;

1337	   4.  Any other options the updating partner deems useful.

1339	   The receiving partner MAY store any additional information received,
1340	   but it MAY choose to ignore it as well.  Some information may be
1341	   useful, so it is a good idea to keep or update it.  One reason is
1342	   FQDN information.  A server SHOULD be prepared to clean up DNS
1343	   information once the lease expires or is released.  See Section 11
1344	   for a detailed discussion about Dynamic DNS.  Another reason the
1345	   partner may be interested in keeping additional data is a better
1346	   support for leasequery [RFC5007] or bulk leasequery [RFC5460], which
1347	   features queries based on Relay-ID, by link address and by Remote-ID.

1349	8.7.  Receiving Binding Update

1351	   When a server receives a BNDUPD message, it needs to decide how to
1352	   process the binding update transaction it contains and whether that
1353	   transaction represents a conflict of any sort.  The conflict
1354	   resolution process MUST be used on the receipt of every BNDUPD
1355	   message, not just those that are received while in POTENTIAL-CONFLICT
1356	   state, in order to increase the robustness of the protocol.

1358	   There are four sorts of conflicts:

1360	   1.  Two clients, one resource - This is the duplicate resource
1361	       allocation conflict.  Two different clients are each allocated
1362	       the same resource.  See Section 8.8.

1364	   2.  Two resources, one client conflict - This conflict exists when a
1365	       client on one server is associated with one resource, and on
1366	       the other server with a different resource in the same or related
1367	       prefix.  This does not refer to the case where a single client
1368	       has resources in multiple different prefixes or administrative
1369	       domains (i.e. a mobile client that changed its location), but
1370	       rather the case where on the same prefix the client has a lease
1371	       on one IP address in one server and on a different IP address on
1372	       the other server.

1374	       This conflict may or may not be a problem for a given DHCP server
1375	       implementation and policy.  If implementations and policies
1376	       allow, both resources can be assigned to a given client.  In the
1377	       event that a DHCP server requires that a DHCP client have only
1378	       one outstanding lease of a given type, the conflict MUST be
1379	       resolved by accepting the lease which has the latest CLTT.

1381	       It should be further clarified that the DHCPv6 protocol makes
1382	       assignments based on a (client DUID, resource type, IAID)
1383	       triplet.  The possibility of using different IAIDs was omitted in
1384	       this paragraph for clarity.  If one client is assigned multiple
1385	       resources of the same type, but with different IAIDs, there is no
1386	       conflict.  Also, IAID values for different resource types are
1387	       orthogonal, i.e. an IA_NA with IAID=1 is different than an IA_PD
1388	       with IAID=1 and there is no conflict.

1390	   3.  binding-status conflict - This is a normal conflict, where one
1391	       server is updating the other with newer information.  See
1392	       Section 8.8 for details of how to resolve these conflicts.

1394	   4.  configuration conflict -- This kind of conflict stems from a
1395	       differing configuration on one server than on the other server.
1396	       It may be transient (last until both servers can process a new
1397	       configuration) or it may be chronic.  It cannot be resolved by
1398	       communications over the failover connection, but must be resolved
1399	       (if it is not transient) by administrator action to resolve the
1400	       conflicts.

1402	8.8.  Conflict Resolution

1404	   The server receiving a lease update from its partner must evaluate
1405	   the received lease information to see if it is consistent with
1406	   already known state and decide which information - the previously
1407	   known or that just received - is "better".  The server should take
1408	   into consideration the following aspects: whether the lease is
1409	   already assigned to a specific client, which server had contact with
1410	   the client most recently, the start time of the lease, etc.

1412	   When analyzing a BNDUPD message from a partner server, if there is
1413	   insufficient information in the BNDUPD to process it, then reject the
1414	   BNDUPD with reject-reason "Missing binding information".

1416	   If the resource in the BNDUPD is not a resource associated with the
1417	   failover endpoint which received the BNDUPD message, then reject it
1418	   with reject-reason "Illegal IP address or prefix (not part of any
1419	   address or prefix pool)".

1421	   Every BNDUPD message SHOULD contain a client-last-transaction-time
1422	   option, which MUST, if it appears, be the time that the server last
1423	   interacted with the DHCP client.  It MUST NOT be, for instance, the
1424	   time that the lease on an IP address expired.  If there has been no
1425	   interaction with the DHCP client in question (or there is no DHCP
1426	   client presently associated with this resource), then there will be
1427	   no client-last-transaction-time option in the BNDUPD message.

1429	   The table in Figure 3 presents the conflict resolution outcome.  To
1430	   "accept" a BNDUPD means to update the server's binding database with
1431	   the information contained in the BNDUPD and, once the update is
1432	   complete, send a BNDACK message corresponding to the BNDUPD message.
1433	   To "reject" a BNDUPD means to leave the server's binding database
1434	   unchanged and to respond to the BNDUPD with a BNDACK with a reject-
1435	   reason option included.

1437	   When interpreting the information in the following table (Figure 3),
1438	   for those rules that are listed with "time" -- if a BNDUPD doesn't
1439	   have a client-last-transaction-time value, then it MUST NOT be
1440	   considered later than the client-last-transaction-time in the
1441	   receiving server's binding.  If the BNDUPD contains a client-last-
1442	   transaction-time value and the receiving server's binding does not,
1443	   then the client-last-transaction-time value in the BNDUPD MUST be
1444	   considered later than the server's.

1446	                             binding-status in received BNDUPD.
1447	   binding-status
1448	   in receiving                                      FREE        RESET
1449	   server           ACTIVE   EXPIRED   RELEASED   FREE_BACKUP  ABANDONED

1451	   ACTIVE           accept(5) time(2)   time(1)    time(2)      accept
1452	   EXPIRED          time(1)   accept    accept     accept       accept
1453	   RELEASED         time(1)   time(1)   accept     accept       accept
1454	   FREE/FREE_BACKUP accept    accept    accept     accept       accept
1455	   RESET            time(3)   accept    accept     accept       accept
1456	   ABANDONED        reject(4) reject(4) reject(4)  reject(4)    accept

1458	                       Figure 3: Conflict Resolution

1460	   time(1): If the client-last-transaction-time in the BNDUPD is later
1461	   than the client-last-transaction-time in the receiving server's
1462	   binding, accept it, else reject it.

1464	   time(2): If the current time is later than the receiving server's
1465	   lease-expiration-time, accept it, else reject it.

1467	   time(3): If the client-last-transaction-time in the BNDUPD is later
1468	   than the start-time-of-state in the receiving server's binding,
1469	   accept it, else reject it.

1471	   (1,2,3): If rejecting, use reject reason "Outdated binding
1472	   information".

1474	   (4): Use reject reason "Less critical binding information".

1476	   (5): If the clients in a BNDUPD message and in a receiving server's
1477	   binding differ, then if the receiving server is a secondary accept
1478	   it, else reject it with a reject reason of "Fatal conflict exists:
1479	   address in use by other client".
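
   The table in Figure 3 and its notes can be expressed as a lookup.
   The following is a sketch under stated assumptions: the Binding
   record, its field names, and the use of integer timestamps are
   hypothetical, and the receiving server's expiration and
   start-time-of-state are assumed to always be present.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

OUTDATED = "Outdated binding information"
LESS_CRITICAL = "Less critical binding information"
FATAL = "Fatal conflict exists: address in use by other client"

# Rows: status in receiving server.  Columns: status in received BNDUPD
# (ACTIVE, EXPIRED, RELEASED, FREE/FREE_BACKUP, RESET/ABANDONED).
RULES = {
    "ACTIVE":    ["accept5", "time2", "time1", "time2", "accept"],
    "EXPIRED":   ["time1", "accept", "accept", "accept", "accept"],
    "RELEASED":  ["time1", "time1", "accept", "accept", "accept"],
    "FREE":      ["accept"] * 5,
    "RESET":     ["time3", "accept", "accept", "accept", "accept"],
    "ABANDONED": ["reject4"] * 4 + ["accept"],
}
COLUMN = {"ACTIVE": 0, "EXPIRED": 1, "RELEASED": 2, "FREE": 3,
          "FREE_BACKUP": 3, "RESET": 4, "ABANDONED": 4}


@dataclass
class Binding:
    status: str
    client: Optional[str] = None
    cltt: Optional[int] = None   # client-last-transaction-time
    expiration: int = 0          # lease-expiration-time
    start_time: int = 0          # start-time-of-state


def later(update_cltt: Optional[int], local_time: Optional[int]) -> bool:
    """A missing CLTT in the BNDUPD is never considered later."""
    if update_cltt is None:
        return False
    if local_time is None:
        return True
    return update_cltt > local_time


def resolve(local: Binding, update: Binding, now: int,
            is_secondary: bool) -> Tuple[bool, Optional[str]]:
    """Return (accepted, reject_reason) for a received BNDUPD."""
    row = "FREE" if local.status == "FREE_BACKUP" else local.status
    rule = RULES[row][COLUMN[update.status]]
    if rule == "accept":
        return True, None
    if rule == "reject4":
        return False, LESS_CRITICAL
    if rule == "accept5":
        if update.client != local.client and not is_secondary:
            return False, FATAL
        return True, None
    if rule == "time1":
        ok = later(update.cltt, local.cltt)
    elif rule == "time2":
        ok = now > local.expiration
    else:  # time3
        ok = later(update.cltt, local.start_time)
    return (True, None) if ok else (False, OUTDATED)
```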

1481	   The lease update may be accepted or rejected.  Rejection SHOULD NOT
1482	   change the flag in a lease that says that it should be transmitted to
1483	   the failover partner.  If this flag is set, then it should be
1484	   transmitted, but if it is not already set, the rejection of a lease
1485	   state update SHOULD NOT trigger an automatic update of the failover
1486	   partner sending the rejected update.  The potential for update storms
1487	   is too great, and in the unusual case where the servers simply can't
1488	   agree, that disagreement is better than an update storm.

1490	8.9.  Acknowledging Reception

1492	   Upon acceptance of a binding update, the server MUST notify its
1493	   partner that it updated its database.  A server MUST NOT send the
1494	   BNDACK before its database is updated.  A BNDACK MUST contain at
1495	   least the minimum set of information required to unambiguously
1496	   identify the BNDUPD that triggered the BNDACK.

1498	9.  Endpoint States

1500	9.1.  State Machine Operation

1502	   Each server (or, more accurately, failover endpoint) can take on a
1503	   variety of failover states.  These states play a crucial role in
1504	   determining the actions that a server will perform when processing a
1505	   request from a DHCPv6 client as well as dealing with changing
1506	   external conditions (e.g., loss of connection to a failover partner).

1508	   The failover state in which a server is running controls the
1509	   following behaviors:

1511	   o  Responsiveness -- the server is either responsive to DHCPv6 client
1512	      requests or it is not.

1514	   o  Allocation Pool -- which pool of addresses (or prefixes) can be
1515	      used for advertisement on receipt of a SOLICIT or allocation on
1516	      receipt of a REQUEST message.

1518	   o  MCLT -- ensure that valid lifetimes are not beyond what the
1519	      partner has acked plus the MCLT (or not).

1521	   A server will transition from one failover state to another based on
1522	   the specific values held by the following state variables:

1524	   o  Current failover state.

1526	   o  Communications status (OK or not OK).

1528	   o  Partner's failover state (if known).

1530	   Whenever any of the above state variables changes state, the state
1531	   machine is invoked, which may then trigger a change in the current
1532	   failover state.  Thus, whenever the communications status changes,
1533	   the state machine processing is invoked.  This may or may not result
1534	   in a change in the current failover state.

1536	   Whenever a server transitions to a new failover state, the new state
1537	   MUST be communicated to its failover partner in a STATE message if
1538	   the communications status is OK.  In addition, whenever a server
1539	   makes a transition into a new state, it MUST record the new state,
1540	   its current understanding of its partner's state, and the time at
1541	   which it entered the new state in stable storage.

1543	   The following state transition diagram gives a condensed view of the
1544	   state machine.  If there is a difference between the words describing
1545	   a particular state and the diagram below, the words should be
1546	   considered authoritative.

1548	   In the state transition diagram below, the "+" or "-" in the upper
1549	   right corner of each state is a notation about whether communication
1550	   is ongoing with the other server.

1552	       +---------------+  V  +--------------+
1553	       |    RECOVER -|+|  |  |   STARTUP  - |
1554	       |(unresponsive) |  +->+(unresponsive)|
1555	       +------+--------+     +--------------+
1556	       +-Comm. OK             +-----------------+
1557	       |     Other State:     |  PARTNER DOWN - +<---------------------+
1558	       |    RESOLUTION-INTER. | (responsive)    |                      ^
1559	      All     POTENTIAL-      +----+------------+                      |
1560	     Others   CONFLICT------------ | --------+                         |
1561	       |      CONFLICT-DONE     Comm. OK     |     +--------------+    |
1562	    UPDREQ or                 Other State:   |  +--+ RESOLUTION - |    |
1563	    UPDREQALL                  |       |     |  |  | INTERRUPTED  |    |
1564	    Rcv UPDDONE             RECOVER    All   |  |  | (responsive) |    |
1565	       |  +---------------+    |      Others |  |  +------------+-+    |
1566	       +->+RECOVER-WAIT +-| RECOVER    |     |  |         ^     |      |
1567	          |(unresponsive) |  WAIT or   |     |  Comm.     |    Ext.    |
1568	          +-----------+---+  DONE      |     |  OK     Comm.   Cmd---->+
1569	   Comm.---+     Wait MCLT     |       V     V  V     Failed           |
1570	   Changed |          V    +---+   +---+-----+--+-+       |            |
1571	    |  +---+----------++   |       |  POTENTIAL + +-------+            |
1572	    |  |RECOVER-DONE +-|  Wait     |  CONFLICT    +------+             |
1573	    +->+(unresponsive) |  for      |(unresponsive)|   Primary          |
1574	       +------+--------+  Other  +>+----+--------++   resolve    Comm. |
1575	        Comm. OK          State: |      |        ^    conflict  Changed|
1576	   +---Other State:-+   RECOVER  |   Secondary   |       V       V   | |
1577	   |    |           |     DONE   |    resolve    |  ++----------+---++ |
1578	   | All Others:  POTENT.  |     |   conflict    |  |CONFLICT-DONE-|+| |
1579	   | Wait for    CONFLICT--|-----+      |        |  | (responsive)   | |
1580	   | Other State:          V            V        |  +------+---------+ |
1581	   | NORMAL or RECOVER    ++------------+---+    | Other State: NORMAL |
1582	   |    |       DONE      |     NORMAL    + +<--------------+          |
1583	   |    +--+----------+-->+  (responsive)   +-------External Command-->+
1584	   |       ^          ^   +--------+--------+                          |
1585	   |       |          |            |             |                     |
1586	   |   Wait for   Comm. OK  Comm. Failed         |                     |
1587	   |    Other      Other           |             |             External
1588	   |    State:     State:          |             |             Command
1589	   | RECOVER-DONE  NORMAL     Start Safe      Comm. OK            or
1590	   |       |     COMM. INT.  Period Timer    Other State:        Safe
1591	   |    Comm. OK.     |            V          All Others        Period
1592	   |   Other State:   |  +---------+--------+    |            expiration
1593	   |     RECOVER      +--+ COMMUNICATIONS - +----+                     |
1594	   |       +-------------+   INTERRUPTED    |                          |
1595	   RECOVER               |  (responsive)    +------------------------->+
1596	   RECOVER-WAIT--------->+------------------+

1598	                 Figure 4: Failover Endpoint State Machine

1600	9.2.  State Machine Initialization

1602	   The state machine is characterized by storage (in stable storage) of
1603	   at least the following information:

1605	   o  Current failover state.

1607	   o  Previous failover state.

1609	   o  Start time of current failover state.

1611	   o  Partner's failover state.

1613	   o  Start time of partner's failover state.

1615	   o  Time most recent packet received from partner.

1617	   The state machine is initialized by reading these data items from
1618	   stable storage and restoring their values from the information saved.
1619	   If there is no information in stable storage concerning these items,
1620	   then they should be initialized as follows:

1622	   o  Current failover state: Primary: PARTNER-DOWN, Secondary: RECOVER

1624	   o  Previous failover state: None.

1626	   o  Start time of current failover state: Current time.

1628	   o  Partner's failover state: None until reception of STATE message.

1630	   o  Start time of partner's failover state: None until reception of
1631	      STATE message.

1633	   o  Time most recent packet received from partner: None until packet
1634	      received.
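
   The initialization rules in this section could be sketched as
   follows.  The record layout and the representation of stable storage
   as an optional dictionary are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverState:
    current_state: str
    previous_state: Optional[str]
    state_start_time: float
    partner_state: Optional[str] = None
    partner_state_start_time: Optional[float] = None
    last_packet_time: Optional[float] = None


def initialize(stored: Optional[dict], is_primary: bool,
               now: float) -> FailoverState:
    """Restore from stable storage, or apply the defaults listed above."""
    if stored is not None:
        return FailoverState(**stored)
    return FailoverState(
        current_state="PARTNER-DOWN" if is_primary else "RECOVER",
        previous_state=None,
        state_start_time=now,
    )
```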

1636	9.3.  STARTUP State

1638	   The STARTUP state affords an opportunity for a server to probe its
1639	   partner server, before starting to service DHCP clients.  When in the
1640	   STARTUP state, a server attempts to learn its partner's state and
1641	   determine (using that information if it is available) what state it
1642	   should enter.

1644	   The STARTUP state is not shown with any specific state transitions in
1645	   the state machine diagram (Figure 4) because the processing during
1646	   the STARTUP state can cause the server to transition to any of the
1647	   other states, so that specific state transition arcs would only
1648	   obscure other information.

1650	9.3.1.  Operation in STARTUP State

1652	   The server MUST NOT be responsive to DHCPv6 clients in STARTUP state.

1654	   Whenever a STATE message is sent to the partner while in STARTUP
1655	   state the STARTUP flag MUST be set in the message and the previously
1656	   recorded failover state MUST be placed in the server-state option.

1658	9.3.2.  Transition Out of STARTUP State

1660	   The following algorithm is followed every time the server initializes
1661	   itself, and enters STARTUP state.

1663	   Step 1:

1665	   If there is any record in stable storage of a previous failover state
1666	   for this server, set PREVIOUS-STATE to the last recorded value in
1667	   stable storage, and go to Step 2.

1669	   If there is no record of any previous failover state in stable
1670	   storage for this server, then set the PREVIOUS-STATE to RECOVER and
1671	   set the TIME-OF-FAILURE to 0.  This will allow two servers which
1672	   already have lease information to synchronize themselves prior to
1673	   operating.

1675	   In some cases, an existing server will be commissioned as a failover
1676	   server and brought back into operation where its partner is not yet
1677	   available.  In this case, the newly commissioned failover server will
1678	   not operate until its partner comes online -- but it has operational
1679	   responsibilities as a DHCP server nonetheless.  To properly handle
1680	   this situation, a server SHOULD be configurable in such a way as to
1681	   move directly into PARTNER-DOWN state after the startup period
1682	   expires if it has been unable to contact its partner during the
1683	   startup period.

1685	   Step 2:

1687	   Implementations will differ in the ways that they deal with the state
1688	   machine for failover endpoint states.  In many cases, state
1689	   transitions will occur when communications goes from "OK" to failed,
1690	   or from failed to "OK", and some implementations will implement a
1691	   portion of their state machine processing based on these changes.

1693	   In these cases, during startup, if the previous state is one where
1694	   communications was "OK", then set the previous state to the state
1695	   that is the result of the communications failed state transition when
1696	   in that state (if such transition exists -- some states don't have a
1697	   communications failed state transition, since they allow both
1698	   communications OK and failed).

1700	   Step 3:

1702	   Start the STARTUP state timer.  The time that a server remains in the
1703	   STARTUP state (absent any communications with its partner) is
1704	   implementation dependent but SHOULD be short.  It SHOULD be long
1705	   enough for a TCP connection to be created to a heavily loaded partner
1706	   across a slow network.

1708	   Step 4:

1710	   Attempt to create a TCP connection to the failover partner.

1712	   Step 5:

1714	   Wait for "communications OK".

1716	   When and if communications become "OK", clear the STARTUP flag, and
1717	   set the current state to the PREVIOUS-STATE.

1719	   If the partner is in PARTNER-DOWN state, and if the time at which it
1720	   entered PARTNER-DOWN state (as received in the start-time-of-state
1721	   option in the STATE message) is later than the last recorded time of
1722	   operation of this server, then set CURRENT-STATE to RECOVER.  If the
1723	   time at which it entered PARTNER-DOWN state is earlier than the last
1724	   recorded time of operation of this server, then set CURRENT-STATE to
1725	   POTENTIAL-CONFLICT.

1727	   Then, transition to the current state and take the "communications
1728	   OK" state transition based on the current state of this server and
1729	   the partner.

1731	   Step 6:

1733	   If the startup timer expires, the server SHOULD transition to the
1734	   PREVIOUS-STATE.
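
   The startup steps above can be sketched roughly as follows.  This is
   a minimal illustration, not part of the protocol: the function names
   and the COMMS_FAILED table are hypothetical, the table lists only two
   example "communications failed" transitions taken from later
   sections, and the case where the two timestamps are equal (which the
   text does not cover) simply keeps PREVIOUS-STATE.  The follow-up
   "communications OK" transition of Step 5 is not modeled.

```python
from typing import Optional

# Hypothetical table for Step 2: state reached on "communications failed".
# States without such a transition are simply absent.
COMMS_FAILED = {
    "NORMAL": "COMMUNICATIONS-INTERRUPTED",
    "POTENTIAL-CONFLICT": "RESOLUTION-INTERRUPTED",
}


def previous_state(stored: Optional[str]) -> str:
    """Steps 1 and 2: derive PREVIOUS-STATE from stable storage."""
    if stored is None:
        # No record of any earlier failover state: start from RECOVER.
        return "RECOVER"
    # If the stored state assumed working communications, replace it with
    # the state its "communications failed" transition leads to.
    return COMMS_FAILED.get(stored, stored)


def state_after_startup(prev: str,
                        partner_state: Optional[str],
                        partner_down_since: Optional[int],
                        last_operation: int) -> str:
    """Steps 5 and 6: partner_state is None if the startup timer expired."""
    if partner_state == "PARTNER-DOWN" and partner_down_since is not None:
        if partner_down_since > last_operation:
            return "RECOVER"
        if partner_down_since < last_operation:
            return "POTENTIAL-CONFLICT"
    # Otherwise (including timer expiry) resume in PREVIOUS-STATE.
    return prev
```

   For example, a server whose stable storage says NORMAL starts with
   PREVIOUS-STATE set to COMMUNICATIONS-INTERRUPTED, and resumes in that
   state if the startup timer expires without contact.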

1736	9.4.  PARTNER-DOWN State

1738	   PARTNER-DOWN state is a state either server can enter.  When in this
1739	   state, the server assumes that it is the only server operating and
1740	   serving the client base.  If one server is in PARTNER-DOWN state, the
1741	   other server MUST NOT be operating.

1743	   A server can enter PARTNER-DOWN state either as a result of operator
1744	   intervention (when an operator determines that the server's partner
1745	   is, indeed, down), or as a result of an optional auto-partner-down
1746	   capability where PARTNER-DOWN state is entered automatically after a
1747	   server has been in COMMUNICATIONS-INTERRUPTED state for a pre-
1748	   determined period of time.

1750	9.4.1.  Operation in PARTNER-DOWN State

1752	   The server MUST be responsive in PARTNER-DOWN state, regardless of
1753	   whether it is primary or secondary.

1755	   It will allow renewal of all outstanding leases on resources.  For
1756	   those resources for which the server is using proportional
1757	   allocation, it will allocate resources from its own pool, and after a
1758	   fixed period of time (the MCLT interval) has elapsed from entry into
1759	   PARTNER-DOWN state, it may allocate IP addresses from the set of all
1760	   available pools.  The server SHOULD fully deplete its own pool
1761	   before starting allocations from its downed partner's pool.

1763	   Any resource tagged as available for allocation by the other server
1764	   (at entry to PARTNER-DOWN state) MUST NOT be allocated to a new
1765	   client until the MCLT beyond the entry into PARTNER-DOWN state has
1766	   elapsed.

1768	   A server in PARTNER-DOWN state MUST NOT allocate a resource to a DHCP
1769	   client different from that to which it was allocated at the entrance
1770	   to PARTNER-DOWN state until the MCLT beyond the maximum of the
1771	   following times: client expiration time, most recently transmitted
1772	   potential-expiration-time, most recently received ack of potential-
1773	   expiration-time from the partner, and most recently acked potential-
1774	   expiration-time to the partner.  If this time would be earlier than
1775	   the current time plus the maximum-client-lead-time, then the time the
1776	   server entered PARTNER-DOWN state plus the maximum-client-lead-time
1777	   is used.
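
   As an illustration only, the reallocation rule above might be
   computed as shown below.  The function and parameter names are
   hypothetical, all times are assumed to be seconds on a common clock,
   and the fallback clause is implemented according to one literal
   reading of the text ("this time" taken to mean the maximum plus the
   MCLT).

```python
def earliest_reallocation(client_expiry: int,
                          sent_pet: int,
                          received_ack_pet: int,
                          acked_pet: int,
                          partner_down_entry: int,
                          now: int,
                          mclt: int) -> int:
    """Earliest time a resource may be given to a different client."""
    # MCLT beyond the latest of the four recorded times.
    candidate = max(client_expiry, sent_pet, received_ack_pet, acked_pet) + mclt
    # Fallback clause, read literally: if the candidate is earlier than
    # now + MCLT, use the PARTNER-DOWN entry time plus MCLT instead.
    if candidate < now + mclt:
        return partner_down_entry + mclt
    return candidate
```

   With an MCLT of 100, a PARTNER-DOWN entry time of 500, and a current
   time of 600, a lease whose latest recorded time is 1000 becomes
   reallocatable at 1100, while one whose latest recorded time is 100
   becomes reallocatable at 600.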

1779	   The server is not restricted by the MCLT when offering lease times
1780	   while in PARTNER-DOWN state.

1782	   In the unlikely case that both servers are operating in PARTNER-DOWN
1783	   state, there is a chance that duplicate leases will be assigned.

1785	   This leads to a POTENTIAL-CONFLICT (unresponsive) state when they re-
1786	   establish contact.  The duplicate lease issue can be postponed to a
1787	   large extent by the server granting new leases first from its own
1788	   pool.  Therefore, the server operating in PARTNER-DOWN state MUST use
1789	   its own pool first for new leases before assigning any leases from
1790	   its downed partner's pool.

1792	9.4.2.  Transition Out of PARTNER-DOWN State

1794	   When a server in PARTNER-DOWN state succeeds in establishing a
1795	   connection to its partner, its actions are conditional on the state
1796	   and flags received in the STATE message from the other server as part
1797	   of the process of establishing the connection.

1799	   If the STARTUP bit is set in the server-flags option of a received
1800	   STATE message, a server in PARTNER-DOWN state MUST NOT take any state
1801	   transitions based on reestablishing communications.  Essentially, if
1802	   a server is in PARTNER-DOWN state, it ignores all STATE messages from
1803	   its partner that have the STARTUP bit set in the server-flags option
1804	   of the STATE message.

1806	   If the STARTUP bit is not set in the server-flags option of a STATE
1807	   message received from its partner, then a server in PARTNER-DOWN
1808	   state takes the following actions based on the state of the partner
1809	   as received in a STATE message (either immediately after establishing
1810	   communications or at any time later when a new state is received):

1812	   o  If the partner is in: [ NORMAL, COMMUNICATIONS-INTERRUPTED,
1813	      PARTNER-DOWN, POTENTIAL-CONFLICT, RESOLUTION-INTERRUPTED, or
1814	      CONFLICT-DONE ] state, then transition to POTENTIAL-CONFLICT state

1816	   o  If the partner is in: [ RECOVER, RECOVER-WAIT ] state, then stay
1817	      in PARTNER-DOWN state

1819	   o  If the partner is in: [ RECOVER-DONE ] state, then transition
1820	      into NORMAL state
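
   The rules above amount to a small lookup keyed on the partner's
   reported state.  The sketch below is one possible way to express
   them; the function name and the use of plain strings for state names
   are assumptions made for illustration.

```python
CONFLICT_TRIGGERS = {
    "NORMAL", "COMMUNICATIONS-INTERRUPTED", "PARTNER-DOWN",
    "POTENTIAL-CONFLICT", "RESOLUTION-INTERRUPTED", "CONFLICT-DONE",
}


def partner_down_next(partner_state: str, startup_bit: bool) -> str:
    """Next state for a server in PARTNER-DOWN after receiving STATE."""
    if startup_bit:
        # STATE messages with the STARTUP bit set are ignored.
        return "PARTNER-DOWN"
    if partner_state in CONFLICT_TRIGGERS:
        return "POTENTIAL-CONFLICT"
    if partner_state in {"RECOVER", "RECOVER-WAIT"}:
        return "PARTNER-DOWN"
    if partner_state == "RECOVER-DONE":
        return "NORMAL"
    raise ValueError(f"unexpected partner state: {partner_state}")
```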

1822	9.5.  RECOVER State

1824	   This state indicates that the server has no information in its stable
1825	   storage or that it is re-integrating with a server in PARTNER-DOWN
1826	   state after it has been down.  A server in this state MUST attempt to
1827	   refresh its stable storage from the other server.

1829	9.5.1.  Operation in RECOVER State

1831	   The server MUST NOT be responsive in RECOVER state.

1833	   A server in RECOVER state will attempt to reestablish communications
1834	   with the other server.

1836	9.5.2.  Transition Out of RECOVER State

1838	   If the other server is in POTENTIAL-CONFLICT, RESOLUTION-INTERRUPTED,
1839	   or CONFLICT-DONE state when communications are reestablished, then
1840	   the server in RECOVER state will move to POTENTIAL-CONFLICT state
1841	   itself.

1843	   If the other server is in any other state, then the server in RECOVER
1844	   state will request an update of missing binding information by
1845	   sending an UPDREQ message.  If the server has determined that it has
1846	   lost its stable storage because it has no record of ever having
1847	   talked to its partner, while its partner does have a record of
1848	   communicating with it, it MUST send an UPDREQALL message, otherwise
1849	   it MUST send an UPDREQ message.
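
   The decision described above could be sketched as follows.  The
   function name, the boolean inputs, and the tuple return value are
   hypothetical; how a server learns whether its partner has a record of
   it is outside the scope of this sketch.

```python
def recover_action(partner_state: str,
                   have_record_of_partner: bool,
                   partner_has_record_of_us: bool) -> tuple:
    """What a server in RECOVER does once communications are restored."""
    if partner_state in {"POTENTIAL-CONFLICT", "RESOLUTION-INTERRUPTED",
                         "CONFLICT-DONE"}:
        return ("transition", "POTENTIAL-CONFLICT")
    # Stable storage is considered lost when this server has no record of
    # the partner, yet the partner remembers talking to this server.
    lost_storage = (not have_record_of_partner) and partner_has_record_of_us
    return ("send", "UPDREQALL" if lost_storage else "UPDREQ")
```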

1851	   It will wait for an UPDDONE message, and upon receipt of that message
1852	   it will transition to RECOVER-WAIT state.

1854	   If communications fails during the reception of the results of the
1855	   UPDREQ or UPDREQALL message, the server will remain in RECOVER state,
1856	   and will re-issue the UPDREQ or UPDREQALL when communications are re-
1857	   established.

1859	   If an UPDDONE message isn't received within an implementation
1860	   dependent amount of time, and no BNDUPD messages are being received,
1861	   the connection SHOULD be dropped.

1863	                   A                                        B
1864	                 Server                                  Server

1866	                   |                                        |
1867	                RECOVER                               PARTNER-DOWN
1868	                   |                                        |
1869	                   | >--UPDREQ-------------------->         |
1870	                   |                                        |
1871	                   |        <---------------------BNDUPD--< |
1872	                   | >--BNDACK-------------------->         |
1873	                  ...                                      ...
1874	                   |                                        |
1875	                   |        <---------------------BNDUPD--< |
1876	                   | >--BNDACK-------------------->         |
1877	                   |                                        |
1878	                   |        <--------------------UPDDONE--< |
1879	                   |                                        |
1880	              RECOVER-WAIT                                  |
1881	                   |                                        |
1882	                   | >--STATE-(RECOVER-WAIT)------>         |
1883	                   |                                        |
1884	                   |                                        |
1885	          Wait MCLT from last known                         |
1886	             time of failover operation                     |
1887	                   |                                        |
1888	              RECOVER-DONE                                  |
1889	                   |                                        |
1890	                   | >--STATE-(RECOVER-DONE)------>         |
1891	                   |                                     NORMAL
1892	                   |        <-------------(NORMAL)-STATE--< |
1893	                NORMAL                                      |
1894	                   | >--STATE-(NORMAL)------------>         |
1895	                   |                                        |
1896	                   |                                        |

1898	                 Figure 5: Transition out of RECOVER state

1900	   If at any time while a server is in RECOVER state communications
1901	   fails, the server will stay in RECOVER state.  When communications
1902	   are restored, it will restart the process of transitioning out of
1903	   RECOVER state.

1905	9.6.  RECOVER-WAIT State

1907	   This state indicates that the server has sent an UPDREQ or UPDREQALL
1908	   and has received the UPDDONE message indicating that it has received
1909	   all outstanding binding update information.  In the RECOVER-WAIT
1910	   state the server will wait for the MCLT in order to ensure that any
1911	   processing that this server might have done prior to losing its
1912	   stable storage will not cause future difficulties.

1914	9.6.1.  Operation in RECOVER-WAIT State

1916	   The server MUST NOT be responsive in RECOVER-WAIT state.

1918	9.6.2.  Transition Out of RECOVER-WAIT State

1920	   Upon entry to RECOVER-WAIT state the server MUST start a timer whose
1921	   expiration is set to a time equal to the time the server went down
1922	   (if known) or the time the server started (if the down-time is
1923	   unknown) plus the maximum-client-lead-time.  When this timer expires,
1924	   the server will transition into RECOVER-DONE state.

1926	   This is to allow any clients holding leases allocated by this server
1927	   prior to loss of its client binding information in stable storage to
1928	   contact the other server, or to allow those leases to time out.

1930	   If this is the first time this server has run failover -- as
1931	   determined by the information received from the partner, not
1932	   necessarily only as determined by this server's stable storage (as
1933	   that may have been lost), then the waiting time discussed above may
1934	   be skipped, and the server MAY transition immediately to RECOVER-DONE
1935	   state.

1937	   If the server has never before run failover, then there is no need to
1938	   wait in this state -- but, again, to determine if this server has run
1939	   failover it is vital that the information provided by the partner be
1940	   utilized, since the stable storage of this server may have been lost.
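
   A minimal sketch of the RECOVER-WAIT timer follows, assuming all
   times are seconds on a common clock.  The function name and the
   partner_knows_us flag (standing for "the partner reports having run
   failover with this server before") are hypothetical.

```python
from typing import Optional


def recover_wait_remaining(now: int,
                           start_time: int,
                           mclt: int,
                           down_time: Optional[int] = None,
                           partner_knows_us: bool = True) -> int:
    """Seconds left to wait in RECOVER-WAIT before RECOVER-DONE."""
    if not partner_knows_us:
        # First failover run according to the partner: the wait MAY be
        # skipped entirely.
        return 0
    # Use the time the server went down if known, else its start time.
    base = down_time if down_time is not None else start_time
    return max(0, base + mclt - now)
```

   The decision to skip the wait relies on the partner's information
   rather than on local stable storage, which may have been lost.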

1942	   If communications fails while a server is in RECOVER-WAIT state, it
1943	   has no effect on the operation of this state.  The server SHOULD
1944	   continue to operate its timer, and if the timer expires during the
1945	   period where communications with the other server have failed, then
1946	   the server SHOULD transition to RECOVER-DONE state.  This is rare --
1947	   failover state transitions are not usually made while communications
1948	   are interrupted, but in this case there is no reason to inhibit the
1949	   timer.

1951	9.7.  RECOVER-DONE State

1953	   This state exists to allow an interlocked transition for one server
1954	   from RECOVER state and another server from PARTNER-DOWN or
1955	   COMMUNICATIONS-INTERRUPTED state into NORMAL state.

1957	9.7.1.  Operation in RECOVER-DONE State

1959	   A server in RECOVER-DONE state SHOULD be unresponsive.  It MAY
1960	   respond to RENEW requests, but it MUST only change the state of
1961	   resources that appear in the RENEW request.  It MUST NOT allocate any
1962	   additional resources when in RECOVER-DONE state.

1964	9.7.2.  Transition Out of RECOVER-DONE State

1966	   When a server in RECOVER-DONE state determines that its partner
1967	   server has entered NORMAL or RECOVER-DONE state, then it will
1968	   transition into NORMAL state.

1970	   If communication fails while in RECOVER-DONE state, a server will
1971	   stay in RECOVER-DONE state.

1973	9.8.  NORMAL State

1975	   NORMAL state is the state used by a server when it is communicating
1976	   with the other server, and any required resynchronization has been
1977	   performed.  While some bindings database synchronization is performed
1978	   in NORMAL state, potential conflicts are resolved prior to entry into
1979	   NORMAL state as is binding database data loss.

1981	   When entering NORMAL state, a server will send to the other server
1982	   all currently unacknowledged binding updates as BNDUPD messages.

1984	   When the above process is complete, if the server entering NORMAL
1985	   state is a secondary server, then it will request resources
1986	   (addresses and/or prefixes) for allocation using the POOLREQ message.

1988	9.8.1.  Operation in NORMAL State

1990	   The primary server is responsive in NORMAL state.  The secondary
1991	   server is unresponsive in NORMAL state.

1993	   When in NORMAL state a primary server will operate in the following
1994	   manner:

1996	   Lease time calculations
1997	      As discussed in Section 8.3, the lease interval given to a DHCP
1998	      client can never be more than the MCLT greater than the most
1999	      recently received potential-expiration-time from the failover
2000	      partner or the current time, whichever is later.

2002	      As long as a server adheres to this constraint, the specifics of
2003	      the lease interval that it gives to a DHCP client or the value of
2004	      the potential-expiration-time sent to its failover partner are
2005	      implementation dependent.

2007	   Lazy update of partner server
2008	      After sending a REPLY that includes a lease update to a client,
2009	      the server servicing a DHCP client request attempts to update its
2010	      partner with the new binding information.

2012	   Reallocation of resources between clients
2013	      Whenever a client binding is released or expires, a BNDUPD message
2014	      must be sent to the partner, setting the binding state to RELEASED
2015	      or EXPIRED.  However, until a BNDACK is received for this message,
2016	      the resource cannot be allocated to another client.  It cannot be
2017	      allocated to the same client again if a BNDUPD was sent; otherwise
2018	      it can.  See Section 8.5 for details.
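
   The lease time constraint in the list above can be illustrated with a
   short calculation.  This is a sketch under stated assumptions: the
   function names are hypothetical, times are seconds on a common clock,
   and a missing potential-expiration-time is represented as None.

```python
from typing import Optional


def max_lease_expiry(now: int, mclt: int,
                     received_pet: Optional[int] = None) -> int:
    """Latest expiry allowed: MCLT beyond the later of now and the
    most recently received potential-expiration-time."""
    base = now if received_pet is None else max(now, received_pet)
    return base + mclt


def clamp_lease_interval(desired: int, now: int, mclt: int,
                         received_pet: Optional[int] = None) -> int:
    """Lease interval actually granted for a desired interval."""
    return min(desired, max_lease_expiry(now, mclt, received_pet) - now)
```

   With an MCLT of 3600 seconds and no potential-expiration-time from
   the partner, a desired one-day lease is cut to 3600 seconds; once the
   partner has acknowledged a later time, longer intervals are possible.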

2020	   In NORMAL state, each server receives binding updates from its
2021	   partner server in BNDUPD messages.  It records these in its client
2022	   binding database in stable storage and then sends a corresponding
2023	   BNDACK message to its partner server.

2025	9.8.2.  Transition Out of NORMAL State

2027	   If an external command is received by a server in NORMAL state
2028	   informing it that its partner is down, then transition into PARTNER-
2029	   DOWN state.  Generally, this would be an unusual situation, where
2030	   some external agency knew the partner server was down prior to the
2031	   failover server discovering it on its own.

2033	   If a server in NORMAL state fails to receive acks to messages sent to
2034	   its partner for an implementation dependent period of time, it MAY
2035	   move into COMMUNICATIONS-INTERRUPTED state.  This situation might
2036	   occur if the partner server was capable of maintaining the TCP
2037	   connection between the servers and also capable of sending a CONTACT
2038	   message periodically, but was (for some reason) incapable of
2039	   processing BNDUPD messages.

2041	   If communications is determined not to be "OK" (as defined in
2042	   Section 8.4), then transition into COMMUNICATIONS-INTERRUPTED state.

2044	   If a server in NORMAL state receives any messages from its partner
2045	   where the partner has changed state from that expected by the server
2046	   in NORMAL state, then the server should transition into
2047	   COMMUNICATIONS-INTERRUPTED state and take the appropriate state
2048	   transition from there.  For example, it would be expected for the
2049	   partner to transition from POTENTIAL-CONFLICT into NORMAL state, but
2050	   not for the partner to transition from NORMAL into POTENTIAL-CONFLICT
2051	   state.

2053	   If a server in NORMAL state receives a DISCONNECT message from its
2054	   partner, the server should transition into COMMUNICATIONS-INTERRUPTED
2055	   state.

2057	9.9.  COMMUNICATIONS-INTERRUPTED State

2059	   A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is
2060	   unable to communicate with its partner.  Primary and secondary
2061	   servers cycle automatically (without administrative intervention)
2062	   between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network
2063	   connection between them fails and recovers, or as the partner server
2064	   cycles between operational and non-operational.  No duplicate
2065	   resource allocation can occur while the servers cycle between these
2066	   states.

2068	   When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been
2069	   configured to support an automatic transition out of COMMUNICATIONS-
2070	   INTERRUPTED state and into PARTNER-DOWN state (i.e., an auto-partner-
2071	   down period has been configured), then a timer MUST be started for
2072	   the length of the configured auto-partner-down period.

2074	   A server transitioning into the COMMUNICATIONS-INTERRUPTED state from
2075	   the NORMAL state SHOULD raise some alarm condition to alert
2076	   administrative staff to a potential problem in the DHCP subsystem.

2078	9.9.1.  Operation in COMMUNICATIONS-INTERRUPTED State

2080	   In this state a server MUST respond to all DHCP client requests.
2081	   When allocating new leases, each server allocates from its own pool,
2082	   where the primary MUST allocate only FREE resources, and the
2083	   secondary MUST allocate only FREE_BACKUP resources.  When responding
2084	   to RENEW messages, each server will allow continued renewal of a DHCP
2085	   client's current lease on a resource irrespective of whether that
2086	   lease was given out by the receiving server or not, although the
2087	   renewal period MUST NOT exceed the maximum client lead time (MCLT)
2088	   beyond the latest of: 1) the potential valid lifetime already
2089	   acknowledged by the other server, or 2) now, or 3) the potential
2090	   valid lifetime received from the partner server.
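
   As a small illustration of the allocation rule above, the sketch
   below selects the resources a server may hand out to new clients.
   The function name, the role strings, and the dictionary mapping each
   resource to its state are assumptions made for this example.

```python
def allocatable(resources: dict, role: str) -> list:
    """Resources a server may lease to new clients while in
    COMMUNICATIONS-INTERRUPTED state, given its role."""
    if role not in ("primary", "secondary"):
        raise ValueError(f"unknown role: {role}")
    # The primary allocates only FREE resources, the secondary only
    # FREE_BACKUP resources.
    wanted = "FREE" if role == "primary" else "FREE_BACKUP"
    return sorted(r for r, state in resources.items() if state == wanted)
```

   Because the two sets are disjoint, the servers cannot hand the same
   free resource to two different new clients while they are out of
   contact.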

2092	   However, since the server cannot communicate with its partner in this
2093	   state, the acknowledged potential valid lifetime will not be updated
2094	   in any new bindings.  This is likely to eventually cause the actual
2095	   valid lifetimes to converge to the MCLT (unless this is greater than
2096	   the desired-client-lease-time).

2098	   The server should continue to try to establish a connection with its
2099	   partner.

2101	9.9.2.  Transition Out of COMMUNICATIONS-INTERRUPTED State

2103	   If the auto-partner-down timer expires while a server is in the
2104	   COMMUNICATIONS-INTERRUPTED state, it will transition immediately into
2105	   PARTNER-DOWN state.

2107	   If an external command is received by a server in COMMUNICATIONS-
2108	   INTERRUPTED state informing it that its partner is down, it will
2109	   transition immediately into PARTNER-DOWN state.

2111	   If communications is restored with the other server, then the server
2112	   in COMMUNICATIONS-INTERRUPTED state will transition into another
2113	   state based on the state of the partner:

2115	   o  NORMAL or COMMUNICATIONS-INTERRUPTED: Transition into the NORMAL
2116	      state.

2118	   o  RECOVER: Stay in COMMUNICATIONS-INTERRUPTED state.

2120	   o  RECOVER-DONE: Transition into NORMAL state.

2122	   o  PARTNER-DOWN, POTENTIAL-CONFLICT, CONFLICT-DONE, or RESOLUTION-
2123	      INTERRUPTED: Transition into POTENTIAL-CONFLICT state.
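
   The list above can be written as a lookup table.  The sketch below is
   illustrative only: the names are hypothetical, and partner states not
   listed above yield None because the text does not say what should
   happen for them.

```python
from typing import Optional

# Partner state -> next state for a server in COMMUNICATIONS-INTERRUPTED.
ON_RESTORED = {
    "NORMAL": "NORMAL",
    "COMMUNICATIONS-INTERRUPTED": "NORMAL",
    "RECOVER": "COMMUNICATIONS-INTERRUPTED",
    "RECOVER-DONE": "NORMAL",
    "PARTNER-DOWN": "POTENTIAL-CONFLICT",
    "POTENTIAL-CONFLICT": "POTENTIAL-CONFLICT",
    "CONFLICT-DONE": "POTENTIAL-CONFLICT",
    "RESOLUTION-INTERRUPTED": "POTENTIAL-CONFLICT",
}


def comms_interrupted_next(partner_state: str) -> Optional[str]:
    """Next state once communications are restored, or None if the
    partner state is not covered by the list."""
    return ON_RESTORED.get(partner_state)
```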

2125	   The following figure illustrates the transition from NORMAL to
2126	   COMMUNICATIONS-INTERRUPTED state and then back to NORMAL state again.

2128	      Primary                                Secondary
2129	       Server                                  Server

2131	       NORMAL                                  NORMAL
2132	         | >--CONTACT------------------->         |
2133	         |        <--------------------CONTACT--< |
2134	         |         [TCP connection broken]        |
2135	    COMMUNICATIONS          :              COMMUNICATIONS
2136	      INTERRUPTED           :                INTERRUPTED
2137	         |      [attempt new TCP connection]      |
2138	         |         [connection succeeds]          |
2139	         |                                        |
2140	         | >--CONNECT------------------->         |
2141	         |        <-----------------CONNECTACK--< |
2142	         |                                     NORMAL
2143	         |        <-------------------STATE-----< |
2144	       NORMAL                                     |
2145	         | >--STATE--------------------->         |
2146	         |                                        |
2147	         | >--BNDUPD-------------------->         |
2148	         |        <---------------------BNDACK--< |
2149	         |                                        |
2150	         |        <---------------------BNDUPD--< |
2151	         | >------BNDACK---------------->         |
2152	        ...                                      ...
2153	         |                                        |
2154	         |        <--------------------POOLREQ--< |
2155	         | >--POOLRESP------------------>         |
2156	         |                                        |
2157	         | >--BNDUPD-(#1)--------------->         |
2158	         |        <---------------------BNDACK--< |
2159	         |                                        |
2160	         | >--BNDUPD-(#2)--------------->         |
2161	         |        <---------------------BNDACK--< |
2162	         |                                        |

2164	    Figure 6: Transition from NORMAL to COMMUNICATIONS-INTERRUPTED and
2165	          back (example with 2 addresses allocated to secondary)

2167	9.10.  POTENTIAL-CONFLICT State

2169	   This state indicates that the two servers are attempting to
2170	   reintegrate with each other, but at least one of them was running in
2171	   a state that did not guarantee automatic reintegration would be
2172	   possible.  In POTENTIAL-CONFLICT state the servers may determine that
2173	   the same resource has been offered and accepted by two different
2174	   clients.

2176	   It is a goal of this protocol to minimize the possibility that
2177	   POTENTIAL-CONFLICT state is ever entered.

2179	   When a primary server enters POTENTIAL-CONFLICT state it should
2180	   request that the secondary send it all updates of which it is
2181	   currently unaware by sending an UPDREQ message to the secondary
2182	   server.

2184	   A secondary server entering POTENTIAL-CONFLICT state will wait for
2185	   the primary to send it an UPDREQ message.

2187	9.10.1.  Operation in POTENTIAL-CONFLICT State

2189	   Any server in POTENTIAL-CONFLICT state MUST NOT process any incoming
2190	   DHCP requests.

2192	9.10.2.  Transition Out of POTENTIAL-CONFLICT State

2194	   If communications fails with the partner while in POTENTIAL-CONFLICT
2195	   state, then the server will transition to RESOLUTION-INTERRUPTED
2196	   state.

2198	   Whenever either server receives an UPDDONE message from its partner
2199	   while in POTENTIAL-CONFLICT state, it MUST transition to a new state.
2200	   The primary MUST transition to CONFLICT-DONE state, and the secondary
2201	   MUST transition to NORMAL state.  This will cause the primary server
2202	   to leave POTENTIAL-CONFLICT state prior to the secondary, since the
2203	   primary sends an UPDREQ message and receives an UPDDONE before the
2204	   secondary sends an UPDREQ message and receives its UPDDONE message.

2206	   When a secondary server receives an indication that the primary
2207	   server has made a transition from POTENTIAL-CONFLICT to CONFLICT-DONE
2208	   state, it SHOULD send an UPDREQ message to the primary server.
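
   The transitions described above might be sketched as a single
   event handler.  The event names ("comms-failed", "upddone") and the
   function name are hypothetical labels chosen for this example.

```python
def potential_conflict_next(role: str, event: str) -> str:
    """Next state for a server in POTENTIAL-CONFLICT state."""
    if role not in ("primary", "secondary"):
        raise ValueError(f"unknown role: {role}")
    if event == "comms-failed":
        return "RESOLUTION-INTERRUPTED"
    if event == "upddone":
        # The primary finishes first and waits in CONFLICT-DONE; the
        # secondary goes straight to NORMAL when its own update is done.
        return "CONFLICT-DONE" if role == "primary" else "NORMAL"
    # Any other event leaves the state unchanged.
    return "POTENTIAL-CONFLICT"
```

   The ordering follows from the message flow: the secondary only sends
   its UPDREQ after it learns that the primary has reached
   CONFLICT-DONE.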

2210	       Primary                                Secondary
2211	       Server                                  Server
2212	         |                                        |
2213	   POTENTIAL-CONFLICT                    POTENTIAL-CONFLICT
2214	         |                                        |
2215	         | >--UPDREQ-------------------->         |
2216	         |                                        |
2217	         |        <---------------------BNDUPD--< |
2218	         | >--BNDACK-------------------->         |
2219	        ...                                      ...
2220	         |                                        |
2221	         |        <---------------------BNDUPD--< |
2222	         | >--BNDACK-------------------->         |
2223	         |                                        |
2224	         |        <--------------------UPDDONE--< |
2225	   CONFLICT-DONE                                  |
2226	         | >--STATE--(CONFLICT-DONE)---->         |
2227	         |        <---------------------UPDREQ--< |
2228	         |                                        |
2229	         | >--BNDUPD-------------------->         |
2230	         |        <---------------------BNDACK--< |
2231	        ...                                      ...
2232	         | >--BNDUPD-------------------->         |
2233	         |        <---------------------BNDACK--< |
2234	         |                                        |
2235	         | >--UPDDONE------------------->         |
2236	         |                                     NORMAL
2237	         |        <------------STATE--(NORMAL)--< |
2238	      NORMAL                                      |
2239	         | >--STATE--(NORMAL)----------->         |
2240	         |                                        |
2241	         |        <--------------------POOLREQ--< |
2242	         | >------POOLRESP-------------->         |
2243	         |                                        |

2245	              Figure 7: Transition out of POTENTIAL-CONFLICT

2247	9.11.  RESOLUTION-INTERRUPTED State

2249	   This state indicates that the two servers were attempting to
2250	   reintegrate with each other in POTENTIAL-CONFLICT state, but
2251	   communications failed prior to completion of re-integration.

2253	   The RESOLUTION-INTERRUPTED state exists because servers are not
2254	   responsive in POTENTIAL-CONFLICT state, and if one server drops out
2255	   of service while both servers are in POTENTIAL-CONFLICT state, the
2256	   server that remains in service will not be able to process DHCP
2257	   client requests and there will be no DHCP service available.  The
2258	   RESOLUTION-INTERRUPTED state is the state that a server moves to if
2259	   its partner disappears while it is in POTENTIAL-CONFLICT state.

2261	   When a server enters RESOLUTION-INTERRUPTED state it SHOULD raise an
2262	   alarm condition to alert administrative staff of a problem in the
2263	   DHCP subsystem.

2265	9.11.1.  Operation in RESOLUTION-INTERRUPTED State

2267	   In this state a server MUST respond to all DHCP client requests.
2268	   When allocating new resources, each server SHOULD allocate from its
2269	   own pool (if that can be determined), where the primary SHOULD
2270	   allocate only FREE resources, and the secondary SHOULD allocate only
2271	   FREE_BACKUP resources.  When responding to renewal requests, each
2272	   server will allow continued renewal of a DHCP client's current lease
2273	   independent of whether that lease was given out by the receiving
2274	   server or not, although the renewal period MUST NOT exceed the
2275	   maximum client lead time (MCLT) beyond the latest of: 1) the
2276	   potential valid lifetime already acknowledged by the other server or
2277	   2) now or 3) potential valid lifetime received from the partner
2278	   server.

2280	   However, since the server cannot communicate with its partner in this
2281	   state, the acknowledged potential valid lifetime will not be updated
2282	   in any new bindings.

2284	9.11.2.  Transition Out of RESOLUTION-INTERRUPTED State

2286	   If an external command is received by a server in RESOLUTION-
2287	   INTERRUPTED state informing it that its partner is down, it will
2288	   transition immediately into PARTNER-DOWN state.

2290	   If communications is restored with the other server, then the server
2291	   in RESOLUTION-INTERRUPTED state will transition into POTENTIAL-
2292	   CONFLICT state.

2294	9.12.  CONFLICT-DONE State

2296	   This state indicates that during the process where the two servers
2297	   are attempting to re-integrate with each other, the primary server
2298	   has received all of the updates from the secondary server.  It makes
2299	   a transition into CONFLICT-DONE state in order that it may be totally
2300	   responsive to the client load.  There is no operational difference
2301	   between CONFLICT-DONE and NORMAL for the primary, as in both states
2302	   it responds to all clients' requests.  The distinction between
2303	   CONFLICT-DONE and NORMAL states will be more apparent when the load
2304	   balancing extension is defined.

2306	9.12.1.  Operation in CONFLICT-DONE State
2307	   A primary server in CONFLICT-DONE state is fully responsive to all
2308	   DHCP clients (similar to the situation in COMMUNICATIONS-INTERRUPTED
2309	   state).

2311	   If communications fails, remain in CONFLICT-DONE state.  If
2312	   communications becomes OK, remain in CONFLICT-DONE state until the
2313	   conditions for transition out become satisfied.

2315	9.12.2.  Transition Out of CONFLICT-DONE State

2317	   If communications fails with the partner while in CONFLICT-DONE
2318	   state, then the server will remain in CONFLICT-DONE state.

2320	   When a primary server determines that the secondary server has made a
2321	   transition into NORMAL state, the primary server will also transition
2322	   into NORMAL state.

2324	10.  Proposed extensions

2326	   The following section discusses possible extensions to the proposed
2327	   failover mechanism.  Listed extensions must be simple enough not to
2328	   further complicate the failover protocol.  Any proposals that are
2329	   considered complex will be defined as stand-alone extensions in
2330	   separate documents.

2332	10.1.  Active-active mode

2334	   A very simple way to achieve active-active mode is to remove the
2335	   restriction that the secondary server MUST NOT respond to SOLICIT and
2336	   REQUEST messages.  Instead it could respond, but MUST use a lower
2337	   preference than the primary server.  Clients discovering available
2338	   servers will receive ADVERTISE messages from both servers, but are
2339	   expected to select the primary server, as it has a higher preference
2340	   value configured.  The subsequent REQUEST message will then be
2341	   directed to the primary server.

2343	   The benefit of this approach, compared to the "basic" active-passive
2344	   solution, is that there is no delay between primary failure and the
2345	   moment when the secondary starts serving requests.
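
   The client-side selection that this mode relies on can be sketched as
   follows.  This is a minimal illustration, not part of the proposed
   mechanism: the Advertise type, the select_server function and the
   preference values 255 and 0 are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Advertise:
    server: str       # hypothetical label for the sending server
    preference: int   # Preference option value, 0..255 (0 if absent)


def select_server(advertises: Sequence[Advertise]) -> Optional[Advertise]:
    """Pick the ADVERTISE with the highest preference, if any arrived."""
    if not advertises:
        return None
    return max(advertises, key=lambda adv: adv.preference)


# Both partners answer SOLICIT; the primary advertises the higher value.
both_up = [Advertise("secondary", 0), Advertise("primary", 255)]
# After a primary failure only the secondary answers, with no added delay.
primary_down = [Advertise("secondary", 0)]
```

   With both_up the client selects the primary; with primary_down it
   selects the secondary, because that is the only ADVERTISE received.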

2347	11.  Dynamic DNS Considerations

2349	   DHCP servers (and clients) can use DNS Dynamic Updates as described
2350	   in RFC 2136 [RFC2136] to maintain DNS name-mappings as they maintain
2351	   DHCP leases.  Many different administrative models for DHCP-DNS
2352	   integration are possible.  Descriptions of several of these models,
2353	   and guidelines that DHCP servers and clients should follow in
2354	   carrying them out, are laid out in RFC 4704 [RFC4704].

2356	   The nature of the failover protocol introduces some issues concerning
2357	   dynamic DNS (DDNS) updates that are not part of non-failover
2358	   environments.  This section describes these issues, and defines the
2359	   information which failover partners should exchange in order to
2360	   ensure consistent behavior.  The presence of this section should not
2361	   be interpreted as requiring an implementation of the DHCPv6 failover
2362	   protocol to also support DDNS updates.

2364	   The purpose of this discussion is to clarify the areas where the
2365	   failover and DHCP-DDNS protocols intersect for the benefit of
2366	   implementations which support both protocols, not to introduce a new
2367	   requirement into the DHCPv6 failover protocol.  Thus, a DHCPv6 server
2368	   which implements the failover protocol MAY also support dynamic DNS
2369	   updates, but if it does support dynamic DNS updates it SHOULD utilize
2370	   the techniques described here in order to correctly distribute them
2371	   between the failover partners.  See RFC 4704 [RFC4704] as well as RFC
2372	   4703 [RFC4703] for information on how DHCPv6 servers deal with
2373	   potential conflicts when updating DNS even without failover.

2375	   From the standpoint of the failover protocol, there is no reason why
2376	   a server which is utilizing the DDNS protocol to update a DNS server
2377	   should not be a partner with a server which is not utilizing the DDNS
2378	   protocol to update a DNS server.  However, a server which is not able
2379	   to support DDNS or is not configured to support DDNS SHOULD output a
2380	   warning message when it receives BNDUPD messages which indicate that
2381	   its failover partner is configured to support the DDNS protocol to
2382	   update a DNS server.  An implementation MAY consider this an error
2383	   and refuse to operate, or it MAY choose to operate anyway, having
2384	   warned the administrator of the problem in some way.

2386	11.1.  Relationship between failover and dynamic DNS update

2388	   The failover protocol describes the conditions under which each
2389	   failover server may renew a lease to its current DHCP client, and
2390	   describes the conditions under which it may grant a lease to a new
2391	   DHCP client.  An analogous set of conditions determines when a
2392	   failover server should initiate a DDNS update, and when it should
2393	   attempt to remove records from the DNS.  The failover protocol's
2394	   conditions are based on the desired external behavior: avoiding
2395	   duplicate address and prefix assignments; allowing clients to
2396	   continue using leases which they obtained from one failover partner
2397	   even if they can only communicate with the other partner; allowing
2398	   the secondary DHCP server to grant new leases even if it is unable to
2399	   communicate with the primary server.  The desired external DDNS
2400	   behavior for DHCP failover servers is similar to that described above
2401	   for the failover protocol itself:

2403	   1.  Allow timely DDNS updates from the server which grants a lease to
2404	       a client.  Recognize that there is often a DDNS update lifecycle
2405	       which parallels the DHCP lease lifecycle.  This is likely to
2406	       include the addition of records when the lease is granted, and
2407	       the removal of DNS records when the leased resource is
2408	       subsequently made available for allocation to a different client.

2410	   2.  Communicate enough information between the two failover servers
2411	       to allow one to complete the DDNS update 'lifecycle' even if the
2412	       other server originally granted the lease.

2414	   3.  Avoid redundant or overlapping DDNS updates, where both failover
2415	       servers are attempting to perform DDNS updates for the same
2416	       lease-client binding.

2418	   4.  Avoid situations where one partner is attempting to add RRs
2419	       related to a lease binding while the other partner is attempting
2420	       to remove RRs related to the same lease binding.

2422	   While DHCP servers configured for DDNS typically perform these
2423	   operations on both the AAAA and the PTR resource records, this is not
2424	   required.  It is entirely possible that a DHCP server could be
2425	   configured to only update the DNS with PTR records, and the DHCPv6
2426	   clients could be responsible for updating the DNS with their own AAAA
2427	   records.  In this case, the discussions here would apply only to the
2428	   PTR records.

2430	11.2.  Exchanging DDNS Information

2432	   In order for either server to be able to complete a DDNS update, or
2433	   to remove DNS records which were added by its partner, both servers
2434	   need to know the FQDN associated with the lease-client binding.  In
2435	   addition, to properly handle DDNS updates, additional information is
2436	   required.  All of the following information needs to be transmitted
2437	   between the failover partners:

2439	   1.  The FQDN that the client requested be associated with the
2440	       resource.  If the client doesn't request a particular FQDN and
2441	       one is synthesized by the failover server or if the failover
2442	       server is configured to replace a client requested FQDN with a
2443	       different FQDN, then the server generated value would be used.

2445	   2.  The FQDN that was actually placed in the DNS for this lease.  It
2446	       may differ from the client requested FQDN due to some form of
2447	       disambiguation or other DHCP server configuration (as described
2448	       above).

2450	   3.  The status of any DDNS operations in progress or completed.

2452	   4.  Information sufficient to allow the failover partner to remove
2453	       the FQDN from the DNS should that become necessary.

2455	   These data items are the minimum necessary set to reliably allow two
2456	   failover partners to successfully share the responsibility to keep
2457	   the DNS up to date with the resources allocated to clients.

2459	   This information would typically be included in BNDUPD messages sent
2460	   from one failover partner to the other.  Failover servers MAY choose
2461	   not to include this information in BNDUPD messages if there has been
2462	   no change in the status of any DDNS update related to the lease.

2464	   The partner server receiving BNDUPD messages containing the DDNS
2465	   information SHOULD compare the status information and the FQDN with
2466	   the current DDNS information it has associated with the lease
2467	   binding, and update its notion of the DDNS status accordingly.
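
   A minimal sketch of the exchanged data and of the two rules above is
   given below.  It is illustrative only: the DdnsInfo record, its field
   names, the status strings and both helper functions are hypothetical,
   and "no change" is approximated by comparing the whole record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DdnsInfo:
    requested_fqdn: str   # item 1: FQDN requested (or server generated)
    actual_fqdn: str      # item 2: FQDN actually placed in the DNS
    status: str           # item 3: e.g. "pending" or "complete" (assumed)
    removal_info: bytes   # item 4: opaque data needed to remove the FQDN


def ddns_to_send(current: DdnsInfo,
                 last_sent: Optional[DdnsInfo]) -> Optional[DdnsInfo]:
    """Return the DDNS data for the next BNDUPD, or None to omit it."""
    if last_sent is not None and current == last_sent:
        return None  # nothing changed, so the sender MAY leave it out
    return current


def apply_received(local: Optional[DdnsInfo],
                   received: Optional[DdnsInfo]) -> Optional[DdnsInfo]:
    """Update the receiver's notion of the DDNS status for a binding."""
    if received is None or received == local:
        return local  # BNDUPD carried no DDNS data, or nothing differs
    return received
```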

2469	   Some implementations will choose to send a BNDUPD without
2470	   waiting for the DDNS update to complete, and then will send a second
2471	   BNDUPD once the DDNS update is complete.  Other implementations will
2472	   delay sending the partner a BNDUPD until the DDNS update has been
2473	   acknowledged by the DNS server, or until some time-limit has elapsed,
2474	   in order to avoid sending a second BNDUPD.

2476	   The FQDN option contains the FQDN that will be associated with the
2477	   AAAA RR (if the server is performing an AAAA RR update for the
2478	   client).  The PTR RR can be generated automatically from the IP
2479	   address or prefix value.  The FQDN may be composed in any of several
2480	   ways, depending on server configuration and the information provided
2481	   by the client in its DHCP messages.  The client may supply a hostname
2482	   which it would like the server to use in forming the FQDN, or it may
2483	   supply the entire FQDN.  The server may be configured to attempt to
2484	   use the information the client supplies, it may be configured with an
2485	   FQDN to use for the client, or it may be configured to synthesize an
2486	   FQDN.

2488	   Since the server interacting with the client may not have completed
2489	   the DDNS update at the time it sends the first BNDUPD about the lease
2490	   binding, there may be cases where the FQDN in later BNDUPD messages
2491	   does not match the FQDN included in earlier messages.  For example,
2492	   the responsive server may be configured to handle situations where
2493	   two or more DHCP client FQDNs are identical by modifying the most-
2494	   specific label in the FQDNs of some of the clients in an attempt to
2495	   generate unique FQDNs for them (a process sometimes called
2496	   "disambiguation").  Alternatively, at sites which use some or all of
2497	   the information which clients supply to form the FQDN, it's possible
2498	   that a client's configuration may be changed so that it begins to
2499	   supply new data.  The server interacting with the client may react by
2500	   removing the DNS records which it originally added for the client,
2501	   and replacing them with records that refer to the client's new FQDN.
2502	   In such cases, the server SHOULD include the actual FQDN that was
2503	   used in subsequent DDNS options in any BNDUPD messages exchanged
2504	   between the failover partners.  This server SHOULD include relevant
2505	   information in its BNDUPD messages.  This information may be
2506	   necessary in order to allow the non-responsive partner to detect
2507	   client configuration changes that change the hostname or FQDN data
2508	   which the client includes in its DHCP requests.
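
   To make the notion of disambiguation concrete, the sketch below
   modifies the most-specific label until the name is unique.  The
   numeric "-N" suffix scheme and the function name disambiguate are
   assumptions for illustration; servers may use any other scheme.

```python
from typing import Set


def disambiguate(fqdn: str, in_use: Set[str]) -> str:
    """Return fqdn, or a variant with a modified most-specific label."""
    if fqdn not in in_use:
        return fqdn
    # Split off the leftmost (most-specific) label.
    label, sep, rest = fqdn.partition(".")
    n = 1
    while True:
        candidate = f"{label}-{n}{sep}{rest}"
        if candidate not in in_use:
            return candidate
        n += 1
```

   When the result differs from the client requested FQDN, it is the
   modified name that later BNDUPD messages would carry.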

2510	11.3.  Adding RRs to the DNS

2512	   A failover server which is going to perform DDNS updates SHOULD
2513	   initiate the DDNS update when it grants a new lease to a client.  The
2514	   server which did not grant the lease SHOULD NOT initiate a DDNS
2515	   update when it receives the BNDUPD after the lease has been granted.
2516	   The failover protocol ensures that only one of the partners will
2517	   grant a lease to any individual client, so it follows that this
2518	   requirement will prevent both partners from initiating updates
2519	   simultaneously.  The server initiating the update SHOULD follow the
2520	   protocol in RFC 4704 [RFC4704].  The server may be configured to
2521	   perform an AAAA RR update on behalf of its clients, or not.
2522	   Ordinarily, a failover server will not initiate DDNS updates when it
2523	   renews leases.  In two cases, however, a failover server MAY initiate
2524	   a DDNS update when it renews a lease to its existing client:

2526	   1.  When the lease was granted before the server was configured to
2527	       perform DDNS updates, the server MAY be configured to perform
2528	       updates when it next renews existing leases.  The server which
2529	       granted the lease is the server which should initiate the DDNS
2530	       update.

2532	   2.  If a server is in PARTNER-DOWN state, it can conclude that its
2533	       partner is no longer attempting to perform an update for the
2534	       existing client.  If the remaining server has not recorded that
2535	       an update for the binding has been successfully completed, the
2536	       server MAY initiate a DDNS update.  It MAY initiate this update
2537	       immediately upon entry to PARTNER-DOWN state, it MAY perform this
2538	       in the background, or it MAY initiate this update upon next
2539	       hearing from the DHCP client.
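
   The rules in this section can be summarized by the following sketch.
   It is a simplified, non-normative reading: the function name, the
   event strings and the boolean parameters are hypothetical, and the
   SHOULD/MAY distinction is collapsed into a single yes/no answer.

```python
def may_initiate_ddns_add(event: str,
                          granted_by_me: bool,
                          granted_before_ddns_enabled: bool = False,
                          partner_down: bool = False,
                          update_recorded_complete: bool = False) -> bool:
    """Decide whether this server initiates a DDNS update for a lease.

    event is "grant" (new lease), "renew", or "bndupd" (update received
    from the partner); these labels are assumptions of this sketch.
    """
    if event == "grant":
        # Only the server that grants the new lease initiates the update.
        return granted_by_me
    if event == "renew":
        # Case 1: lease predates DDNS configuration on the granting server.
        if granted_before_ddns_enabled and granted_by_me:
            return True
        # Case 2: partner is down and no completed update is recorded.
        if partner_down and not update_recorded_complete:
            return True
    # Receiving a BNDUPD, or an ordinary renewal, triggers no update.
    return False
```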

2541	11.4.  Deleting RRs from the DNS

2543	   The failover server which makes a resource FREE* SHOULD initiate any
2544	   DDNS deletes, if it has recorded that DNS records were added on
2545	   behalf of the client.

2547	   A server not in PARTNER-DOWN state "makes a resource FREE" when it
2548	   initiates a BNDUPD with a binding-status of FREE, FREE_BACKUP,
2549	   EXPIRED, or RELEASED.  Its partner confirms this status by acking
2550	   that BNDUPD, and upon receipt of the BNDACK the server has "made the
2551	   resource FREE".  Conversely, a server in PARTNER-DOWN state "makes a
2552	   resource FREE" when it sets the binding-status to FREE, since in
2553	   PARTNER-DOWN state no communications is required with the partner.

2555	   It is at this point that it should initiate the DDNS operations to
2556	   delete RRs from the DNS.  Its partner SHOULD NOT initiate DDNS
2557	   deletes for DNS records related to the lease binding as part of
2558	   sending the BNDACK message.  The partner MAY have issued BNDUPD
2559	   messages with a binding-status of FREE, EXPIRED, or RELEASED
2560	   previously, but the other server will have rejected these BNDUPD
2561	   messages.

2563	   The failover protocol ensures that only one of the two partner
2564	   servers will be able to make a resource FREE*. The server making the
2565	   resource FREE may be doing so while it is in NORMAL communication
2566	   with its partner, or it may be in PARTNER-DOWN state.  If a server is
2567	   in PARTNER-DOWN state, it may be performing DDNS deletes for RRs
2568	   which its partner added originally.  This allows a single remaining
2569	   partner server to assume responsibility for all of the DDNS activity
2570	   which the two servers were undertaking.

2572	   Another implication of this approach is that no DDNS RR deletes will
2573	   be performed while either server is in COMMUNICATIONS-INTERRUPTED
2574	   state, since no resources are moved into the FREE* state during that
2575	   period.
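
   The conditions under which a server initiates DDNS deletes can be
   sketched as below.  This is an illustration under stated assumptions:
   the function name, the role labels "initiator" and "acker", and the
   boolean parameters are hypothetical and not defined by the protocol.

```python
FREEING_STATUSES = {"FREE", "FREE_BACKUP", "EXPIRED", "RELEASED"}


def should_initiate_ddns_delete(role: str,
                                binding_status: str,
                                partner_down: bool,
                                bndack_received: bool,
                                dns_records_recorded: bool) -> bool:
    """Decide whether this server starts deleting RRs for a binding.

    role is "initiator" for the server that changed the binding status
    and "acker" for the partner that merely acknowledges the BNDUPD.
    """
    if not dns_records_recorded:
        return False  # nothing was added on behalf of the client
    if role != "initiator":
        return False  # sending a BNDACK never triggers deletes
    if partner_down:
        # No exchange with the partner is needed in PARTNER-DOWN state.
        return binding_status == "FREE"
    # Otherwise the resource is only FREE once the partner has acked.
    return binding_status in FREEING_STATUSES and bndack_received
```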

2577	11.5.  Name Assignment with No Update of DNS

2579	   In some cases, a DHCP server is configured to return a name to the
2580	   DHCPv6 client but not enter that name into the DNS.  This is
2581	   typically a name that it has discovered or generated from information
2582	   it has received from the client.  In this case this name information
2583	   SHOULD be communicated to the failover partner, if only to ensure
2584	   that it will return the same name in the event the partner becomes
2585	   the server with which the DHCPv6 client begins to interact.

2587	12.  Reservations and failover

2588	   Some DHCP servers support a capability to offer specific
2589	   preconfigured resources to DHCP clients.  These are real DHCP
2590	   clients that carry out the entire DHCP protocol, but these servers
2591	   always offer the client a specific preconfigured resource, and they
2592	   offer that resource to no other clients.  Such a capability has
2593	   several names, but it is sometimes called a "reservation", in that
2594	   the resource is reserved for a particular DHCP client.

2596	   In a situation where there are two DHCP servers serving the same
2597	   prefix without using failover, the two DHCP servers need to have
2598	   disjoint resource pools, but identical reservations for the DHCP
2599	   clients.

2601	   In a failover context, both servers need to be configured with the
2602	   proper reservations in an identical manner, but if we stop there
2603	   problems can occur around the edge conditions where reservations are
2604	   made for a resource that is already leased to a different client.
2605	   Different servers handle this conflict in different ways, but the
2606	   goal of the failover protocol is to allow correct operation with any
2607	   server's approach to the normal processing of the DHCP protocol.

2609	   The general solution with regard to reservations is as follows.
2610	   Whenever a reserved resource becomes FREE (i.e., when first
2611	   configured or whenever a client frees it or it expires or is reset),
2612	   the primary server MUST show that resource as FREE (and thus
2613	   available for its own allocation) and it MUST send it to the
2614	   secondary server in a BNDUPD with a flag set showing that it is
2615	   reserved and with a status of FREE_BACKUP.

2617	   Note that this implies that a reserved resource goes through the
2618	   normal state changes from FREE to ACTIVE (and possibly back to FREE).
2619	   The failover protocol supports this approach to reservations, i.e.,
2620	   where the resource undergoes the normal state changes of any
2621	   resource, but it can only be offered to the client for which it is
2622	   reserved.

2624	   From the above, it follows that a reservation solely on the secondary
2625	   will not necessarily allow the secondary to offer that resource to
2626	   the client for which it is reserved.  The reservation must also
2627	   appear on the primary for the secondary to be able to offer the
2628	   resource to the client for which it is reserved.

2630	   When the reservation on a resource is cancelled, if the resource is
2631	   currently FREE and the server is the primary, or FREE_BACKUP and the
2632	   server is the secondary, the server MUST send a BNDUPD to the other
2633	   server with the binding-status FREE and an indication that the
2634	   resource is no longer reserved.
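
   The two BNDUPD rules for reservations can be illustrated with the
   sketch below.  It is non-normative: the BndUpd record, its fields and
   both function names are hypothetical, and the reserved flag stands in
   for whatever indication the protocol eventually defines.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BndUpd:
    resource: str
    binding_status: str
    reserved: bool


def on_reserved_resource_freed(resource: str) -> Tuple[str, BndUpd]:
    """Primary: a reserved resource became FREE.

    Returns the status the primary records locally and the BNDUPD it
    sends to the secondary.
    """
    local_status = "FREE"  # available for the primary's own allocation
    return local_status, BndUpd(resource, "FREE_BACKUP", reserved=True)


def on_reservation_cancelled(role: str, local_status: str,
                             resource: str) -> Optional[BndUpd]:
    """Either server: the reservation on a resource was cancelled."""
    if (role == "primary" and local_status == "FREE") or \
       (role == "secondary" and local_status == "FREE_BACKUP"):
        return BndUpd(resource, "FREE", reserved=False)
    return None  # no BNDUPD is required by this rule
```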

2636	13.  Security Considerations

2638	   DHCPv6 failover is an extension of the standard DHCPv6 protocol, so
2639	   all security considerations from [RFC3315], Section 23 and [RFC3633],
2640	   Section 15 related to the server apply.

2642	   As traffic exchanged between clients and servers is not encrypted, an
2643	   attacker who has penetrated the network and is able to intercept
2644	   that traffic will not gain any additional information by also
2645	   sniffing communication between partners.

2647	   An attacker that is able to impersonate one partner can efficiently
2648	   perform a denial of service attack on the remaining uncompromised
2649	   server.  Several techniques may be used: pretending that conflict
2650	   resolution is required, requesting rebalance, claiming that a valid
2651	   lease was released or declined, etc.  For that reason the
2652	   communication between servers SHOULD support failover connections
2653	   over TLS, as explained in Section 5.1.  Such secure connections
2654	   SHOULD be optional and configurable by the administrator.

2656	   A server MUST NOT operate in PARTNER-DOWN if its partner is up.
2657	   Network administrators are expected to switch the remaining active
2658	   server to PARTNER-DOWN state only if they are sure that its partner
2659	   server is indeed down.  Failing to obey this requirement will result
2660	   in both servers likely assigning duplicate leases to different
2661	   clients.  Implementers should take that into consideration if they
2662	   decide to implement the auto-partner-down timer-based transition to
2663	   PARTNER-DOWN state.

2665	   Running a network protected by DHCPv6 failover requires more
2666	   resources than running without it.  In particular, some of the
2667	   resources are allocated to the secondary server and they are not
2668	   usable in normal (i.e., non-failure) operation immediately, though
2669	   over time they will be rebalanced and end up on the server that needs
2670	   them.  While limiting this pool may be preferable from a resource
2671	   utilization perspective, it must be a reasonably large pool, so the
2672	   secondary may take over once the primary becomes unavailable.

2674	14.  IANA Considerations

2676	   IANA is not requested to perform any actions at this time.

2678	15.  Acknowledgements

2680	   This document extensively uses concepts, definitions and other parts
2681	   of the [dhcpv4-failover] document.  The authors would like to thank
2682	   Shawn Routher, Greg Rabil, Bernie Volz and Marcin Siodelski for their
2683	   significant involvement and contributions.  The authors would also
2684	   like to thank VithalPrasad Gaitonde, Krzysztof Gierlowski, Krzysztof
2685	   Nowicki and Michal Hoeft for their insightful comments.

2687	   This work has been partially supported by the Department of Computer
2688	   Communications (a division of Gdansk University of Technology) and
2689	   the Polish Ministry of Science and Higher Education under the
2690	   European Regional Development Fund, Grant No.  POIG.01.01.02-00-045/
2691	   09-00 (Future Internet Engineering Project).

2693	16.  References

2695	16.1.  Normative References

2697	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
2698	              Requirement Levels", BCP 14, RFC 2119, March 1997.

2700	   [RFC3315]  Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C.,
2701	              and M. Carney, "Dynamic Host Configuration Protocol for
2702	              IPv6 (DHCPv6)", RFC 3315, July 2003.

2704	   [RFC3633]  Troan, O. and R. Droms, "IPv6 Prefix Options for Dynamic
2705	              Host Configuration Protocol (DHCP) version 6", RFC 3633,
2706	              December 2003.

2708	   [RFC4703]  Stapp, M. and B. Volz, "Resolution of Fully Qualified
2709	              Domain Name (FQDN) Conflicts among Dynamic Host
2710	              Configuration Protocol (DHCP) Clients", RFC 4703, October
2711	              2006.

2713	   [RFC4704]  Volz, B., "The Dynamic Host Configuration Protocol for
2714	              IPv6 (DHCPv6) Client Fully Qualified Domain Name (FQDN)
2715	              Option", RFC 4704, October 2006.

2717	   [RFC5007]  Brzozowski, J., Kinnear, K., Volz, B., and S. Zeng,
2718	              "DHCPv6 Leasequery", RFC 5007, September 2007.

2720	16.2.  Informative References

2722	   [I-D.ietf-dhc-dhcpv6-failover-requirements]
2723	              Mrugalski, T. and K. Kinnear, "DHCPv6 Failover
2724	              Requirements", draft-ietf-dhc-dhcpv6-failover-
2725	              requirements-07 (work in progress), July 2013.

2727	   [I-D.ietf-dhc-dhcpv6-load-balancing]
2728	              Kostur, A., "DHC Load Balancing Algorithm for DHCPv6",
2729	              draft-ietf-dhc-dhcpv6-load-balancing-00 (work in
2730	              progress), December 2012.

2732	   [RFC2136]  Vixie, P., Thomson, S., Rekhter, Y., and J. Bound,
2733	              "Dynamic Updates in the Domain Name System (DNS UPDATE)",
2734	              RFC 2136, April 1997.

2736	   [RFC5460]  Stapp, M., "DHCPv6 Bulk Leasequery", RFC 5460, February
2737	              2009.

2739	   [dhcpv4-failover]
2740	              Droms, R., Kinnear, K., Stapp, M., Volz, B., Gonczi, S.,
2741	              Rabil, G., Dooley, M., and A. Kapur, "DHCP Failover
2742	              Protocol", draft-ietf-dhc-failover-12 (work in progress),
2743	              March 2003.

2745	Authors' Addresses

2747	   Tomasz Mrugalski
2748	   Internet Systems Consortium, Inc.
2749	   950 Charter Street
2750	   Redwood City, CA  94063
2751	   USA

2753	   Phone: +1 650 423 1345
2754	   Email: tomasz.mrugalski@gmail.com

2756	   Kim Kinnear
2757	   Cisco Systems, Inc.
2758	   1414 Massachusetts Ave.
2759	   Boxborough, Massachusetts  01719
2760	   USA

2762	   Phone: +1 (978) 936-0000
2763	   Email: kkinnear@cisco.com