Network Working Group                                     M. Tuexen, Ed.
Internet-Draft                        Muenster Univ. of Applied Sciences
Intended status: Informational                                    Q. Xie
Expires: May 18, 2007                                     Motorola, Inc.
                                                              R. Stewart
                                                                M. Shore
                                                     Cisco Systems, Inc.
                                                             J. Loughney
                                                   Nokia Research Center
                                                            A. Silverton
                                                           Motorola Labs
                                                       November 14, 2006


                Architecture for Reliable Server Pooling
                    draft-ietf-rserpool-arch-12.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 18, 2007.

Copyright Notice

   Copyright (C) The Internet Society (2006).


Tuexen, et al.            Expires May 18, 2007                  [Page 1]

Internet-Draft            RSerPool Architecture            November 2006


Abstract

   This document describes an architecture and protocols for the
   management and operation of server pools supporting highly reliable
   applications, and for client access mechanisms to a server pool.


Table of Contents

   1  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1  The Problem Space . . . . . . . . . . . . . . . . . . . . .  3
     1.2  Overview  . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.3  Terminology . . . . . . . . . . . . . . . . . . . . . . . .  5
     1.4  Abbreviations . . . . . . . . . . . . . . . . . . . . . . .  6
   2  Reliable Server Pooling Architecture  . . . . . . . . . . . . .  6
     2.1  RSerPool Functional Components  . . . . . . . . . . . . . .  7
       2.1.1  Pool Elements . . . . . . . . . . . . . . . . . . . . .  7
       2.1.2  ENRP Servers  . . . . . . . . . . . . . . . . . . . . .  7
       2.1.3  Pool Users  . . . . . . . . . . . . . . . . . . . . . .  8
     2.2  RSerPool Protocol Overview  . . . . . . . . . . . . . . . .  8
       2.2.1  Endpoint Handlespace Redundancy Protocol  . . . . . . .  8
       2.2.2  Aggregate Server Access Protocol  . . . . . . . . . . .  9
       2.2.3  PU <-> ENRP Server Communication  . . . . . . . . . . .  9
       2.2.4  PE <-> ENRP Server Communication  . . . . . . . . . . . 10
       2.2.5  PU <-> PE Communication . . . . . . . . . . . . . . . . 10
       2.2.6  ENRP Server <-> ENRP Server Communication . . . . . . . 11
       2.2.7  PE <-> PE Communication . . . . . . . . . . . . . . . . 12
     2.3  Failover Support  . . . . . . . . . . . . . . . . . . . . . 12
       2.3.1  Business Cards  . . . . . . . . . . . . . . . . . . . . 12
       2.3.2  Cookies . . . . . . . . . . . . . . . . . . . . . . . . 14
     2.4  Typical Interactions between RSerPool Components  . . . . . 14
   3  Examples  . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
     3.1  Two File Transfer Examples  . . . . . . . . . . . . . . . . 16
       3.1.1  The RSerPool Aware Client . . . . . . . . . . . . . . . 17
       3.1.2  The RSerPool Unaware Client . . . . . . . . . . . . . . 18
     3.2  Load Balancing Example  . . . . . . . . . . . . . . . . . . 19
     3.3  Telephony Signaling Example . . . . . . . . . . . . . . . . 20
       3.3.1  Decomposed GWC and GK Scenario  . . . . . . . . . . . . 21
       3.3.2  Collocated GWC and GK Scenario  . . . . . . . . . . . . 22
   4  Security Considerations . . . . . . . . . . . . . . . . . . . . 23
   5  IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 23
   6  Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . 23
   7  References  . . . . . . . . . . . . . . . . . . . . . . . . . . 24
     7.1  Normative References  . . . . . . . . . . . . . . . . . . . 24
     7.2  Informative References  . . . . . . . . . . . . . . . . . . 24
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 25
   Intellectual Property and Copyright Statements . . . . . . . . . . 27



1  Introduction

1.1  The Problem Space

   Fault tolerance is a difficult and challenging problem space.  Most
   of the solutions in this space involve an extensive effort on the
   part of an application programmer and oftentimes the result is a
   proprietary solution.  There are a number of issues concerning
   developers of fault tolerant applications including:

   1.  How to find a server that provides the service desired?

   2.  If the server that is providing me the service dies, how do I
       find another one?

   3.  What type of redundancy model will I use, 2N or N+K?

   4.  How does a server providing a service share state with potential
       peer servers in case of a failure?

   5.  How does a server assure that when it fails (or dies), the
       clients will access the "best" server that is able to handle the
       failure (that is, to take over for the departed server)?

   6.  From an operations and maintenance standpoint, how do we add or
       subtract capacity dynamically without reconfiguring our network?

   A fault tolerant application needs to deal with these issues and many
   more.  Often an application is developed and then later, it is
   realized that the application needs to be fault tolerant.  The
   response to this new requirement mandates either a hack or re-write
   of the application.

   So how can these issues be solved in a way that makes it easy for the
   application designer to add fault tolerance without hacking or
   rewriting the application code?  We use layering to solve this
   problem.  A session layer is inserted below the application layer to
   provide a framework for fault tolerance.  This removes some of the
   complexity from the application writer's hands, freeing the
   application writer to concentrate on the application.  Note that not
   all of the issues listed above can be solved by the session layer
   framework alone.  In particular, the application will still need to
   deal with state sharing.  However, the session layer framework
   provides small tools, where it can, to make even this job easier for
   the application writer.

   A second important point is that this layering no longer requires
   each application to be custom programmed for fault tolerance.  By
   running an application on top of the session layer fault tolerant
   services, there is no longer the need to design and implement fault
   tolerance one application at a time.  There are several benefits to
   this approach:

   1.  Time and cost savings for the developers of the application.

   2.  Experts in the area have developed the session layer fault
       tolerance mechanism.

   3.  An application can be developed without a fault tolerance
       requirement and later in the life cycle, if this requirement
       emerges, it can be met with RSerPool without a costly redesign.

   4.  RSerPool provides a set of APIs and hooks for the application
       developer to implement fault tolerance.

   5.  RSerPool provides a simple building block to the application for
       rudimentary state sharing.

   The above summary is the overall goal of RSerPool.  We strive to
   remove the details and complexity of fault tolerance from the
   application writer and, when the session layer cannot solve the issue
   (such as state sharing), give the application writer some small
   building blocks with which they can solve the problem with minimal
   effort.

   This document introduces a set of concepts for solving a number of
   these problems.  The document often refers to a named element
   (e.g., a Pool User) in the architecture sending or receiving a
   message.  Note that it is NOT the application that sends or receives
   the message, but the session layer below it.  Envision the
   application opening a special form of socket.  This socket allows
   reading and writing of data, but underneath it has special
   properties that allow it to send and receive additional messages
   when the upper layer user requests some service such as sending a
   message or binding a name.  Note again that the goal of RSerPool is
   NOT to solve all of the problems, but instead to solve a subset of
   the fault tolerance issues and at the same time provide a toolkit of
   standard utilities that help an application solve the remaining
   items more easily.

1.2  Overview

   A server pool is defined as a set of one or more servers providing
   the same application functionality.  These servers are called Pool
   Elements (PEs).  PEs form the first class of entities in the RSerPool
   architecture.  Multiple PEs in a server pool can be used to provide
   fault tolerance or load sharing, for example.

   Each server pool is identified by a unique identifier which is simply
   a byte string, called the pool handle.  This allows binary
   identifiers to be used.

   Pool handles are not valid in the whole Internet but only within a
   smaller domain, called the operational scope.  Furthermore, the
   handle-space is assumed to be flat, so that multiple levels of query
   are not necessary to resolve a pool handle.

   The second class of entities in the RSerPool architecture is the
   class of Endpoint haNdlespace Redundancy Protocol (ENRP) servers.
   ENRP servers are designed to provide a fully distributed,
   fault-tolerant, real-time translation service that maps a pool handle
   to a set of transport addresses pointing to a specific group of
   networked communication endpoints registered under that pool handle.
   To be more precise, ENRP servers can resolve a pool handle to a list
   of information which allows the Pool User (PU) to access a PE of the
   server pool identified by the handle.  This information includes:

   o  A list of IPv4 and/or IPv6 addresses.

   o  A protocol field specifying the transport layer protocol.

   o  A port number associated with the transport protocol, e.g.  SCTP,
      TCP or UDP.

   Note that the RSerPool architecture supports both IPv4 and IPv6
   addressing.

   In each operational scope there must be at least one ENRP server.
   All ENRP servers within the operational scope have knowledge of all
   server pools within the operational scope.

   RFC3237 [RFC3237] also requires that the ENRP servers should not
   resolve a pool handle to a transport layer address of a PE which is
   not in operation.  Therefore each PE is supervised by one specific
   ENRP server, called the home ENRP server of that PE.  If the home
   ENRP server detects that the PE is out of service, it informs all
   other ENRP servers.

1.3  Terminology

   This document uses the following terms:


   Home ENRP Server:  The ENRP server a Pool Element has registered
      with.  This ENRP server supervises the Pool Element.

   Operational scope:  The part of the network visible to pool users by
      a specific instance of the reliable server pooling protocols.

   Pool (or server pool):  A collection of servers providing the same
      application functionality.

   Pool handle:  A logical pointer to a pool.  Each server pool will be
      identifiable in the operational scope of the system by a unique
      pool handle.

   Pool element:  A server entity having registered to a pool.

   Pool user:  A server pool user.

   Pool element handle (or endpoint handle):  A logical pointer to a
      particular pool element in a pool, consisting of the pool handle
      and a destination transport address of the pool element.

   Handle space:  A cohesive structure of pool handles and relations
      that may be queried by an internal or external agent.

   ENRP server:  Entity which is responsible for managing and
      maintaining the handle space within the RSerPool operational
      scope.

1.4  Abbreviations

   ASAP:  Aggregate Server Access Protocol

   ENRP:  Endpoint haNdlespace Redundancy Protocol

   PE:  Pool element

   PU:  Pool user

   SCTP:  Stream Control Transmission Protocol

   TCP:  Transmission Control Protocol


2  Reliable Server Pooling Architecture

   In this section, we define a reliable server pool architecture.


2.1  RSerPool Functional Components

   There are three classes of entities in the RSerPool architecture:

   o  Pool Elements (PEs).

   o  ENRP Servers.

   o  Pool Users (PUs).

2.1.1  Pool Elements

   A server pool is defined as a set of one or more servers providing
   the same application functionality.  These servers are called Pool
   Elements (PEs).  PEs form the first class of entities in the RSerPool
   architecture.  Multiple PEs in a server pool can be used to provide
   fault tolerance or load sharing.

   Each server pool is identified by a unique identifier which is simply
   a byte string, called the pool handle.  This allows binary
   identifiers to be used.

   Pool handles are not valid in the whole Internet but only within a
   smaller domain, called the operational scope.  Furthermore, the
   handle-space is assumed to be flat, so that multiple levels of query
   are not necessary to resolve a pool handle.

2.1.2  ENRP Servers

   The second class of entities in the RSerPool architecture is the
   class of ENRP servers.  ENRP servers are designed to provide a fully
   distributed, fault-tolerant, real-time translation service that maps
   a pool handle to a set of transport addresses pointing to a specific
   group of networked communication endpoints registered under that pool
   handle.  To be more precise, ENRP servers can resolve a pool handle
   to a list of information which allows the PU to access a PE of the
   server pool identified by the handle.  This information includes:

   o  A list of IPv4 and/or IPv6 addresses.

   o  A protocol field specifying the transport layer protocol.

   o  A port number associated with the transport protocol, e.g.  SCTP,
      TCP or UDP.

   Note that the RSerPool architecture supports both IPv4 and IPv6
   addressing.

   In each operational scope there must be at least one ENRP server.
   All ENRP servers within the operational scope have knowledge of all
   server pools within the operational scope.

   RFC3237 [RFC3237] also requires that the ENRP servers should not
   resolve a pool handle to a transport layer address of a PE which is
   not in operation.  Therefore each PE is supervised by one specific
   ENRP server, called the home ENRP server of that PE.  If the home
   ENRP server detects that the PE is out of service, it informs all
   other ENRP servers.
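
   As an illustration only, the resolution service described in this
   section can be sketched as a flat mapping from a pool handle to the
   list of registered PEs.  The class and field names (PoolElement,
   Handlespace, in_service) are assumptions made for this sketch; they
   are not defined by this document or by the RSerPool protocols.

```python
from dataclasses import dataclass, field


@dataclass
class PoolElement:
    addresses: list           # IPv4 and/or IPv6 addresses, as strings
    protocol: str             # transport protocol, e.g. "SCTP", "TCP", "UDP"
    port: int                 # port number for that transport protocol
    in_service: bool = True   # maintained by the PE's home ENRP server


@dataclass
class Handlespace:
    # Flat mapping: pool handle (a byte string) -> registered PEs.
    pools: dict = field(default_factory=dict)

    def register(self, pool_handle: bytes, pe: PoolElement) -> None:
        self.pools.setdefault(pool_handle, []).append(pe)

    def resolve(self, pool_handle: bytes) -> list:
        # One lookup, no hierarchy; PEs not in operation are left out.
        return [pe for pe in self.pools.get(pool_handle, []) if pe.in_service]


handlespace = Handlespace()
handlespace.register(b"\x01file-pool",
                     PoolElement(["192.0.2.1", "2001:db8::1"], "SCTP", 5000))
handlespace.register(b"\x01file-pool",
                     PoolElement(["192.0.2.2"], "TCP", 5001, in_service=False))
usable = handlespace.resolve(b"\x01file-pool")
```

   In this sketch the pool handle is an arbitrary byte string, and the
   PE marked as out of service is not returned to the PU.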

2.1.3  Pool Users

   A third class of entities in the architecture is the Pool User (PU)
   class, consisting of the clients being served by the PEs of a server
   pool.

2.2  RSerPool Protocol Overview

   Based on the requirements in RFC3237 [RFC3237], the architecture of
   two new protocols is introduced in this document: ENRP (Endpoint
   haNdlespace Redundancy Protocol) and ASAP (Aggregate Server Access
   Protocol).  These are used because no existing protocols are suitable
   (for a detailed discussion of comparisons please see
   [I-D.ietf-rserpool-comp]).

2.2.1  Endpoint Handlespace Redundancy Protocol

   The ENRP servers use a protocol called Endpoint haNdlespace
   Redundancy Protocol (ENRP) for communication with each other to
   exchange information and updates about the server pools.

   ENRP guarantees the integrity of the RSerPool handlespace by
   providing the means for an ENRP server to

   o  update its peers regarding changes to the handlespace caused by
      the addition of a PE or the status change of an existing PE,

   o  monitor the health of its peers, and, if necessary, take over the
      responsibility of being the home ENRP server for a set of PEs when
      the ENRP server previously responsible for those PEs has failed,
      and

   o  audit the handlespace for inconsistencies and synchronize the
      handlespace amongst its peers when inconsistencies have been
      found.
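
   The update and audit functions listed above might be pictured with
   the following sketch.  It is not the ENRP message exchange: the
   class EnrpServer, the SHA-256 fingerprint, and the merge-by-union
   repair are all assumptions chosen to keep the example short.  In
   particular, a plain union cannot express the removal of a PE.

```python
import hashlib


class EnrpServer:
    def __init__(self, name):
        self.name = name
        self.peers = []          # other ENRP servers in the operational scope
        self.handlespace = {}    # pool handle -> set of PE identifiers

    def register_pe(self, pool_handle, pe_id):
        # Apply the change locally, then update every peer.
        self._apply(pool_handle, pe_id)
        for peer in self.peers:
            peer._apply(pool_handle, pe_id)

    def _apply(self, pool_handle, pe_id):
        self.handlespace.setdefault(pool_handle, set()).add(pe_id)

    def digest(self):
        # Order-independent fingerprint of the whole handlespace.
        items = sorted((h, sorted(pes)) for h, pes in self.handlespace.items())
        return hashlib.sha256(repr(items).encode()).hexdigest()

    def audit(self, peer):
        # Compare fingerprints; on a mismatch, merge both views.
        if self.digest() == peer.digest():
            return False
        for handle in set(self.handlespace) | set(peer.handlespace):
            merged = (self.handlespace.get(handle, set())
                      | peer.handlespace.get(handle, set()))
            self.handlespace[handle] = set(merged)
            peer.handlespace[handle] = set(merged)
        return True


a, b = EnrpServer("a"), EnrpServer("b")
a.peers, b.peers = [b], [a]
a.register_pe(b"pool", "pe1")    # propagated to b
b._apply(b"pool", "pe2")         # simulates an update that never reached a
repaired = a.audit(b)
```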


2.2.2  Aggregate Server Access Protocol

   The PU wanting service from the pool uses the Aggregate Server Access
   Protocol (ASAP) to access members of the pool.  Depending on the
   level of support desired by the application, use of ASAP may be
   limited to an initial query for an active PE, or ASAP may be used to
   mediate all communication between the PU and PE, so that automatic
   failover from a failed PE to an alternate PE can be supported.

   ASAP uses pool handles for addressing which isolates a logical
   communication endpoint from its IP address(es), thus effectively
   eliminating the binding between the communication endpoint and its
   physical IP address(es) which normally constitutes a single point of
   failure.

   In addition, ASAP provides some mechanisms to support load sharing
   between PEs within the same pool and to support the upper layer in
   case a failover between PEs becomes necessary.

   ASAP is also used by a PE to join or leave a server pool.  The PE
   registers or deregisters itself by communicating with an ENRP server,
   which will normally be the home ENRP server.  ASAP allows dynamic
   system scalability, allowing the pool membership to change at any
   time.

   ASAP is used by a home ENRP server to supervise the PEs that have
   registered with that ENRP server.  If the home ENRP server detects
   that a PE is out of service via ASAP, it notifies its peers using
   ENRP as described previously.
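
   A minimal sketch of this supervision follows.  It assumes that the
   home ENRP server records when each PE was last heard from and treats
   a PE as out of service once a timeout has passed; the timer-based
   approach and the names HomeEnrpServer, on_pe_heard, and notify_peers
   are illustrative assumptions, not protocol definitions.

```python
class HomeEnrpServer:
    def __init__(self, timeout, notify_peers):
        self.timeout = timeout            # seconds of silence tolerated
        self.notify_peers = notify_peers  # callback standing in for ENRP
        self.last_heard = {}              # PE identifier -> timestamp

    def on_pe_heard(self, pe_id, now):
        # Called whenever a registration or supervision reply arrives.
        self.last_heard[pe_id] = now

    def check(self, now):
        # Find silent PEs, drop them, and tell the peer ENRP servers.
        failed = [pe for pe, seen in self.last_heard.items()
                  if now - seen > self.timeout]
        for pe in failed:
            del self.last_heard[pe]
            self.notify_peers(pe)
        return failed


announced = []
home = HomeEnrpServer(timeout=5.0, notify_peers=announced.append)
home.on_pe_heard("PE 1", now=0.0)
home.on_pe_heard("PE 2", now=0.0)
home.on_pe_heard("PE 1", now=4.0)   # PE 1 answers again, PE 2 stays silent
failed = home.check(now=6.0)
```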

2.2.3  PU <-> ENRP Server Communication

   The PU <-> ENRP server communication is used for resolving pool
   handles and uses ASAP.  The PU sends a pool handle to the ENRP server
   and gets back the information necessary for accessing a server in a
   server pool.

   This communication is based on SCTP, or on TCP if the PU does not
   support SCTP.  The protocol stack for a PU is shown in Figure 1.


                       **********        ************
                       *   PU   *        *   ENRP   *
                       *        *        *  server  *
                       **********        ************

                       +--------+         +--------+
                       |  ASAP  |         |  ASAP  |
                       +--------+         +--------+
                       |SCTP/TCP|         |SCTP/TCP|
                       +--------+         +--------+
                       |   IP   |         |   IP   |
                       +--------+         +--------+

                Protocol stack between PU and ENRP server

                                 Figure 1

2.2.4  PE <-> ENRP Server Communication

   The PE <-> ENRP server communication is used for registration and
   deregistration of the PE in one or more pools and for the supervision
   of the PE by the home ENRP server.  This communication uses ASAP and
   is based on SCTP.  The protocol stack is shown in Figure 2.

                       ********        **********
                       *  PE  *        *  ENRP  *
                       *      *        * server *
                       ********        **********

                       +------+         +------+
                       | ASAP |         | ASAP |
                       +------+         +------+
                       | SCTP |         | SCTP |
                       +------+         +------+
                       |  IP  |         |  IP  |
                       +------+         +------+
               Protocol stack between PE and ENRP server

                                 Figure 2

2.2.5  PU <-> PE Communication

   The PU <-> PE communication can be divided into two parts:

   o  control channel

   o  data channel

   The data channel is used for the transmission of the upper layer
   data; the control channel is used to exchange RSerPool information.

   There are two supported scenarios:

   o  Multiplexed data and control channel.  Both channels are
      transported over one transport connection.  This can be either an
      SCTP association, with the data and control channels separated by
      the Payload Protocol Identifier (PPID), or a TCP connection, with
      the data and control channels handled by a TCP mapping layer.

   o  Data channel and no control channel.  There is no restriction on
      the transport protocol in this case.  Note that certain enhanced
      failover services (e.g. business cards, state cookies, message
      failover described in Section 2.3) are not available when this
      method is used.
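
   For the SCTP case of the multiplexed scenario, the separation by
   PPID can be sketched as follows.  Messages are modelled as (PPID,
   payload) pairs so that the example does not depend on an SCTP
   socket API.  The constant CONTROL_PPID is an illustrative
   placeholder, not an assigned value.

```python
CONTROL_PPID = 0x0B   # placeholder; not taken from this document


def demultiplex(messages, control_ppid=CONTROL_PPID):
    """Split received (ppid, payload) pairs into control and data."""
    control, data = [], []
    for ppid, payload in messages:
        if ppid == control_ppid:
            control.append(payload)   # RSerPool information
        else:
            data.append(payload)      # upper layer data
    return control, data


received = [
    (0, b"file chunk 1"),
    (CONTROL_PPID, b"cookie"),
    (0, b"file chunk 2"),
]
control_msgs, data_msgs = demultiplex(received)
```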

   For a given pool, all PUs and PEs should make the same choice for the
   style of interaction between each other: that is, for a given pool,
   either all PEs and PUs in that pool use a multiplexed control/data
   channel for PU-PE communication, or all PEs and PUs in that pool use
   a data channel only for PU-PE communication.

   When the multiplexed data and control channel is used, enhanced
   failover services may be provided, including:

   o  The PE can send a business card to the PU which provides
      information as to which other PE the PU should fail over to in
      case of a failure in the serving PE (a last will and testament, so
      to speak).

   o  The PE can send cookies to the PU.  The PU would store only the
      last cookie and send it to the new PE in case of a failover.  This
      provides a small hook that a PE can use to propagate small amounts
      of state information upon failure.

   See Section 2.3 for further details.

2.2.6  ENRP Server <-> ENRP Server Communication

   The communication between ENRP servers is used to share the knowledge
   about all server pools between all ENRP servers in an operational
   scope.

   For this communication ENRP over SCTP is used and the protocol stack
   is shown in Figure 3.


                      **********      **********
                      *  ENRP  *      *  ENRP  *
                      * server *      * server *
                      **********      **********

                       +------+        +------+
                       | ENRP |        | ENRP |
                       +------+        +------+
                       | SCTP |        | SCTP |
                       +------+        +------+
                       |  IP  |        |  IP  |
                       +------+        +------+
                  Protocol stack between ENRP servers

                                 Figure 3

   When an ENRP server initializes, it may transmit a UDP multicast
   message for initial detection of other ENRP servers in the
   operational scope.  The other ENRP servers respond with a unicast
   UDP message.

2.2.7  PE <-> PE Communication

   This is a special case of the PU <-> PE communication.  In this case
   the PU is also a PE in a server pool; that is, one PE acts as a PU
   during the communication setup.

   The difference from a pure PU <-> PE communication is that the PE
   acting as a PU can inform the other PE that it is actually a PE of a
   pool.  To do so, it transfers its pool handle via the control
   channel.  See Section 2.3 for further details.

2.3  Failover Support

   If the PU detects the failure of a PE, it may fail over to a
   different PE.  The new PE should be selected such that it is most
   likely not affected by the failure of the old one.

   RSerPool provides some mechanisms to support the failover to a new
   PE.

2.3.1  Business Cards

   A PE can send a business card to its peer (PE or PU) containing its
   pool handle and, optionally, information about which other PEs the
   peer should fail over to.  This gives a PE a form of last will and
   testament that can be used to guide a PU to select the "next best"
   PE.  The PE giving the business card may make this determination in
   such a way as to spread the load evenly across the server pool, or
   with the knowledge that certain PEs will have a more current set of
   state available to service the PU.

   Presenting the pool handle is important in case of PE <-> PE
   communication in which one of the PEs acts as a PU for establishing
   the communication.  This can yield fault tolerance in both directions
   in cases where peer-to-peer concepts are used instead of
   client/server and both peers are members of separate server pools.
   Often in such a case, the PE selected does not know the name of the
   PU's server pool.  By providing that information, both PEs will be
   capable of failing over to an alternate.

   Providing information about which PE the PU should fail over to can
   also be very important.  Consider the scenario presented in the
   following figure.

                   .......................
                   .      +-------+      .
                   .      |       |      .
                   .      |  PE 1 |      .
                   .      |       |      .
                   .      +-------+      .
                   .                     .
                   .     server pool     .
                   .                     .
                   .                     .
    +-------+      .      +-------+      .       +-------+
    |       |      .      |       |      .       |       |
    |  PU 1 |------.------|  PE 2 |------.-------|  PU 2 |
    |       |      .      |       |      .       |       |
    +-------+      .      +-------+      .       +-------+
                   .                     .
                   .                     .
                   .                     .
                   .                     .
                   .      +-------+      .
                   .      |       |      .
                   .      |  PE 3 |      .
                   .      |       |      .
                   .      +-------+      .
                   .......................
                 Two PUs accessing the same PE

                                 Figure 4

   PU 1 is using PE 2 of the server pool.  Assume that PE 1 and PE 2
   share state, but PE 2 and PE 3 do not.  Using its business card,
   PE 2 can inform PU 1 that it should fail over to PE 1 in case of a
   failure.


   A slightly more complicated situation arises if two pool users, PU 1
   and PU 2, both use PE 2 and need to be served by the same PE.  Then
   it is important that PU 1 and PU 2 fail over to the same PE.  This
   can be achieved by PE 2 giving the same business card to PU 1 and
   PU 2.
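
   How a PU might act on a business card can be sketched as below.  The
   function select_failover and the representation of a business card
   as an ordered list of PE names are assumptions for illustration.
   The fallback to the PEs obtained from handle resolution is likewise
   an assumed policy.

```python
def select_failover(business_card, resolved_pes, failed):
    """Pick the next PE: business card entries first, then any other PE."""
    for pe in list(business_card) + list(resolved_pes):
        if pe not in failed:
            return pe
    return None   # no PE of the pool is left


# PE 2 hands the same card to both PUs because it shares state with PE 1.
card_from_pe2 = ["PE 1"]
pool = ["PE 1", "PE 2", "PE 3"]

pu1_choice = select_failover(card_from_pe2, pool, failed={"PE 2"})
pu2_choice = select_failover(card_from_pe2, pool, failed={"PE 2"})
```

   Because both PUs hold the same business card, they arrive at the
   same PE without coordinating with each other.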

2.3.2  Cookies

   Cookies may optionally be sent from the PE to the PU.  The PU only
   stores the last received cookie.  In case of a failover, the PU sends
   this last received cookie to the new PE.  This method provides a
   simple way of state sharing between the PEs.  Please note that the
   old PE should sign the cookie and the receiving PE should verify the
   signature.  For the PU, the cookie has no structure; it is only
   stored and transmitted to the new PE.
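
   One possible realization of signed cookies is sketched below.  This
   document does not specify a signature scheme.  The use of
   HMAC-SHA-256 with a key shared among the PEs of a pool, the cookie
   layout (tag followed by state), and all names in the example are
   assumptions.

```python
import hashlib
import hmac

POOL_KEY = b"example-shared-secret"        # hypothetical key known to all PEs
TAG_LEN = hashlib.sha256().digest_size     # 32 bytes


def make_cookie(state: bytes, key: bytes = POOL_KEY) -> bytes:
    # Old PE: prepend an authentication tag to the opaque state.
    return hmac.new(key, state, hashlib.sha256).digest() + state


def open_cookie(cookie: bytes, key: bytes = POOL_KEY) -> bytes:
    # New PE: verify the tag before trusting the state.
    tag, state = cookie[:TAG_LEN], cookie[TAG_LEN:]
    expected = hmac.new(key, state, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("cookie signature mismatch")
    return state


class PoolUser:
    def __init__(self):
        self.cookie = None

    def on_cookie(self, cookie: bytes) -> None:
        self.cookie = cookie   # only the last received cookie is kept


pu = PoolUser()
pu.on_cookie(make_cookie(b"offset=1024"))
pu.on_cookie(make_cookie(b"offset=2048"))
restored = open_cookie(pu.cookie)   # performed by the new PE after failover
```

   The PU never inspects the cookie; it stores the bytes and hands them
   to the new PE unchanged.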

2.4  Typical Interactions between RSerPool Components

   The following drawing shows the typical RSerPool components and their
   possible interactions with each other:


   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   ~                                                operational scope ~
   ~  .........................          .........................    ~
   ~  .        server pool 1  .          .        server pool 2  .    ~
   ~  .  +-------+ +-------+  .    (d)   .  +-------+ +-------+  .    ~
   ~  .  |PE(1,A)| |PE(1,C)|<-------------->|PE(2,B)| |PE(2,A)|<---+  ~
   ~  .  +-------+ +-------+  .          .  +-------+ +-------+  . |  ~
   ~  .      ^            ^   .          .      ^         ^      . |  ~
   ~  .      |      (a)   |   .          .      |         |      . |  ~
   ~  .      +----------+ |   .          .      |         |      . |  ~
   ~  .  +-------+      | |   .          .      |         |      . |  ~
   ~  .  |PE(1,B)|<---+ | |   .          .      |         |      . |  ~
   ~  .  +-------+    | | |   .          .      |         |      . |  ~
   ~  .      ^        | | |   .          .      |         |      . |  ~
   ~  .......|........|.|.|....          .......|.........|....... |  ~
   ~         |        | | |                     |         |        |  ~
   ~      (c)|     (a)| | |(a)               (a)|      (a)|     (c)|  ~
   ~         |        | | |                     |         |        |  ~
   ~         |        v v v                     v         v        |  ~
   ~         |     +++++++++++++++    (e)     +++++++++++++++      |  ~
   ~         |     + ENRP server +<---------->+ ENRP server +      |  ~
   ~         |     +++++++++++++++            +++++++++++++++      |  ~
   ~         v            ^                          ^             |  ~
   ~     *********        |                          |             |  ~
   ~     * PU(A) *<-------+                       (b)|             |  ~
   ~     *********   (b)                             |             |  ~
   ~                                                 v             |  ~
   ~         :::::::::::::::::      (f)      *****************     |  ~
   ~         : other clients :<------------->* proxy/gateway * <---+  ~
   ~         :::::::::::::::::               *****************        ~
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            RSerPool components and their possible interactions.


                                 Figure 5

   In this figure we can identify the following possible interactions:

   (a) server pool elements <-> ENRP server: (ASAP)  Each PE in a pool
      uses ASAP to register or de-register itself as well as to exchange
      other auxiliary information with the ENRP server.  The ENRP server
      also uses ASAP to monitor the operational status of each PE in a
      pool.

   (b) PU <-> ENRP server: (ASAP)  A PU normally uses ASAP to request a
      pool handle to address translation from the ENRP server before
      the PU can send user messages addressed to a server pool by the
      pool's handle.


   (c) PU <-> PE: (ASAP)  ASAP can be used to exchange auxiliary
      information between the two parties before they engage in user
      data transfer.

   (d) server pool <-> server pool: (ASAP)  A PE in a server pool can
      become a PU to another pool when the PE tries to initiate
      communication with the other pool.  In such a case, the
      interactions described in (b) and (c) above will apply.

   (e) ENRP server <-> ENRP server: (ENRP)  ENRP can be used to fulfill
      various handle space operation, administration, and maintenance
      (OAM) functions.

   (f) other clients <-> proxy/gateway: (standard protocols)  The
      proxy/gateway enables clients that are not RSerPool aware
      ("other clients") to access services provided by an RSerPool
      based server pool.  It should be noted that these proxies and
      gateways may become a single point of failure.
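
   The interactions listed above can be summarized as a small lookup
   table.  The following Python sketch is illustrative only; the names
   Interaction, INTERACTIONS and protocol_between are hypothetical and
   are not defined by any RSerPool specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    label: str        # letter used in Figure 5
    endpoints: tuple  # the two communicating component types
    protocol: str     # protocol named in the list above


INTERACTIONS = [
    Interaction("a", ("PE", "ENRP server"), "ASAP"),
    Interaction("b", ("PU", "ENRP server"), "ASAP"),
    Interaction("c", ("PU", "PE"), "ASAP"),
    Interaction("d", ("server pool", "server pool"), "ASAP"),
    Interaction("e", ("ENRP server", "ENRP server"), "ENRP"),
    Interaction("f", ("other clients", "proxy/gateway"),
                "standard protocols"),
]


def protocol_between(x, y):
    """Return the protocol used between two component types, or None."""
    for interaction in INTERACTIONS:
        # Endpoint order does not matter, so compare as sets.
        if set(interaction.endpoints) == {x, y}:
            return interaction.protocol
    return None
```

   For instance, protocol_between("ENRP server", "PU") yields "ASAP",
   while a pair that does not interact directly yields None.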


3  Examples

   In this section the basic concepts behind ENRP and ASAP are
   illustrated through examples.  First, an RSerPool aware FTP server
   and RSerPool aware clients are presented.  Second, a scenario with
   an RSerPool aware server and an RSerPool unaware client shows how
   to use RSerPool with legacy clients or in a situation where
   exposing the list of addresses associated with the handle space to
   the PU is undesirable.  This requirement has been expressed by some
   telephony network operators who are concerned about potential
   network address mapping.  The last two examples illustrate load
   balancing and telephony scenarios.

3.1  Two File Transfer Examples

   In this section we present two file transfer examples using ENRP and
   ASAP: one with an RSerPool aware client, and one with an RSerPool
   unaware client that uses a Proxy or Gateway to perform the file
   transfer.  In these examples we will use an FTP [RFC0959] model with
   some modifications.  In the first example (client is RSerPool aware)
   we will modify FTP concepts so that the file transfer takes place
   over SCTP.  In the second example, we will use TCP between the
   RSerPool unaware client and the Proxy.  The Proxy itself will use
   the modified FTP with RSerPool as illustrated in the first example.

   Please note that in the example we do NOT follow FTP RFC959 [RFC0959]
   precisely but use FTP-like concepts and attempt to adhere to the
   basic FTP model.  These examples use FTP for illustrative purposes.
   FTP was chosen since many of its basic concepts are well known and
   should be familiar to readers.

3.1.1  The RSerPool Aware Client

   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   ~                                                operational scope ~
   ~  .........................                                       ~
   ~  . "file transfer pool"  .                                       ~
   ~  .  +-------+ +-------+  .                                       ~
   ~ +-> |PE(1,A)| |PE(1,C)|  .                                       ~
   ~ |.  +-------+ +-------+  .                                       ~
   ~ |.      ^            ^   .                                       ~
   ~ |.      +----------+ |   .                                       ~
   ~ |.  +-------+      | |   .                                       ~
   ~ |.  |PE(1,B)|<---+ | |   .                                       ~
   ~ |.  +-------+    | | |   .                                       ~
   ~ |.      ^        | | |   .                                       ~
   ~ |.......|........|.|.|....                                       ~
   ~ |  ASAP |    ASAP| | |ASAP                                       ~
   ~ |(d)    |(c)     | | |                                           ~
   ~ |       v        v v v                                           ~
   ~ |   *********   +++++++++++++++                                  ~
   ~ + ->* PU(X) *   + ENRP server +                                  ~
   ~     *********   +++++++++++++++                                  ~
   ~         ^     ASAP     ^                                         ~
   ~         |     <-(b)    |                                         ~
   ~         +--------------+                                         ~
   ~               (a)->                                              ~
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

               Architecture for RSerPool aware client.

                                 Figure 6

   To effect a file transfer the following steps would take place.

   1.  The application in PU(X) sends a login request.  The PU(X)'s ASAP
       layer sends an ASAP request to an ENRP server to request the list
       of pool elements (using (a)).  The pool handle to identify the
       pool is "File Transfer Pool".  The ASAP layer queues the login
       request.

   2.  The ENRP server returns a list of the three PEs PE(1,A), PE(1,B)
       and PE(1,C) to the ASAP layer in PU(X) (using (b)).


   3.  The ASAP layer selects one of the PEs, for example PE(1,B).  It
       transmits the login request and the other FTP control data.
       Finally, it starts the transmission of the requested files (using
       (c)).  Note that optionally, the multiple stream feature of SCTP
       could be used.

   4.  Suppose that during the file transfer, PE(1,B) fails.  If the
       PEs are sharing file transfer state, a fail-over to PE(1,A)
       could be initiated.  PE(1,A) then continues the transfer until
       complete (see (d)).  In parallel, a request from PE(1,A) is made
       to the ENRP server to request a cache update for the server pool
       "File Transfer Pool".  Furthermore, a report is generated that
       PE(1,B) is non-responsive.  This would trigger appropriate
       audits that may remove PE(1,B) from the pool (if the ENRP server
       had not already detected the failure) (using (a)).
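
   The four steps above can be sketched in Python as follows.  This is
   a toy, in-memory model and not an implementation of ASAP or ENRP:
   the classes EnrpServer and PoolElement, the function transfer, the
   block-based file model and the blocks_before_crash knob are all
   assumptions made for illustration.  The shared dictionary
   "progress" stands in for the file transfer state shared by the PEs.

```python
class EnrpServer:
    """Toy ENRP server: maps a pool handle to its registered PEs."""

    def __init__(self):
        self._handlespace = {}

    def register(self, handle, pe):
        self._handlespace.setdefault(handle, []).append(pe)

    def resolve(self, handle):
        # Steps 1 and 2: the PU asks for, and receives, the PE list.
        return list(self._handlespace.get(handle, []))

    def report_unreachable(self, handle, pe):
        # Step 4: a real server would audit the PE first; this toy
        # model removes it as soon as it is reported.
        pes = self._handlespace.get(handle, [])
        if pe in pes:
            pes.remove(pe)


class PoolElement:
    """Toy PE serving file blocks; 'progress' is shared by the pool."""

    def __init__(self, name, files, progress, blocks_before_crash=None):
        self.name = name
        self._files = files
        self._progress = progress
        self._blocks_before_crash = blocks_before_crash

    def next_block(self, filename):
        if self._blocks_before_crash == 0:
            raise ConnectionError(self.name + " is unreachable")
        index = self._progress.get(filename, 0)
        blocks = self._files[filename]
        if index >= len(blocks):
            return None  # transfer complete
        self._progress[filename] = index + 1
        if self._blocks_before_crash is not None:
            self._blocks_before_crash -= 1
        return blocks[index]


def transfer(enrp, handle, filename):
    """Fetch all blocks of 'filename', failing over between PEs."""
    received, served_by = [], []
    candidates = enrp.resolve(handle)
    while candidates:
        pe = candidates[0]  # step 3: trivial selection policy
        try:
            block = pe.next_block(filename)
            while block is not None:
                received.append(block)
                served_by.append(pe.name)
                block = pe.next_block(filename)
            return received, served_by
        except ConnectionError:
            # Step 4: report the failure and refresh the PE list.
            enrp.report_unreachable(handle, pe)
            candidates = enrp.resolve(handle)
    raise RuntimeError("no reachable pool element for " + repr(handle))


files = {"readme.txt": ["block0", "block1", "block2"]}
progress = {}
enrp = EnrpServer()
enrp.register("File Transfer Pool",
              PoolElement("PE(1,B)", files, progress,
                          blocks_before_crash=1))
enrp.register("File Transfer Pool",
              PoolElement("PE(1,A)", files, progress))
blocks, served_by = transfer(enrp, "File Transfer Pool", "readme.txt")
```

   In this run PE(1,B) serves the first block and then fails; PE(1,A)
   resumes from the shared state, so no block is lost or repeated.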

3.1.2  The RSerPool Unaware Client

   In this example we investigate the use of a Proxy server, assuming
   the same scenario as illustrated above.

   In this example the following steps occur:

   1.  The FTP client and the Proxy/Gateway are using the TCP-based FTP
       protocol.  The client sends the login request to the proxy (using
       (e)).

   2.  The proxy behaves like a client and performs the actions
       described under (1), (2) and (3) of the above description (using
       (a), (b) and (c)).

   3.  The FTP communication continues and will be translated by the
       proxy into the RSerPool aware dialect.  This interworking uses
       (f) and (c).

   Note that in this example high availability is maintained between the
   Proxy and the server pool, but a single point of failure exists
   between the FTP client and the Proxy, i.e., the TCP command
   connection and the single IP address it uses.
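
   A minimal sketch of the proxy's role, assuming a hypothetical Proxy
   class with injected resolve and send callables (neither is a real
   API), is given below.  It shows that the legacy client only ever
   talks to the proxy and never learns the PE addresses, while the
   proxy itself fails over within the pool.

```python
class Proxy:
    """Toy proxy/gateway: legacy clients on one side, a pool on the other."""

    def __init__(self, pool_handle, resolve, send):
        self._pool_handle = pool_handle
        self._resolve = resolve  # handle -> list of PE addresses, (a)/(b)
        self._send = send        # (address, command) -> reply, (c)
        self._cache = []

    def handle_client_command(self, command):
        """Forward one legacy command, (e), and return the reply, (f)."""
        if not self._cache:
            self._cache = list(self._resolve(self._pool_handle))
        for address in list(self._cache):
            try:
                return self._send(address, command)
            except ConnectionError:
                self._cache.remove(address)  # try the next PE
        raise ConnectionError("no pool element reachable")


def fake_resolve(handle):
    # Stand-in for the ENRP query; returns made-up PE identifiers.
    return ["PE(1,B)", "PE(1,A)"]


def fake_send(address, command):
    # Stand-in for the SCTP-based exchange; PE(1,B) is assumed down.
    if address == "PE(1,B)":
        raise ConnectionError("PE(1,B) is down")
    return "OK " + command


proxy = Proxy("File Transfer Pool", fake_resolve, fake_send)
reply = proxy.handle_client_command("USER anonymous")
```

   The proxy remains the single point of failure noted above: if it
   stops, the client has no alternative path to the pool.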


   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   ~                                                operational scope ~
   ~  .........................                                       ~
   ~  . "file transfer pool"  .                                       ~
   ~  .  +-------+ +-------+  .                                       ~
   ~  .  |PE(1,A)| |PE(1,C)|  .                                       ~
   ~  .  +-------+ +-------+  .                                       ~
   ~  .      ^            ^   .                                       ~
   ~  .      +----------+ |   .                                       ~
   ~  .  +-------+      | |   .                                       ~
   ~  .  |PE(1,B)|<---+ | |   .                                       ~
   ~  .  +-------+    | | |   .                                       ~
   ~  .......^........|.|.|....                                       ~
   ~         |        | | |                                           ~
   ~         |    ASAP| | |ASAP                                       ~
   ~         |        | | |                                           ~
   ~         |        v v v                                           ~
   ~         |       +++++++++++++++          +++++++++++++++         ~
   ~         |       + ENRP server +<--ENRP-->+ ENRP server +         ~
   ~         |       +++++++++++++++          +++++++++++++++         ~
   ~         |                                ASAP   ^                ~
   ~         |     ASAP       (c)                (b) |  ^             ~
   ~         +---------------------------------+  |  |  |             ~
   ~                                           |  v  | (a)            ~
   ~                                           v     v                ~
   ~         :::::::::::::::::     (e)->     *****************        ~
   ~         :   FTP client  :<------------->* proxy/gateway *        ~
   ~         :::::::::::::::::     (f)       *****************        ~
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                Architecture for RSerPool unaware client.

                                 Figure 7

3.2  Load Balancing Example

   This example is similar to the one above describing an RSerPool
   unaware client.  In both examples the clients do not need to support
   the RSerPool protocol suite.

   There are several servers in a pool and the traffic from clients is
   distributed among them by a load balancer.  The load balancer can
   make use of load information provided by the servers for optimal load
   distribution.

   One possibility of using RSerPool for this application is described
   in the next figure.  The servers become pool elements in a pool and
   register themselves with ENRP servers.  They can also provide load
   information.  The load balancer acts as a pool user and gets the
   addresses and possibly the load information via ASAP communication
   with ENRP servers.  The communication between the clients and servers
   is not affected.

   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   ~                                                operational scope ~
   ~  .........................                                       ~
   ~  .    "server pool"      .                                       ~
   ~  .  +-------+ +-------+  .                                       ~
   ~  .  |PE(1,A)| |PE(1,C)|  .                                       ~
   ~  .  +-------+ +-------+  .                                       ~
   ~  .      ^            ^   .                                       ~
   ~  .      +----------+ |   .                                       ~
   ~  .  +-------+      | |   .                                       ~
   ~  .  |PE(1,B)|<---+ | |   .                                       ~
   ~  .  +-------+    | | |   .                                       ~
   ~  .......^........|.|.|....                                       ~
   ~         |        | | |                                           ~
   ~         |    ASAP| | |ASAP                                       ~
   ~         |        | | |                                           ~
   ~         |        v v v                                           ~
   ~         |       +++++++++++++++          +++++++++++++++         ~
   ~         |       + ENRP server +<--ENRP-->+ ENRP server +         ~
   ~         |       +++++++++++++++          +++++++++++++++         ~
   ~         |                                       ^                ~
   ~         |               (c)                     |                ~
   ~         +---------------------------------+     | ASAP           ~
   ~                                           |     | (a)            ~
   ~                                           v     v                ~
   ~         :::::::::::::::::      (b)    **********************     ~
   ~         :     client    :<----------->* load balancer (PU) *     ~
   ~         :::::::::::::::::             **********************     ~
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
              Architecture for an RSerPool based load balancer.

                                 Figure 8
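
   One possible selection policy for such a load balancer is sketched
   below in Python.  The "least loaded" policy, the representation of
   load as a number between 0.0 and 1.0, and the names
   pick_least_loaded and LoadBalancer are assumptions made for
   illustration; this document does not prescribe a policy.

```python
def pick_least_loaded(loads):
    """Return the PE with the lowest reported load.

    'loads' maps a PE name to its reported load.  Ties are broken by
    PE name so that the result is deterministic.
    """
    if not loads:
        raise LookupError("pool is empty")
    return min(sorted(loads), key=loads.__getitem__)


class LoadBalancer:
    """Toy load balancer acting as a pool user."""

    def __init__(self, resolve):
        # resolve: handle -> {PE name: load}; stands in for (a).
        self._resolve = resolve

    def dispatch(self, handle, request):
        # Choose a PE for a client request received via (b); the
        # caller would then forward the request to it via (c).
        pe = pick_least_loaded(self._resolve(handle))
        return pe, request


loads = {"PE(1,A)": 0.7, "PE(1,B)": 0.2, "PE(1,C)": 0.2}
balancer = LoadBalancer(lambda handle: loads)
chosen, forwarded = balancer.dispatch("server pool", "request-1")
```

   With the loads shown, PE(1,B) is chosen: it shares the lowest load
   with PE(1,C) and sorts first by name.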

3.3  Telephony Signaling Example

   This example shows the use of ASAP/RSerPool to support server pooling
   for high availability of a telephony application such as a Voice over
   IP Gateway Controller (GWC) and Gatekeeper services (GK).

   In this example, we show two different scenarios of deploying these
   services using RSerPool in order to illustrate the flexibility of the
   RSerPool architecture.


3.3.1  Decomposed GWC and GK Scenario

   In this scenario, the GWC and GK services are deployed as separate
   pools, each with some number of PEs, as shown in the following
   diagram.  Each pool registers its unique pool handle with the ENRP
   server.  We also assume that a Signaling Gateway (SG) and a Media
   Gateway (MG) are present and that both are RSerPool aware.

                              ...................
                              .    gateway      .
                              . controller pool .
       .................      .   +-------+     .
       .   gatekeeper  .      .   |PE(2,A)|     .
       .     pool      .      .   +-------+     .
       .   +-------+   .      .   +-------+     .
       .   |PE(1,A)|   .      .   |PE(2,B)|     .
       .   +-------+   .      .   +-------+     .
       .   +-------+   . (d)  .   +-------+     .
       .   |PE(1,B)|<------------>|PE(2,C)|<-------------+
       .   +-------+   .      .   +-------+     .        |
       .................      ........^..........        |
                                      |                  |
                                   (c)|               (e)|
                                      |                  v
           +++++++++++++++        *********       *****************
           + ENRP server +        * SG(X) *       * media gateway *
           +++++++++++++++        *********       *****************
                  ^                   ^
                  |                   |
                  |     <-(a)         |
                  +-------------------+
                         (b)->

               Deployment of Decomposed GWC and GK.

                                 Figure 9

   As shown in the previous figure, the following sequence takes place:

   1.  The Signaling Gateway (SG) receives an incoming signaling message
       to be forwarded to the GWC.  SG(X)'s ASAP layer sends an ASAP
       request to its "local" ENRP server to request the list of pool
       elements (PEs) of the GWC (using (a)).  The handle used for this
       query is the pool handle of the GWC.  The ASAP layer queues the
       data to be sent to the GWC in local buffers until the ENRP server
       responds.


   2.  The ENRP server returns a list of the three PEs PE(2,A), PE(2,B)
       and PE(2,C) to the ASAP layer in SG(X) together with information
       to be used for load-sharing traffic across the gateway controller
       pool (using (b)).

   3.  The ASAP layer in SG(X) will select one PE (e.g., PE(2,C)) and
       send the signaling message to it (using (c)).  The selection is
       based on the load sharing information of the gateway controller
       pool.

   4.  To progress the call, PE(2,C) finds that it needs to talk to the
       Gatekeeper.  Assuming it has the gatekeeper pool's information in
       its local cache (e.g., obtained and stored from a recent query to
       the ENRP server), PE(2,C) selects PE(1,B) and sends the call
       control message (using (d)).

   5.  We assume PE(1,B) responds to PE(2,C) and authorizes the call to
       proceed.

   6.  PE(2,C) issues media control commands to the Media Gateway (using
       (e)).
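
   Steps 1 to 3 (queue the data, wait for the ENRP response, then
   select a PE and send) can be sketched as follows.  The AsapLayer
   class, its callback parameters, the numeric weights and the
   "highest weight wins" policy are hypothetical; they only illustrate
   the queueing behaviour described above.

```python
from collections import deque


def highest_weight(pes_with_weights):
    """Hypothetical policy: pick the PE with the largest weight."""
    return max(pes_with_weights, key=lambda entry: entry[1])[0]


class AsapLayer:
    """Toy PU-side ASAP layer with a pool cache and send queues."""

    def __init__(self, query_enrp, transmit, select):
        self._query_enrp = query_enrp  # handle -> None; request (a)
        self._transmit = transmit      # (pe, message) -> None; (c)
        self._select = select          # [(pe, weight), ...] -> pe
        self._cache = {}
        self._pending = {}

    def send(self, handle, message):
        if handle in self._cache:
            pe = self._select(self._cache[handle])
            self._transmit(pe, message)
            return
        # Step 1: queue the data; query only once per unknown handle.
        first_for_handle = handle not in self._pending
        self._pending.setdefault(handle, deque()).append(message)
        if first_for_handle:
            self._query_enrp(handle)

    def on_enrp_response(self, handle, pes_with_weights):
        # Step 2: store the answer; step 3: flush the queue in order.
        self._cache[handle] = list(pes_with_weights)
        queue = self._pending.pop(handle, deque())
        while queue:
            pe = self._select(self._cache[handle])
            self._transmit(pe, queue.popleft())


queries, sent = [], []
asap = AsapLayer(queries.append,
                 lambda pe, msg: sent.append((pe, msg)),
                 highest_weight)
asap.send("GWC", "msg-1")
asap.send("GWC", "msg-2")
sent_before_response = list(sent)  # still empty: data is queued
asap.on_enrp_response(
    "GWC", [("PE(2,A)", 1), ("PE(2,B)", 2), ("PE(2,C)", 5)])
```

   Only one query is issued for the two queued messages, and both are
   delivered in order to PE(2,C) once the response arrives.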

   RSerPool provides service robustness if a failure occurs in the
   system.

   For example, suppose PE(1,B) in the gatekeeper pool crashes after
   receiving the call control message from PE(2,C) in step 4 above.
   Most likely, due to the absence of a reply from the Gatekeeper, a
   timer expiration event will trigger the call state machine within
   PE(2,C) to resend the control message.  The ASAP layer at PE(2,C)
   will then notice the failure of PE(1,B) through the endpoint
   unreachability detection of the transport protocol beneath ASAP and
   automatically deliver the re-sent call control message to the
   alternate GK pool member PE(1,A).  With appropriate intra-pool call
   state sharing support, PE(1,A) will correctly handle the call and
   reply to PE(2,C), and hence progress the call.

3.3.2  Collocated GWC and GK Scenario

   In this scenario, the GWC and GK services are collocated (e.g., they
   are implemented as a single process).  In this case, one can form a
   pool that provides both GWC and GK services as shown in the figure
   below.

   The same sequence as described in Section 3.3.1 takes place, except
   that step 4 now becomes internal to PE(3,C).  Again, we assume
   server C is selected by the SG.


        ........................................
        .  gateway controller/gatekeeper pool  .
        .                  +-------+           .
        .                  |PE(3,A)|           .
        .                  +-------+           .
        .           +-------+                  .
        .           |PE(3,C)|<---------------------------+
        .           +-------+                  .         |
        .    +-------+  ^                      .         |
        .    |PE(3,B)|  |                      .         |
        .    +-------+  |                      .         |
        ................|.......................         |
                        |                                |
                        +-------------+                  |
                                      |                  |
                                   (c)|               (e)|
                                      v                  v
           +++++++++++++++        *********       *****************
           + ENRP server +        * SG(X) *       * media gateway *
           +++++++++++++++        *********       *****************
                  ^                   ^
                  |                   |
                  |     <-(a)         |
                  +-------------------+
                         (b)->

               Deployment of Collocated GWC and GK.

                                 Figure 10


4  Security Considerations

   The RSerPool protocol must allow us to secure the RSerPool
   infrastructure.  There are security and privacy issues that relate to
   the handle space, pool element registration and user queries of the
   handle space.  In [I-D.ietf-rserpool-threats] a complete threat
   analysis of RSerPool components is presented.


5  IANA Considerations

   There are no actions needed.


6  Acknowledgments

   The authors would like to thank Bernard Aboba, Phillip Conrad, Harrie
   Hazewinkel, Matt Holdrege, Lyndon Ong, Christopher Ross, Maureen
   Stillman, Werner Vogels and many others for their invaluable comments
   and suggestions.


7  References

7.1  Normative References

   [I-D.ietf-rserpool-threats]
              Stillman, M., "Threats Introduced by Rserpool and
              Requirements for Security in response to Threats",
              draft-ietf-rserpool-threats-05 (work in progress),
              July 2005.

   [RFC2026]  Bradner, S., "The Internet Standards Process -- Revision
              3", BCP 9, RFC 2026, October 1996.

7.2  Informative References

   [I-D.ietf-rserpool-comp]
              Loughney, J., "Comparison of Protocols for Reliable Server
              Pooling", draft-ietf-rserpool-comp-10 (work in progress),
              July 2005.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [RFC0959]  Postel, J. and J. Reynolds, "File Transfer Protocol",
              STD 9, RFC 959, October 1985.

   [RFC2608]  Guttman, E., Perkins, C., Veizades, J., and M. Day,
              "Service Location Protocol, Version 2", RFC 2608,
              June 1999.

   [RFC2719]  Ong, L., Rytina, I., Garcia, M., Schwarzbauer, H., Coene,
              L., Lin, H., Juhasz, I., Holdrege, M., and C. Sharp,
              "Framework Architecture for Signaling Transport",
              RFC 2719, October 1999.

   [RFC2960]  Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
              Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M.,
              Zhang, L., and V. Paxson, "Stream Control Transmission
              Protocol", RFC 2960, October 2000.

   [RFC3237]  Tuexen, M., Xie, Q., Stewart, R., Shore, M., Ong, L.,
              Loughney, J., and M. Stillman, "Requirements for Reliable
              Server Pooling", RFC 3237, January 2002.


Authors' Addresses

   Michael Tuexen (editor)
   Muenster Univ. of Applied Sciences
   Stegerwaldstr. 39
   48565 Steinfurt
   Germany

   Email: tuexen@fh-muenster.de


   Qiaobing Xie
   Motorola, Inc.
   1501 W. Shure Drive, #2309
   Arlington Heights, IL  60004
   USA

   Phone: +1-847-632-3028
   Email: qxie1@email.mot.com


   Randall R. Stewart
   Cisco Systems, Inc.
   8725 West Higgins Road
   Suite 300
   Chicago, IL  60631
   USA

   Phone: +1-815-477-2127
   Email: rrs@cisco.com


   Melinda Shore
   Cisco Systems, Inc.
   809 Hayts Rd
   Ithaca, NY  14850
   USA

   Phone: +1 607 272 7512
   Email: mshore@cisco.com


   John Loughney
   Nokia Research Center
   PO Box 407
   FIN-00045 Nokia Group
   Finland

   Email: john.loughney@nokia.com


   Aron J. Silverton
   Motorola Labs
   1301 E. Algonquin Road
   Room 2246
   Schaumburg, IL  60196
   US

   Phone: +1 847-576-8747
   Email: aron.j.silverton@motorola.com


Full Copyright Statement

   Copyright (C) The Internet Society (2006).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).

