2.8.14 Reliable Server Pooling (rserpool)

NOTE: This charter is a snapshot of the 51st IETF Meeting in London, England. It may now be out-of-date. Last Modified: 31-Jul-01

Chair(s):

Lyndon Ong <lyong@ciena.com>
Maureen Stillman <maureen.stillman@nokia.com>

Transport Area Director(s):

Scott Bradner <sob@harvard.edu>
Allison Mankin <mankin@isi.edu>

Transport Area Advisor:

Scott Bradner <sob@harvard.edu>

Technical Advisor(s):

Ned Freed <ned.freed@mrochek.com>

Mailing Lists:

General Discussion: rserpool@ietf.org
To Subscribe: rserpool-request@ietf.org
In Body: subscribe email_address
Archive: ftp://ftp.ietf.org/ietf-mail-archive/rserpool/

Description of Working Group:

The purpose of the WG is to develop an architecture and protocols for the management and operation of server pools supporting highly reliable applications, and for client access mechanisms to a server pool.

The WG will define architecture and requirements for management and access to server pools, including requirements from a variety of applications, building blocks and interfaces, different styles of pooling, security requirements and performance requirements, such as failover times and coping with heterogeneous latencies. This will be documented in an Informational RFC.

Scope:

The working group will focus on supporting high availability and scalability of applications through the use of pools of servers. This requires both a way to keep track of what servers are in the pool and are able to receive requests and a way for the client to bind to a desired server.

The Working Group will NOT address:

1) reliable multicast protocols - the use of multicast for reliable server pooling is optional. Reliable multicast protocols will be developed by the RMT WG.

2) synchronization/consistency of data between server pool elements, e.g. shared memory

3) mechanisms for sharing state information between server pool elements

4) Transaction failover. If a server fails during processing of a transaction this transaction may be lost. Some services may provide a way to handle the failure, but this is not guaranteed.

The WG will address client access mechanisms for server pools, specifically:

1) An access mechanism that allows geographically dispersed servers in the pool

2) A client-server binding mechanism that allows dynamic assignment of client to servers based on load balancing or application specific assignment policies.

3) Support of automatic reconfiguration of the client/server binding in case of server failure or administrative changes.

To the extent that new protocols are necessary to support the requirements for server pooling, these will be documented in a Standards Track RFC on client access to a binding service (i.e. name space) protocol.

The WG will also address use of proxying to interwork existing client access mechanisms to any new binding service.

The WG will address server pool management and a distributed service to support client/server binding, including:

1) A scalable mechanism for tracking server pool membership (incl. registration)

2) A scalable protocol for performing node failure detection, reconfiguration and failover, and otherwise managing the server pool (supporting caching, membership, query, authentication, and security)

3) A distributed service to support binding of clients to servers, based on information specific to the server pool. Given that this service is essential to access the server pool, a high degree of availability is necessary.

4) A means for allowing flexible load assignment and balancing policies

The protocols and procedures for server pool management will be documented in a Standards Track RFC.

The WG will address:

- transport protocol(s) that would be supported (e.g. UDP, SCTP, TCP)

- any new congestion management issues

- relationship to existing work such as URI resolution mechanisms

Rserpool will consult with other IETF working groups such as Reliable multicast, DNS extensions, AAA, URN, WREC and Sigtran as appropriate and will not duplicate any of these efforts.

Goals and Milestones:

Done    Initial draft of RSPool Requirements And Architecture document
Jan 01  Submit Reqts and Architecture draft to IESG for consideration as an Informational RFC
Mar 01  Initial draft of Binding Service document
Jun 01  Initial draft of Client/server binding and Server Pool Management document
Sep 01  Submit drafts of Binding Service and Server Pool Management to IESG for consideration as Proposed Standard RFCs

Internet-Drafts:
No Request For Comments

Current Meeting Report

Minutes of Reliable Server Pooling WG, Monday August 6, 2001
IETF 51
Co-Chairs: Lyndon Ong (lyong@ciena.com)
Maureen Stillman (maureen.stillman@nokia.com)

54 people attended this meeting.

Agenda:

1) Status of the Proposed Informational RFC on Requirements for Reliable Server Pooling

2) Open Issues on WG ASAP and ENRP docs

Speakers: Randy Stewart, Qiaobing Xie
draft-ietf-rserpool-asap-00.txt
draft-ietf-rserpool-enrp-00.txt

3) Candidate ENRP and ASAP

Speaker: Ram Gopal
draft-gopal-asap-candidate-00.txt
draft-gopal-enrp-candidate-00.txt
Comparison with requirements and with WG items ASAP and ENRP.

4) Comments on comparison draft

Speaker: John Loughney
draft-ietf-rserpool-comp-01.txt

5) RSERPOOL naming and selection functional support using SLP

Speaker: Erik Guttman

6) General Discussion

7) Next Steps

Summary of Agenda Items

1) Status of the Proposed Informational RFC on Requirements for Reliable Server Pooling

Currently the requirements document is on track for approval as an informational RFC. The IESG will review this document shortly.

2) Open Issues on WG ASAP and ENRP docs

ASAP:

Randy Stewart presented open issues and changes to ASAP. Changes agreed upon will be documented in the next version of the Internet draft.

1. The message format should be changed to TLV.

2. The ENRP server discovery needs work to add manual, mcast channel, and DNS alternatives.

3. Load balancing needs work.

4. We discussed the "business card" idea - exchange pool handle of PU but concerns were expressed about security and transparency issues. The conclusion was that this is better handled at the application layer.

5. Do we have any structure to names? Can they indicate more than the service, i.e., characteristics desired? Should the session layer provide more complex semantics? Conclusion: we can add this later; it is not currently a priority. Also, PEs should be identical for failover purposes.

6. The subject of the transport protocol for Rserpool has come up several times. TCP vs. SCTP - do the ADs have a preference for us to choose one? Scott indicated that this is not a requirement. We should weigh the flexibility against the additional complexity. It may be desirable to consider PU-PE separately from PU-ENRP and PE-ENRP, since the transport usage is different. It was pointed out that SCTP would logically be used for interchange with the ENRP server, but that there might be other types of traffic between PU and PE, such as real-time traffic that would use RTP/UDP. PU-PE is more application-dependent.

7. Server discovery issues - There was a discussion of the use of a well-known channel and of concerns about its scalability. There is good reason to look at existing alternatives such as DNS and SLP for this, as they have already figured out many of the issues.

8. Security remains a difficult issue. Options include TLS and HIP.

Security combined with failover is a big problem. Questions were raised about whether we are the group with the right expertise to address this. The problems involve state transfer, especially if you consider mechanisms like cipher block chaining. We agreed to form a small design team to put together a description of the security problems introduced by Rserpool, recommendations on how to address them, and a statement of what is in Rserpool scope. This group should include lower level issues in their discussion, such as hijacking of an IP address.
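As an illustration of ASAP item 1 above (moving the message format to TLV), the following is a minimal sketch of type-length-value encoding and decoding. The minutes do not specify field widths or type codes; the 16-bit type, the 16-bit length (covering header plus value), and the padding to a 4-byte boundary are assumptions chosen for illustration, and the function names are hypothetical.

```python
import struct

HEADER = struct.Struct("!HH")  # assumed layout: 16-bit type, 16-bit length, network byte order


def encode_tlv(msg_type: int, value: bytes) -> bytes:
    """Pack one TLV element; the length field counts header plus value, not padding."""
    length = HEADER.size + len(value)
    padding = (-len(value)) % 4  # assumed: pad each element to a 4-byte boundary
    return HEADER.pack(msg_type, length) + value + b"\x00" * padding


def decode_tlvs(buf: bytes):
    """Return a list of (type, value) pairs parsed from a buffer of concatenated TLVs."""
    elements = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < HEADER.size:
            raise ValueError("truncated TLV header")
        msg_type, length = HEADER.unpack_from(buf, offset)
        if length < HEADER.size or offset + length > len(buf):
            raise ValueError("bad TLV length")
        elements.append((msg_type, buf[offset + HEADER.size:offset + length]))
        offset += length + (-length) % 4  # skip the value and its padding
    return elements
```

A receiver that does not recognize a type can still skip the element using the length field, which is the usual argument for TLV over fixed-position fields.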

ENRP:

Qiaobing Xie presented open issues and changes to ENRP. Changes agreed upon will be documented in the next version of the Internet draft.

1) Peer discovery - how do you discover other ENRP servers when you come up as an ENRP server? Two possible solutions are to use SLP or DNS mechanisms.

2) Synchronization of name space - do we need an audit mechanism? How closely do we need to synchronize name server information? The current assumption based on prototyping is one minute. The soft state as used in SLP might be helpful - this allows the server to specify a refresh interval so that if the server is congested it can space out refresh requests at longer intervals. This solution is scalable.

3) Load balancing - the current mechanism is extremely simple and robust: choose the server that responds fastest. This server will be either the geographically closest one or one that is further away but less loaded, so you choose what is most efficient in general. This also does not require complex processing under heavy load.
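The "choose the server that responds fastest" rule in item 3 can be sketched as follows. This is only an illustration, not the ENRP procedure itself: the `pick_fastest` name and the `probe` callback (which returns a measured response time in seconds, or None if the server did not answer) are assumptions, and a real implementation would more likely query servers in parallel and take the first reply.

```python
from typing import Callable, Iterable, Optional


def pick_fastest(servers: Iterable[str],
                 probe: Callable[[str], Optional[float]]) -> Optional[str]:
    """Return the server with the smallest response time, ignoring non-responders."""
    best = None
    best_rtt = None
    for server in servers:
        rtt = probe(server)
        if rtt is None:
            continue  # no answer: treat the server as unavailable
        if best_rtt is None or rtt < best_rtt:
            best, best_rtt = server, rtt
    return best
```

Because response time reflects both distance and load, no explicit load report from the servers is needed, which matches the point made above about avoiding complex processing under heavy load.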

3) Candidate ENRP and ASAP

Ram Gopal spoke on candidate ENRP and ASAP and compared them with the current WG drafts and identified some open issues.

1. Startup should not be multicast-dependent. The C-draft also proposes a method of load balancing ENRP servers. PE registration with ENRP server - questions on scalability and on the algorithm.

2. Server unreliability is an issue. Server information may be inconsistent due to network access problems; can you trust the information at the "home" server? The proposal is to distribute information across all ENRP servers

3. Heartbeat - security and authorization issues.

4. Auto-deregistration -- Propose that PE should state lifetime when it registers.

5. Security - Propose that endpoint name should be encrypted

6. PU-PE authentication - should it be required?

Some comments: Subgroups are a bad idea; we should create separate pools instead. Authorization is in question - is it needed? The current ENRP load balancing algorithm is in some sense self-regulating, i.e., there is no advantage to violating it - you will go to a more congested or distant PE. ENRP server selection gives you the "best" server for your purposes.

7. ASCII vs. binary message format - advantages to ASCII - SIP?

Disadvantages - efficiency of storage and conversion, e.g., 500 servers with redundancy = 1000 addresses needing to be passed.

8. Load balancing - use of contexts - questions on whether this is a problem

9. Proxy PE - how are proxies handled in the Rserpool architecture?

10. Alternative transport protocols - more debate on the transport protocol
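The auto-deregistration proposal in item 4 (a PE states a lifetime when it registers) amounts to keeping registrations as soft state. A minimal sketch, assuming a hypothetical `PoolRegistry` class and an injectable clock so the behaviour can be checked without waiting; none of these names come from the drafts:

```python
import time
from typing import Callable, Dict, List, Tuple


class PoolRegistry:
    """Soft-state registry: an entry disappears unless the PE re-registers in time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._expiry: Dict[Tuple[str, str], float] = {}  # (pool, pe) -> expiry time

    def register(self, pool: str, pe: str, lifetime: float) -> None:
        """Add or refresh a PE; the PE itself chooses the lifetime in seconds."""
        self._expiry[(pool, pe)] = self._clock() + lifetime

    def members(self, pool: str) -> List[str]:
        """Drop expired entries, then list the PEs still registered in the pool."""
        now = self._clock()
        self._expiry = {key: t for key, t in self._expiry.items() if t > now}
        return sorted(pe for (p, pe) in self._expiry if p == pool)
```

With this approach a PE that crashes needs no explicit deregistration message; it simply stops refreshing and ages out.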

4) Comments on comparison draft

John Loughney discussed the comparison draft.

He has updated the DNS text and needs to update the SLP text with Erik's help. Is the document sufficient, or do we need to consider others? There were some suggestions that we should include a comparison of ENRP and ASAP with others; however, as works in progress these drafts are a moving target.

5) RSERPOOL naming and selection functional support using SLP

Erik is willing to help with analysis and possible extension of SLP in support of Rserpool. SLP has some deficiencies in security, as it tends to be very open.

He suggests mapping a name not just to an endpoint but also to characteristics of that endpoint.

In SLP, multicast is not the only mechanism for finding a DA (directory agent = ENRP server); SLP also allows the use of configured addresses. Erik proposed exploring an ENRP-SLP mapping, which he has already started to work on. In contrast to ENRP, Erik suggests that a "home" server is not needed. The information can be distributed using a mesh mechanism (see reference below). Qiaobing's comment was that the "home" idea in ENRP is for efficiency. The SLP mesh document, entitled "mSLP - Mesh enhanced service location protocol", is draft-zhao-slp-da-interaction-12.txt.

SLP v2 is the current RFC. This RFC requires user agent applications to poll the network to determine if a service appears or disappears. The notification and subscription extensions for SLP address this issue; they are experimental because of a lack of real (vs. lab) implementation. The notification/subscription function is documented in the experimental RFC 3082, entitled "Notification and Subscription for SLP".

We need to consider service agent (SA) fault tolerance in SLP. How easy is it to extend SLP? Can SLP be used to do ENRP server discovery? Support of real-time failover is required for PE discovery. This is not currently possible through SLP and may be out of its scope.

6) General Discussion

Major security issues were raised during the WG discussions. The area director reminded the group of the new encryption directive from the IESG. The directive is outlined in the Internet draft draft-ietf-saag-whyenc-00.txt.

7) Next Steps

Further discussion of the new technical ideas for the protocol will be continued on the list. A design team to discuss security issues is being formed. The group will formulate the problem and scope of the working group effort and present this to the list for comment. A teleconference will be held after the meeting, to be announced on the list. Everyone is welcome to participate.

Slides

ASAP Issues
RSERPOOL with SLP?