idnits 2.17.1 

draft-chu-ip-cluster-00.txt:
  ** The Abstract section seems to be numbered


  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-19) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 6) being 59 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (August 1996) is 10109 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 1794 (ref. '1')


     Summary: 9 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	INTERNET-DRAFT                                     Chi Chu
3	Expires: February 21, 1997                         Research 2000, Inc.
4	                                                   August 1996

6	                            IP Cluster
7	                    draft-chu-ip-cluster-00.txt

9	Status of this Memo

11	     This document is an Internet-Draft.  Internet-Drafts are working
12	     documents of the Internet Engineering Task Force (IETF), its
13	     areas, and its working groups.  Note that other groups may also
14	     distribute working documents as Internet-Drafts.

16	     Internet-Drafts are draft documents valid for a maximum of six
17	     months and may be updated, replaced, or obsoleted by other
18	     documents at any time.  It is inappropriate to use Internet-
19	     Drafts as reference material or to cite them other than as
20	     ``work in progress.''

22	     To learn the current status of any Internet-Draft, please check
23	     the ``1id-abstracts.txt'' listing contained in the Internet-
24	     Drafts Shadow Directories on ftp.is.co.za (Africa),
25	     nic.nordu.net (Europe), munnari.oz.au (Pacific Rim),
26	     ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

28	1. Abstract

30	   This Internet-Draft is intended to provide a means for
31	   "IP Clustering" across multiple servers. It is meant as an improved
32	   alternative to the various solutions for distributing WWW traffic
33	   already attempted by the IETF DNS Working Group. In addition,
34	   the clustering method can be applied not only to a heavily visited
35	   web server, but also to any overloaded TCP/IP server, such as a
36	   domain name server. The IP Cluster provides two primary functions:
37	   IP traffic distribution to multiple servers and fault-tolerance.

39	2. Introduction

41	   The notion of distributing IP (Web) data traffic to multiple server
42	   machines has already been explored by the various DNS methods
43	   mentioned or described in RFC 1794 [1]. The basic drawbacks of
44	   all these methods are similar:

46	     * short or zero TTL for DNS records - this is not intended by
47	       the DNS specification and incurs a few unpleasant consequences;
48	     * heavy DNS traffic - since secondary or non-authoritative DNS
49	       servers cannot effectively cache the data, all these methods
50	       generate heavy DNS queries across the global Internet,
51	       bombarding a chain of servers in the name space;

53	     * potentially high delay - if any server in the DNS chain
54	       experiences outage or bottleneck, the response to the initial
55	       query would be significantly delayed if an alternate DNS server
56	       were required to process the query;
57	     * the primary DNS server becomes the single point of failure -
58	       since the TTL is very small or zero, outage of the primary
59	       server for even a small period of time results in failed DNS
60	       lookup;
61	     * easier to spoof - a DNS record can be easily "spoofed" to
62	       mislead a client with a bogus host name to IP address mapping;

64	   along with other drawbacks that are method specific. In short, these
65	   DNS methods solve one problem that may be beneficial to a single
66	   site, but create another that can be quite undesirable for the
67	   Internet at large. Just imagine what would happen if every website
68	   decided to implement a DNS method for distributing its web traffic.

70	   Clearly, it is imperative and highly desirable that an alternative
71	   solution be established that does not suffer the same drawbacks
72	   discussed above, and that does not introduce a new problem of
73	   equal severity for the network at large.

75	3. The Alternative

77	3.1 Applicable Topology

79	   The proposed method requires an IP router connecting to multiple
80	   servers in a switch-like configuration. That is, each server
81	   machine is directly connected to a unique physical port/interface
82	   on the router. Since a router interface can be a LAN or serial
83	   interface, an IP cluster can be formed locally or be distributed
84	   over a wide-area network.

86	3.2 Description

88	   The nature of IP load balancing requires that for a given IP host
89	   name, typically corresponding to some network services, the data
90	   traffic and thus the processing be actually distributed among
91	   several server machines, with some control. This idea of a
92	   "virtual server" provides transparent services to a remote client.
93	   The virtual server itself consists of a number of host machines, or
94	   cluster members, each performing a set of services.  In a true
95	   cluster environment, a cluster member performs a set of services
96	   or functions that may be different from that of another member
97	   within the cluster.

99	   Similar to all the DNS load balancing methods, the proposed method
100	   described in this document assumes symmetric host processing.
101	   Namely, the so-called "IP Cluster" consists of a number of cluster
102	   members each of which performs the same set of services (although,
103	   strictly speaking, this need not be so).

105	   The alternative method does not rely on the dynamic host name
106	   to IP address mapping. Instead, it relies on the concept of a
107	   Virtual IP (VIP) address. This VIP address is configured as the
108	   host IP address for all cluster members. Each cluster member
109	   is directly connected to a unique router interface port, much
110	   like an Ether-Switch configuration, topologically.

112	   The VIP address appears to the outside world as just another
113	   unique IP address, with the usual DNS host name to IP address
114	   mapping in the traditional static sense. Each IP cluster member,
115	   in turn, treats this VIP address as its own globally unique host
116	   address.

118	   However, this VIP address appears very differently to the IP
119	   router to which all the cluster members are directly connected.
120	   With careful and deliberate choice of the VIP address (e.g.,
121	   xx.xx.xx.63 for a Class C network), and with the appropriate
122	   subnet (or variable subnet) mask enabled in the router interface
123	   ports, this unique host IP address is in effect a broadcast
124	   address as far as the router is concerned. Consequently, upon
125	   receiving an IP packet with destination address equal to this VIP
126	   address, the router, assuming it is configured properly, will
127	   attempt to broadcast the packet to all its relevant interfaces.
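
As a small illustration of the address choice above, the following Python sketch checks whether a host address coincides with the broadcast address of its enclosing subnet. The 192.0.2.0/24 documentation prefix, the /26 mask, and the function name are assumptions chosen for the example; the draft does not fix any of them.

```python
import ipaddress


def is_subnet_broadcast(vip: str, prefix_len: int) -> bool:
    """Return True if `vip` is the broadcast address of its enclosing subnet."""
    # strict=False derives the enclosing network from a host address.
    net = ipaddress.ip_network(f"{vip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(vip) == net.broadcast_address


# Hypothetical VIP ending in .63, as in the draft's xx.xx.xx.63 example.
VIP = "192.0.2.63"

if __name__ == "__main__":
    for prefix_len in (24, 26):
        print(prefix_len, is_subnet_broadcast(VIP, prefix_len))
```

Under these assumptions, the .63 address is an ordinary host address when the interface mask is /24, but becomes the subnet broadcast address once a /26 mask is applied, which is the effect the router configuration relies on.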

129	   Each of these "broadcast" interfaces, however, is configured with
130	   a simple filter. This simple filter basically filters on the IP
131	   source address of the incoming packet. Thus with each interface
132	   filter permitting only a unique and non-overlapping portion of
133	   the IP address space to route through, we have effectively
134	   achieved high-performance "IP-Switching".

136	   Furthermore, since this partitioning of the IP address space can be
137	   well controlled by each interface filter's bitmasking and
138	   wildcarding, load balancing can be accomplished now with respect
139	   to CPU, memory, IO, or all of the above, depending upon the
140	   application nature of the IP cluster.
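
The source-address filtering described above can be sketched as follows. This is only an illustration under stated assumptions: each interface filter is modeled as a hypothetical (value, mask) pair, and the filters key on the low-order bits of the source address. The draft does not say which bits a real filter would match, and the function names are invented for the example.

```python
import ipaddress


def build_filters(bits: int):
    """Return one (value, mask) filter per interface, keyed on the low `bits` bits."""
    mask = (1 << bits) - 1
    return [(value, mask) for value in range(1 << bits)]


def permitted_interfaces(src: str, filters):
    """Return the indexes of the interfaces whose filter permits this source."""
    addr = int(ipaddress.IPv4Address(src))
    return [i for i, (value, mask) in enumerate(filters)
            if (addr & mask) == value]


if __name__ == "__main__":
    filters = build_filters(2)  # four interfaces, hence four cluster members
    for src in ("198.51.100.7", "198.51.100.8", "203.0.113.42"):
        print(src, permitted_interfaces(src, filters))
```

Because the filters are non-overlapping and together cover every bit pattern, each source address is permitted on exactly one interface, so only one cluster member ever sees a given client's packets.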

142	4. A Scalable Model

144	   The IP clustering model described above should scale very well.
145	   Physically, the "Virtual IP" clustering is primarily limited by
146	   the number of router interface ports. In terms of performance, the
147	   scalability of this model is limited mostly by network bandwidth
148	   technology and by router performance, which is usually orders of
149	   magnitude greater than a workstation server's ability to deliver
150	   the same data throughput.

152	   In short, this IP clustering model should scale quite linearly.

154	5. Fault Resilience and Fault Tolerance

156	   Fault Resilience (FR) here means the ability of the IP cluster
157	   to

159	     * automatically redistribute its parallel server processing in
160	       the event of any single cluster member (i.e., server machine)
161	       failure, and
162	     * automatically restore to the normal parallel processing once
163	       the failed server has recovered (by whatever means).

165	   The IP clustering method described in this proposal should be able
166	   to support the above requirements. There are a number of viable
167	   implementations; I shall briefly describe only the basic
168	   concept.

170	   Essentially what needs to be done here to achieve FR is similar
171	   to what is done in a "classic" cluster environment. Each cluster
172	   member monitors the health and status of the other cluster members.
173	   When a failure is detected, each monitoring member (which means
174	   all but the failed machine) automatically enables itself to support
175	   a portion of the services or functions that it is configured to
176	   assume for that failed cluster member. When the failed cluster
177	   member becomes alive again (usually detected through a heartbeat),
178	   all other cluster members fall back to their normal processing mode.
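
The takeover and fall-back behavior described above might be modeled as in the following sketch. It assumes each member is the normal ("home") owner of one partition of the client address space and that a preconfigured, ordered list of stand-in members exists per partition. The names, the timeout value, and the data layout are all hypothetical. The sketch only decides which member is responsible for a partition; how traffic actually reaches that member is outside its scope.

```python
def alive_members(last_seen, now, timeout):
    """Return the members whose most recent heartbeat is within `timeout` seconds."""
    return {m for m, t in last_seen.items() if now - t <= timeout}


def active_assignment(home, backups, alive):
    """Map each partition to the member currently responsible for it.

    home:    partition -> normal owner
    backups: partition -> ordered list of stand-in members
    alive:   set of members with a current heartbeat
    """
    result = {}
    for part, owner in home.items():
        candidates = [owner] + backups.get(part, [])
        # The first live candidate serves the partition; None if all are down.
        result[part] = next((m for m in candidates if m in alive), None)
    return result


if __name__ == "__main__":
    home = {0: "a", 1: "b", 2: "c", 3: "d"}
    backups = {0: ["b"], 1: ["c"], 2: ["d"], 3: ["a"]}
    last_seen = {"a": 100.0, "b": 90.0, "c": 100.0, "d": 100.0}
    alive = alive_members(last_seen, now=101.0, timeout=5.0)
    print(active_assignment(home, backups, alive))
```

When the failed member's heartbeat resumes, it reappears in the alive set and the assignment returns to the home mapping without any further bookkeeping.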

180	   While I will not delve into all the relevant issues of building
181	   a Fault Tolerant (FT) IP Cluster, suffice it to say that, with
182	   this IP cluster model, one may easily build a Fault Tolerant IP
183	   Cluster against any single point of host or network failure within
184	   the cluster.

186	6. Implementation

188	   The implementation of this IP cluster assumes that a router used
189	   for IP switching is capable of forwarding IP broadcast packets.
190	   While most routers have limited broadcast forwarding capability
191	   (e.g., some may not forward TCP/IP broadcast packets), this
192	   limitation should be easily removed by a prospective router vendor
193	   by relaxing the artificially imposed transport-layer filtering
194	   (which is not entirely a router's business to begin with).

196	   Reconfiguration of the IP-Switch/router filters for achieving
197	   better load balance should be performed by an automated script.
198	   Since this type of reconfiguration is considered system down-time
199	   for the IP cluster, the implementation of such a script should
200	   minimize the down-time by, for instance, separating logging in to
201	   the router from actually modifying the filters under human control.

203	   As for communications between cluster members (i.e., heartbeat,
204	   etc.), any number of protocols can be used. It may be as simple
205	   as ping and tcp-echo, or as sophisticated as a new multicast
206	   protocol.

208	7. Performance
209	   As already mentioned in Section 4, the IP clustering method
210	   described in this proposal should be extremely fast. The so-called
211	   IP cluster here is essentially an IP-Switch (as opposed to an
212	   Ether or FastEther Switch) connecting to a number of cluster
213	   members each taking full advantage of the underlying transmission
214	   medium without the usual network contention.

216	   Assuming that one is to configure a "Super IP-Switch" with
217	   maximum IO ports and each port is connected to the highest
218	   bandwidth technology and server machine available, the only issue
219	   with regard to performance then is the router's routing
220	   capability, particularly the router's CPU required to perform
221	   the interface filtering.

223	   We can rest assured, however, that this interface filtering or
224	   the router's routing performance cannot realistically be an issue
225	   for two reasons. First, because of bitmasking and wildcarding,
226	   each interface filter list should be very short and compact.
227	   (I don't see more than six lines in each access list unless the
228	   same router is also used for firewalling, etc.) Second, long
229	   before one reaches such routing performance issues, any reasonable
230	   organization would want to add a second router into the same IP
231	   cluster. The VIP clustering model supports multiple routers as
232	   an integral part of a single IP cluster. In fact, building such
233	   an IP cluster with multiple routers is one step towards building
234	   a fault-tolerant IP cluster.

236	   One question remains: how effective is a load balancing scheme
237	   based on IP source address filtering? If it were not effective,
238	   it would defeat much of this high-performance claim. I would say:
239	   pretty effective, especially if the client base is very large
240	   (which is what this proposal is intended to support to begin
241	   with).

243	   This is simply a basic principle of statistical analysis: when
244	   there is a large number of statistical samples, with each sample
245	   behaving randomly and wildly, the overall statistical distribution
246	   is often predictable and well behaved. In fact, the larger the
247	   number, the more predictable and better behaved the statistical
248	   envelope would be. Thus, this statistical property works greatly
249	   in favor of this Internet-Draft's intent to use the IP cluster to
250	   support a very large client base.
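
The statistical argument above can be illustrated with a small simulation. The sketch assumes that client source addresses behave like uniformly random 32-bit values and that a member is selected by the address modulo the number of members. Both are simplifications introduced here, not claims made by the draft, and real address distributions may be far less uniform.

```python
import random


def imbalance(num_clients: int, num_members: int = 4, seed: int = 0) -> float:
    """Largest relative deviation of any member's load from the ideal share."""
    rng = random.Random(seed)
    counts = [0] * num_members
    for _ in range(num_clients):
        src = rng.getrandbits(32)        # stand-in for a client source address
        counts[src % num_members] += 1   # member selected by source address
    ideal = num_clients / num_members
    return max(abs(c - ideal) for c in counts) / ideal


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, round(imbalance(n), 4))
```

Under the uniformity assumption, the relative deviation shrinks roughly in proportion to one over the square root of the number of clients, which is why a larger client base tends to be spread more evenly.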

252	   Assume that one has set up the proposed IP cluster with multiple
253	   servers.  It makes no sense to talk about how good the load
254	   balance actually is when the traffic is so light that, even if all
255	   of it were distributed to a single cluster member, that member
256	   would still not be overloaded.  Good load balance becomes
257	   relevant when traffic is heavy enough that some or all of the
258	   cluster members must share significant (but still not necessarily
259	   equal) portions of the traffic load.  It is important to keep the
260	   perspective that the real purpose of clustering is to avoid
261	   server overloading and not to artificially maintain equal load
262	   balance at all times. The beauty of this IP clustering model is
263	   that the more the traffic and the client base grow, the better
264	   and more evenly the cluster distributes the load, without
265	   incurring any processing overhead.

267	   The above load analysis simply means that an effective IP cluster
268	   does not require fully dynamic load balancing per IP packet.
269	   In fact, a truly dynamic load balancing scheme on a per-packet
270	   basis would adversely affect the performance of such an IP cluster.
271	   How often (e.g., once a month) and by what criteria (e.g., CPU,
272	   memory, IO) load sampling and analysis should be performed in
273	   order to re-tune, if necessary, the IP-Switch/router access
274	   filter lists is application dependent.

276	8. Security Considerations

278	   While the DNS methods for IP clustering rely on dynamic host
279	   name to IP address mapping, which can easily be "spoofed",
280	   the Virtual IP method does not suffer the same level of security
281	   issues for the simple reason that it is more difficult to spoof
282	   (and spoof it well) the routing topology of the Internet than to
283	   spoof a DNS record.

285	   Additionally, this Virtual IP clustering model does not preclude
286	   any security schemes that are available under a non-cluster single
287	   server environment, firewalls included.

289	9. Acknowledgments

291	   Much appreciation is due to Mike Lee and Josh Sierles for
292	   enlightening me with the DNS load balancing methods, and to Josh
293	   again for referring me to RFC 1794.

295	10. References

297	   [1] Brisco, T., "DNS Support for Load Balancing", RFC 1794,
298	       Rutgers University, April 1995.

300	11. Author's Address

302	   Chi Chu
303	   Research 2000, Inc.
304	   265 Cherry Street, 16G
305	   New York, New York 10002
306	   USA

308	   Phone: 212-598-9455
309	   Email: chi@soho.ios.com
310	   URL:   http://soho.ios.com/~chi

312	This document expires February 21, 1997.