Network Working Group                                       T. Dreibholz
Internet-Draft                                Simula Research Laboratory
Intended status: Informational                           January 2, 2013
Expires: July 6, 2013

   Applicability of Reliable Server Pooling for Real-Time Distributed
                               Computing
            draft-dreibholz-rserpool-applic-distcomp-14.txt

Abstract

   This document describes the applicability of the Reliable Server
   Pooling architecture to manage real-time distributed computing pools
   and access the resources of such pools.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 6, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
     1.1.  Scope
     1.2.  Terminology
   2.  Distributed Computing using RSerPool
     2.1.  Requirements
     2.2.  Architecture
     2.3.  Limitations
   3.  Reference Implementation
   4.  Testbed Platform
   5.  Security Considerations
   6.  IANA Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Author's Address

1.  Introduction

   Reliable Server Pooling (RSerPool) defines protocols for providing
   highly available services.  The services are located in a pool of
   redundant servers, and if a server fails, another server will take
   over.  The only requirement put on the servers belonging to the pool
   is that if state is maintained by a server, this state must be
   transferred to the server taking over.

   The goal is to provide server-based redundancy.  Transport and
   network level redundancy are handled by the transport and network
   layer protocols.

   The application may choose to distribute its traffic over the servers
   of the pool conforming to a certain policy.

1.1.  Scope

   This document explains how to use Reliable Server Pooling mechanisms
   to manage and access pools of Distributed Computing resources.

1.2.  Terminology

   The terms used in this document are commonly defined in related work
   and can be found in the Aggregate Server Access Protocol (ASAP) and
   Endpoint Handlespace Redundancy Protocol (ENRP) Common Parameters
   document [RFC5354].

2.  Distributed Computing using RSerPool

2.1.  Requirements

   The application scenario for Distributed Computing is defined as
   follows:

   o  Clients generate large computation jobs.  Jobs have to be
      processed by servers as soon as possible (real-time), i.e.,
      unlike in concepts like SETI@home [SETIatHome], it is not possible
      to let clients fetch a job, process it later, and perhaps upload
      the result some day.

   o  Jobs may be partitionable, i.e., they can be split into smaller
      pieces that can be processed independently, and the partial
      results can be concatenated to form the result of the complete
      job.  Jobs have to be processed by servers.

   o  Servers may be unreliable, i.e., user computers may be temporarily
      added to the pool of computing resources and may be withdrawn when
      their owners use them again.  Furthermore, they may simply
      disappear because of broken network connections (modems, etc.) or
      because their power is turned off.

   o  The processing power of servers in a pool of computing resources
      may be very heterogeneous, e.g., a few supercomputers and many
      low-end user PCs.

   Maintaining a Distributed Computing pool for the scenario described
   above gives rise to the following requirements for pool management:

   o  It must be possible to manage large server pools, e.g., pools of
      several hundred or even thousands of servers.

   o  Due to heterogeneous processing resources within a pool, it must
      be possible to use appropriate server selection procedures to
      meaningfully utilize the available resources.

   o  It must be possible to dynamically add and remove servers.

   o  Servers may be unreliable, especially when the servers are
      represented by user PCs.  Failover mechanisms are required to
      continue an interrupted computation session.

2.2.  Architecture

   All requirements for pool and session management of the Distributed
   Computing scenario defined in the previous section can be fulfilled
   by the Reliable Server Pooling architecture:

   o  An efficient implementation of the handlespace management
      structures allows pools to contain thousands of elements.
      Handlespace management structures have been proposed, implemented
      and analyzed in [IJHIT2008], [Dre2006].

   o  RSerPool allows server selection rules to be specified by pool
      member selection policies [RFC5356].  A set of adaptive and
      non-adaptive policies is already defined.  To fulfill the
      requirements of new applications, it is also possible to define
      new policies.  Research has already been conducted on the load
      distribution efficiency of pool policies in Distributed Computing
      scenarios: see [Dre2006], [IJAIT2009], [LCN2005], [Tencon2005],
      [Euromicro2007] for details.

   o  Dynamic addition and removal of Pool Elements (PEs) is a feature
      of RSerPool [RFC5352].

   o  The control/data channel concept [RFC5351] of RSerPool realizes a
      session layer.  That is, RSerPool already handles the main task of
      maintaining and monitoring connections between Pool Users (PUs)
      and PEs; the only task of the application layer to provide full
      failover functionality is to realize an application-dependent
      failover procedure.  By using client-based state synchronization
      [IJAIT2009], [LCN2002] in the form of ASAP Cookies, a failover may
      be fully transparent to the PU, while only a state restoration is
      necessary on the PE side.  A demo application [RSerPoolPage] using
      the RSerPool session layer in a Distributed Computing application
      is described in [Infocom2005].

2.3.  Limitations

   When RSerPool is applied to distributed computing applications, the
   duties of the RSerPool architecture are still limited to the
   management of pools and independent sessions only.  In particular, it
   is a non-goal to provide functionality such as data synchronization
   among sessions, user authentication, accounting, or support for more
   than one administrative domain.  Such functionality is considered
   application-specific and is therefore out of the scope of RSerPool.

3.  Reference Implementation

   The RSerPool reference implementation RSPLIB, including example
   Distributed Computing applications, can be found at [RSerPoolPage].
   It supports the functionalities defined by [RFC5351], [RFC5352],
   [RFC5353], [RFC5354] and [RFC5355] as well as the options
   [I-D.dreibholz-rserpool-asap-hropt],
   [I-D.dreibholz-rserpool-enrp-takeover] and
   [I-D.dreibholz-rserpool-delay].  An introduction to this
   implementation is provided in [Dre2006].

4.  Testbed Platform

   A large-scale and realistic Internet testbed platform with support
   for the multi-homing feature of the underlying SCTP protocol is
   NorNet.  A description of NorNet is provided in [PAMS2013-NorNet];
   further information can be found on the project website
   [NorNet-Website].

5.  Security Considerations

   The protocols used in the Reliable Server Pooling architecture only
   try to increase the availability of the servers in the network.

   RSerPool protocols do not contain any protocol mechanisms which are
   directly related to user message authentication, integrity and
   confidentiality functions.  For such features, RSerPool depends on
   the IPsec protocols or on Transport Layer Security (TLS) for its own
   security, and on the architecture and/or security features of its
   user protocols.

   The RSerPool architecture allows the use of different transport
   protocols for its application and control data exchange.  These
   transport protocols may have mechanisms for reducing the risk of
   blind denial-of-service attacks and/or masquerade attacks.  If such
   measures are required by the applications, then it is advisable to
   consult the SCTP [RFC4960] applicability statement [RFC3257] for
   guidance on this issue.

6.  IANA Considerations

   This document introduces no additional considerations for IANA.

7.  References

7.1.  Normative References

   [RFC3257]  Coene, L., "Stream Control Transmission Protocol
              Applicability Statement", RFC 3257, April 2002.

   [RFC4960]  Stewart, R., "Stream Control Transmission Protocol",
              RFC 4960, September 2007.

   [RFC5351]  Lei, P., Ong, L., Tuexen, M., and T. Dreibholz, "An
              Overview of Reliable Server Pooling Protocols", RFC 5351,
              September 2008.

   [RFC5352]  Stewart, R., Xie, Q., Stillman, M., and M. Tuexen,
              "Aggregate Server Access Protocol (ASAP)", RFC 5352,
              September 2008.

   [RFC5353]  Xie, Q., Stewart, R., Stillman, M., Tuexen, M., and A.
              Silverton, "Endpoint Handlespace Redundancy Protocol
              (ENRP)", RFC 5353, September 2008.

   [RFC5354]  Stewart, R., Xie, Q., Stillman, M., and M. Tuexen,
              "Aggregate Server Access Protocol (ASAP) and Endpoint
              Handlespace Redundancy Protocol (ENRP) Parameters",
              RFC 5354, September 2008.

   [RFC5355]  Stillman, M., Gopal, R., Guttman, E., Sengodan, S., and M.
              Holdrege, "Threats Introduced by Reliable Server Pooling
              (RSerPool) and Requirements for Security in Response to
              Threats", RFC 5355, September 2008.

   [RFC5356]  Dreibholz, T. and M. Tuexen, "Reliable Server Pooling
              Policies", RFC 5356, September 2008.

   [I-D.dreibholz-rserpool-asap-hropt]
              Dreibholz, T., "Handle Resolution Option for ASAP",
              draft-dreibholz-rserpool-asap-hropt-11 (work in progress),
              July 2012.

   [I-D.dreibholz-rserpool-delay]
              Dreibholz, T. and X. Zhou, "Definition of a Delay
              Measurement Infrastructure and Delay-Sensitive Least-Used
              Policy for Reliable Server Pooling",
              draft-dreibholz-rserpool-delay-10 (work in progress),
              July 2012.

   [I-D.dreibholz-rserpool-enrp-takeover]
              Dreibholz, T. and X. Zhou, "Takeover Suggestion Flag for
              the ENRP Handle Update Message",
              draft-dreibholz-rserpool-enrp-takeover-08 (work in
              progress), July 2012.

7.2.  Informative References

   [Dre2006]  Dreibholz, T., "Reliable Server Pooling - Evaluation,
              Optimization and Extension of a Novel IETF Architecture",
              March 2007.

   [Euromicro2007]
              Dreibholz, T., Zhou, X., and E. Rathgeb, "A Performance
              Evaluation of RSerPool Server Selection Policies in
              Varying Heterogeneous Capacity Scenarios", Proceedings of
              the 33rd IEEE Euromicro Conference on Software Engineering
              and Advanced Applications, Pages 157-164,
              ISBN 0-7695-2977-1, DOI 10.1109/EUROMICRO.2007.9,
              August 2007.

   [IJAIT2009]
              Dreibholz, T. and E. Rathgeb, "Overview and Evaluation of
              the Server Redundancy and Session Failover Mechanisms in
              the Reliable Server Pooling Framework", International
              Journal on Advances in Internet Technology (IJAIT), Volume
              2, Number 1, Pages 1-14, ISSN 1942-2652, June 2009.

   [IJHIT2008]
              Dreibholz, T. and E. Rathgeb, "An Evaluation of the Pool
              Maintenance Overhead in Reliable Server Pooling Systems",
              SERSC International Journal on Hybrid Information
              Technology (IJHIT), Volume 1, Number 2, Pages 17-32,
              ISSN 1738-9968, April 2008.

   [Infocom2005]
              Dreibholz, T. and E. Rathgeb, "An Application
              Demonstration of the Reliable Server Pooling Framework",
              Proceedings of the 24th IEEE INFOCOM, March 2005.

   [LCN2002]  Dreibholz, T., "An Efficient Approach for State Sharing in
              Server Pools", Proceedings of the 27th IEEE Local Computer
              Networks Conference (LCN), Pages 348-349,
              ISBN 0-7695-1591-6, DOI 10.1109/LCN.2002.1181806,
              November 2002.

   [LCN2005]  Dreibholz, T. and E. Rathgeb, "On the Performance of
              Reliable Server Pooling Systems", Proceedings of the IEEE
              Conference on Local Computer Networks (LCN) 30th
              Anniversary, Pages 200-208, ISBN 0-7695-2421-4,
              DOI 10.1109/LCN.2005.98, November 2005.

   [RSerPoolPage]
              Dreibholz, T., "Thomas Dreibholz's RSerPool Page", 2012.

   [SETIatHome]
              SETI Project, "SETI@home: Search for Extraterrestrial
              Intelligence at home", 2010.

   [Tencon2005]
              Dreibholz, T. and E. Rathgeb, "The Performance of Reliable
              Server Pooling Systems in Different Server Capacity
              Scenarios", Proceedings of the IEEE TENCON,
              ISBN 0-7803-9312-0, DOI 10.1109/TENCON.2005.300939,
              November 2005.

   [NorNet-Website]
              Xiang, J., "NorNet -- A Programmable Testbed for
              Measurements and Experimental Networking Research", 2013.

   [PAMS2013-NorNet]
              Dreibholz, T. and E. Gran, "Design and Implementation of
              the NorNet Core Research Testbed for Multi-Homed Systems",
              Proceedings of the 3rd International Workshop on Protocols
              and Applications with Multi-Homing Support (PAMS),
              March 2013.

Author's Address

   Thomas Dreibholz
   Simula Research Laboratory, Network Systems Group
   Martin Linges vei 17
   1364 Fornebu, Akershus
   Norway

   Phone: +47-6782-8200
   Fax:   +47-6782-8201
   Email: dreibh@simula.no
   URI:   http://www.iem.uni-due.de/~dreibh/