Network Working Group                                       T. Dreibholz
Internet-Draft                                Simula Research Laboratory
Intended status: Informational                          January 23, 2017
Expires: July 27, 2017


   Applicability of Reliable Server Pooling for Real-Time Distributed
                               Computing
            draft-dreibholz-rserpool-applic-distcomp-22.txt

Abstract

   This document describes the applicability of the Reliable Server
   Pooling architecture to manage real-time distributed computing pools
   and access the resources of such pools.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 27, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.
   Introduction
     1.1.  Scope
     1.2.  Terminology
   2.  Distributed Computing using RSerPool
     2.1.  Requirements
     2.2.  Architecture
     2.3.  Limitations
   3.  Reference Implementation
   4.  Testbed Platform
   5.  Security Considerations
   6.  IANA Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Author's Address

1.  Introduction

   Reliable Server Pooling defines protocols for providing highly
   available services.  The services are located in a pool of redundant
   servers; if a server fails, another server takes over.  The only
   requirement placed on the servers of a pool is that any state
   maintained by a server must be transferred to the server taking
   over.

   The goal is to provide server-based redundancy.  Transport and
   network level redundancy are handled by the transport and network
   layer protocols.

   The application may choose to distribute its traffic over the servers
   of the pool according to a certain policy.

1.1.  Scope

   This document explains how to use Reliable Server Pooling mechanisms
   to manage and access pools of Distributed Computing resources.

1.2.
Terminology

   The terms used in this document are common in related work and are
   defined in the Aggregate Server Access Protocol and Endpoint
   Handlespace Redundancy Protocol Common Parameters document
   [RFC5354].

2.  Distributed Computing using RSerPool

2.1.  Requirements

   The application scenario for Distributed Computing is defined as
   follows:

   o  Clients generate large computation jobs.  Jobs have to be
      processed by servers as soon as possible (real-time), i.e.,
      unlike concepts like SETI@home [SETIatHome-Website], it is not
      possible to let clients fetch a job, process it later, and maybe
      upload the result some day.

   o  Jobs may be partitionable, i.e., they can be split up into smaller
      pieces which can be processed independently, and the processing
      results can be concatenated to form the processing result of the
      complete job.  Jobs have to be processed by servers.

   o  Servers may be unreliable, i.e., user computers may be temporarily
      added to the pool of computing resources and may be revoked when
      they are used again by their owners.  Furthermore, they may simply
      disappear because of broken network connections (modems, etc.) or
      because their power is turned off.

   o  The processing power of servers in a pool of computing resources
      may be very heterogeneous, e.g., a few supercomputers and many
      low-end user PCs.

   Maintaining a Distributed Computing pool for the scenario described
   above gives rise to the following requirements for the pool
   management:

   o  It must be possible to manage large server pools, e.g., several
      hundred or even thousands of servers.

   o  Due to heterogeneous processing resources within a pool, it must
      be possible to use appropriate server selection procedures to
      meaningfully utilize the available resources.

   o  It must be possible to dynamically add and remove servers.

   o  Servers may be unreliable, especially when the servers are
      represented by user PCs.
      Failover mechanisms are required to continue an interrupted
      computation session.

2.2.  Architecture

   All requirements for pool and session management of the Distributed
   Computing scenario defined in the previous section can be fulfilled
   by the Reliable Server Pooling architecture:

   o  An efficient implementation of the handlespace management
      structures allows pools to contain thousands of elements.
      Handlespace management structures have been proposed, implemented
      and analyzed in [IJHIT2008] and [Dre2006].

   o  RSerPool allows server selection rules to be specified by pool
      member selection policies [RFC5356].  A set of adaptive and
      non-adaptive policies is already defined.  To fulfill the
      requirements of new applications, it is also possible to define
      new policies.  The load distribution efficiency of pool policies
      in Distributed Computing scenarios has already been studied; see
      [Dre2006], [IJAIT2009], [LCN2005], [Tencon2005], and
      [Euromicro2007] for details.

   o  Dynamic addition and removal of Pool Elements (PEs) is a feature
      of RSerPool [RFC5352].

   o  The control/data channel concept [RFC5351] of RSerPool realizes a
      session layer.  That is, RSerPool already handles the main task of
      maintaining and monitoring connections between Pool Users (PUs)
      and PEs; to provide full failover functionality, the application
      layer only has to realize an application-dependent failover
      procedure.  By using client-based state synchronization
      [IJAIT2009], [LCN2002] in the form of ASAP Cookies, a failover may
      be fully transparent to the PU, while only a state restoration is
      necessary on the PE side.  A demo application [RSerPool-Website]
      using the RSerPool session layer in a Distributed Computing
      application is described in [Infocom2005].

2.3.
Limitations

   When RSerPool is applied to Distributed Computing applications, the
   duties of the RSerPool architecture remain limited to the management
   of pools and independent sessions.  In particular, it is a non-goal
   to provide functionalities like data synchronization among sessions,
   user authentication, accounting, or support for more than one
   administrative domain.  Such functionalities are considered to be
   application-specific and are therefore out of the scope of RSerPool.

3.  Reference Implementation

   The RSerPool reference implementation RSPLIB, including example
   Distributed Computing applications, can be found at
   [RSerPool-Website].  It supports the functionalities defined by
   [RFC5351], [RFC5352], [RFC5353], [RFC5354] and [RFC5355] as well as
   the options [I-D.dreibholz-rserpool-asap-hropt],
   [I-D.dreibholz-rserpool-enrp-takeover] and
   [I-D.dreibholz-rserpool-delay].  An introduction to this
   implementation is provided in [Dre2006].

4.  Testbed Platform

   NorNet is a large-scale and realistic Internet testbed platform with
   support for the multi-homing feature of the underlying SCTP
   protocol.  A description of NorNet is provided in [PAMS2013-NorNet];
   further information can be found on the project website
   [NorNet-Website].

5.  Security Considerations

   The protocols used in the Reliable Server Pooling architecture only
   try to increase the availability of the servers in the network.
   RSerPool protocols do not contain any protocol mechanisms which are
   directly related to user message authentication, integrity and
   confidentiality functions.  For such features, RSerPool depends on
   the IPsec protocols or on Transport Layer Security (TLS) for its own
   security, and on the architecture and/or security features of its
   user protocols.

   The RSerPool architecture allows the use of different transport
   protocols for its application and control data exchange.  These
   transport protocols may have mechanisms for reducing the risk of
   blind denial-of-service attacks and/or masquerade attacks.  If the
   applications require such measures, it is advisable to consult the
   SCTP [RFC4960] applicability statement [RFC3257] for guidance on
   this issue.

6.  IANA Considerations

   This document introduces no additional considerations for IANA.

7.  References

7.1.  Normative References

   [RFC3257]  Coene, L., "Stream Control Transmission Protocol
              Applicability Statement", RFC 3257, DOI 10.17487/RFC3257,
              April 2002.

   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
              RFC 4960, DOI 10.17487/RFC4960, September 2007.

   [RFC5351]  Lei, P., Ong, L., Tuexen, M., and T. Dreibholz, "An
              Overview of Reliable Server Pooling Protocols", RFC 5351,
              DOI 10.17487/RFC5351, September 2008.

   [RFC5352]  Stewart, R., Xie, Q., Stillman, M., and M. Tuexen,
              "Aggregate Server Access Protocol (ASAP)", RFC 5352,
              DOI 10.17487/RFC5352, September 2008.

   [RFC5353]  Xie, Q., Stewart, R., Stillman, M., Tuexen, M., and A.
              Silverton, "Endpoint Handlespace Redundancy Protocol
              (ENRP)", RFC 5353, DOI 10.17487/RFC5353, September 2008.

   [RFC5354]  Stewart, R., Xie, Q., Stillman, M., and M. Tuexen,
              "Aggregate Server Access Protocol (ASAP) and Endpoint
              Handlespace Redundancy Protocol (ENRP) Parameters",
              RFC 5354, DOI 10.17487/RFC5354, September 2008.

   [RFC5355]  Stillman, M., Ed., Gopal, R., Guttman, E., Sengodan, S.,
              and M. Holdrege, "Threats Introduced by Reliable Server
              Pooling (RSerPool) and Requirements for Security in
              Response to Threats", RFC 5355, DOI 10.17487/RFC5355,
              September 2008.

   [RFC5356]  Dreibholz, T. and M.
              Tuexen, "Reliable Server Pooling Policies", RFC 5356,
              DOI 10.17487/RFC5356, September 2008.

   [I-D.dreibholz-rserpool-asap-hropt]
              Dreibholz, T., "Handle Resolution Option for ASAP",
              draft-dreibholz-rserpool-asap-hropt-19 (work in progress),
              July 2016.

   [I-D.dreibholz-rserpool-delay]
              Dreibholz, T. and X. Zhou, "Definition of a Delay
              Measurement Infrastructure and Delay-Sensitive Least-Used
              Policy for Reliable Server Pooling",
              draft-dreibholz-rserpool-delay-18 (work in progress),
              July 2016.

   [I-D.dreibholz-rserpool-enrp-takeover]
              Dreibholz, T. and X. Zhou, "Takeover Suggestion Flag for
              the ENRP Handle Update Message",
              draft-dreibholz-rserpool-enrp-takeover-16 (work in
              progress), July 2016.

7.2.  Informative References

   [Dre2006]  Dreibholz, T., "Reliable Server Pooling - Evaluation,
              Optimization and Extension of a Novel IETF Architecture",
              March 2007.

   [Euromicro2007]
              Dreibholz, T., Zhou, X., and E. Rathgeb, "A Performance
              Evaluation of RSerPool Server Selection Policies in
              Varying Heterogeneous Capacity Scenarios", Proceedings of
              the 33rd IEEE Euromicro Conference on Software Engineering
              and Advanced Applications, Pages 157-164,
              ISBN 0-7695-2977-1, DOI 10.1109/EUROMICRO.2007.9, August
              2007.

   [IJAIT2009]
              Dreibholz, T. and E. Rathgeb, "Overview and Evaluation of
              the Server Redundancy and Session Failover Mechanisms in
              the Reliable Server Pooling Framework", International
              Journal on Advances in Internet Technology (IJAIT), Number
              1, Volume 2, Pages 1-14, ISSN 1942-2652, June 2009.

   [IJHIT2008]
              Dreibholz, T. and E. Rathgeb, "An Evaluation of the Pool
              Maintenance Overhead in Reliable Server Pooling Systems",
              SERSC International Journal on Hybrid Information
              Technology (IJHIT), Number 2, Volume 1, Pages 17-32,
              ISSN 1738-9968, April 2008.

   [Infocom2005]
              Dreibholz, T. and E.
              Rathgeb, "An Application Demonstration of the Reliable
              Server Pooling Framework", Proceedings of the 24th IEEE
              INFOCOM, March 2005.

   [LCN2002]  Dreibholz, T., "An Efficient Approach for State Sharing in
              Server Pools", Proceedings of the 27th IEEE Local Computer
              Networks Conference (LCN), Pages 348-349,
              ISBN 0-7695-1591-6, DOI 10.1109/LCN.2002.1181806, November
              2002.

   [LCN2005]  Dreibholz, T. and E. Rathgeb, "On the Performance of
              Reliable Server Pooling Systems", Proceedings of the IEEE
              Conference on Local Computer Networks (LCN) 30th
              Anniversary, Pages 200-208, ISBN 0-7695-2421-4,
              DOI 10.1109/LCN.2005.98, November 2005.

   [Tencon2005]
              Dreibholz, T. and E. Rathgeb, "The Performance of Reliable
              Server Pooling Systems in Different Server Capacity
              Scenarios", Proceedings of the IEEE TENCON,
              ISBN 0-7803-9312-0, DOI 10.1109/TENCON.2005.300939,
              November 2005.

   [PAMS2013-NorNet]
              Dreibholz, T. and E. Gran, "Design and Implementation of
              the NorNet Core Research Testbed for Multi-Homed Systems",
              Proceedings of the 3rd International Workshop on Protocols
              and Applications with Multi-Homing Support (PAMS), Pages
              1094-1100, ISBN 978-0-7695-4952-1,
              DOI 10.1109/WAINA.2013.71, March 2013.

   [SETIatHome-Website]
              SETI Project, "SETI@home: Search for Extraterrestrial
              Intelligence at home", 2016.

   [RSerPool-Website]
              Dreibholz, T., "Thomas Dreibholz's RSerPool Page",
              Online: http://www.iem.uni-due.de/~dreibh/rserpool/, 2016.

   [NorNet-Website]
              Dreibholz, T., "NorNet -- A Real-World, Large-Scale
              Multi-Homing Testbed", Online: https://www.nntb.no/, 2016.

Author's Address

   Thomas Dreibholz
   Simula Research Laboratory, Network Systems Group
   Martin Linges vei 17
   1364 Fornebu, Akershus
   Norway

   Phone: +47-6782-8200
   Fax:   +47-6782-8201
   Email: dreibh@simula.no
   URI:   http://www.iem.uni-due.de/~dreibh/
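
Appendix A.  Non-Normative Sketch of Server Selection and Failover

   The following minimal Python sketch illustrates, under stated
   assumptions, how the server selection and the client-based state
   synchronization described in Section 2.2 can fit together.  It does
   not use the RSPLIB API and is not taken from the RSerPool
   specifications: the names PoolElement, select_least_used and
   run_session, the JSON cookie encoding, and the summation job are all
   hypothetical and chosen for illustration only.  A PE hands a cookie
   containing its session state to the PU after every step; when the PE
   disappears, the PU selects the least-loaded remaining PE and passes
   on the latest cookie, so that the computation resumes instead of
   restarting.

```python
import json
from dataclasses import dataclass


class PoolElementFailure(Exception):
    """Raised when a (simulated) pool element disappears mid-session."""


@dataclass
class PoolElement:
    # Hypothetical model of a PE; not an RSPLIB data structure.
    name: str
    load: float            # load in [0, 1] as reported to the pool
    fail_after: int = -1   # simulate a crash after this many steps (-1: never)

    def process(self, job, cookie, on_cookie):
        """Sum the numbers in `job`, resuming from `cookie` if one is given."""
        # Assumption: the cookie is a JSON document holding the session state.
        state = json.loads(cookie) if cookie else {"next": 0, "partial": 0}
        steps = 0
        while state["next"] < len(job):
            if steps == self.fail_after:
                raise PoolElementFailure(self.name)
            state["partial"] += job[state["next"]]
            state["next"] += 1
            steps += 1
            on_cookie(json.dumps(state))  # checkpoint handed to the PU
        return state["partial"]


def select_least_used(pool):
    """Least-Used-style selection: pick the element reporting the lowest load."""
    return min(pool, key=lambda pe: pe.load)


def run_session(pool, job):
    """PU side: remember the latest cookie and fail over until the job is done."""
    remaining = list(pool)
    latest = {"cookie": None}
    used = []

    def remember(cookie):
        latest["cookie"] = cookie

    while remaining:
        pe = select_least_used(remaining)
        used.append(pe.name)
        try:
            return pe.process(job, latest["cookie"], remember), used
        except PoolElementFailure:
            remaining.remove(pe)  # the failed element leaves the pool
    raise RuntimeError("no pool element left")


pool = [
    PoolElement("pc-1", load=0.1, fail_after=3),  # unreliable user PC
    PoolElement("supercomputer", load=0.4),
    PoolElement("pc-2", load=0.9),
]
result, used = run_session(pool, list(range(10)))
print(result, used)  # prints: 45 ['pc-1', 'supercomputer']
```

   In this run, the simulated user PC fails after three steps and the
   second element resumes from the fourth number, so every number is
   added exactly once despite the failover.  Only the PE side restores
   state; the PU merely stores and forwards an opaque cookie.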