Network Working Group                                      S. Vinapamula
Internet-Draft                                          Juniper Networks
Intended status: Informational                              S. Sivakumar
Expires: May 4, 2016                                       Cisco Systems
                                                            M. Boucadair
                                                                  Orange
                                                                T. Reddy
                                                                   Cisco
                                                        November 1, 2015


  Application-Initiated Flow High Availability Awareness through Port
                         Control Protocol (PCP)
                      draft-vinapamula-flow-ha-14

Abstract

   This document specifies a mechanism for a host to signal via Port
   Control Protocol (PCP) which connections should be protected against
   network failures.  These connections will be elected to be subject to
   high availability mechanisms enabled at the network side.

   This approach assumes that applications/users have better visibility
   into which connections are sensitive than any heuristic that can be
   enabled at the network side to guess which connections should be
   check-pointed.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 4, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Note
   2.  Issues with the Existing Implementations
   3.  CHECKPOINT_REQUIRED PCP Option
     3.1.  Format
     3.2.  Operation
   4.  Sample Use cases
   5.  Security Considerations
   6.  IANA Considerations
   7.  References
     7.1.  Normative references
     7.2.  Informative References
   Appendix A.  Appendix
   Acknowledgments
   Authors' Addresses

1.  Introduction

   The risk of Internet service disruption is critical in service
   provider and enterprise networking environments.  Such a risk is
   often mitigated by introducing active/backup systems.  Such designs
   not only help minimize the risk of service disruption, but also
   facilitate maintenance operations (e.g., hitless H/W or S/W
   upgrades).

   In addition, the nature of some connections leads some of the network
   functions invoked when the connection is established to create and
   maintain connection-specific states.  For such a connection to
   survive an active/backup failover after a network failure, these
   states need to be check-pointed to the backup system.  Additional
   issues are further discussed in Section 2.

   Heuristics based on the protocol, mapping lifetime, etc., are used in
   the network to elect which connections need to be check-pointed
   (e.g., by means of high availability techniques).
   This document advocates an application-initiated approach that would
   allow applications/users to signal to the network which of their
   connections are critical.

   This document specifies how PCP [RFC6887] can be extended to signal
   which connections should be check-pointed for high availability
   (Section 3).  A set of use cases is provided for illustration
   purposes in Section 4.  This document does not make any assumption
   about the PCP-controlled device that will process the PCP-formatted
   signaling information from PCP clients.  These devices are likely to
   be flow-aware.

   The approach in this document is aligned with the networking trends
   advocating open network APIs to interact with applications/services
   (e.g., [RFC7149]).  The policy decision-making process at the network
   side can be enriched with information signaled by applications, for
   instance by means of PCP.

1.1.  Note

   The CHECKPOINT_REQUIRED PCP option (Section 3) is defined in the
   Specification Required range (see Section 6).  In order to be
   assigned a code point in that range, a permanent publication is
   required as per Section 4.1 of [RFC5226].  Publication of an RFC is
   an ideal means of meeting this requirement and also eases
   interoperability.

   Note that this work was presented to the Port Control Protocol (pcp)
   WG, but there was no consensus to define this option in the
   "Standards Action" range even though positive feedback was received
   from the working group.  Technical comments that were received
   during pcp meetings and on the mailing list were addressed.

2.  Issues with the Existing Implementations

   Regardless of the selected technology or design (e.g., HA-based
   designs), reliably securing connections is expensive in terms of
   memory, CPU, and other resources.  Also, check-pointing may not be
   required for all connections, because not all connections are
   critical.
   However, this leaves the challenge of identifying which connections
   to check-point.

   Typically, this is addressed by identifying long-lived connections
   and check-pointing to the backup, for service continuity, the state
   of only those connections that have lived long enough.

   However, check-pointing long-lived connections raises the following
   issues:

   1.  It is hard for a network to identify/guess which connection is
       (business) critical.  This characterization is often customer-
       specific: a flow can be sensitive for User#1 while it is not for
       User#2.  Furthermore, this characterization can vary over time:
       a flow can be sensitive during hour X, while it is not at other
       times.

   2.  Heuristics are not deterministic.

   3.  A potentially long-lived connection may experience disruption
       if the active system fails before the connection has been
       check-pointed.

   4.  A connection may be critical without being long-lived (e.g., a
       Voice over IP (VoIP) conversation).

   5.  Likewise, not all long-lived connections are deemed critical: for
       example, connections that pertain to free Internet services are
       usually considered less critical than the equivalent connections
       for paid services.  Only the latter need to be check-pointed.

3.  CHECKPOINT_REQUIRED PCP Option

3.1.  Format

   The solution is based on the assumption that an application or user
   is the best judge of which of its connections are critical.

   An application or user may explicitly identify the connections that
   need to be check-pointed by means of a PCP client, using the
   CHECKPOINT_REQUIRED option as described in Figure 1.

   The entry to be backed up is indicated by the content of a MAP or
   PEER message.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |Option Code=TBA|  Reserved     |        Option Length          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Option Name: CHECKPOINT_REQUIRED
      Number: TBA
      Purpose: Indicate if an entry needs to be check-pointed.
      Valid for Opcodes: MAP, PEER
      Length: 0.
      May appear in: request, response.
      Maximum occurrences: 1.

                 Figure 1: CHECKPOINT_REQUIRED PCP Option

   The description of the fields is as follows:

   o  Option Code: To be assigned by IANA (see Section 6).

   o  Reserved: This field is initialized as specified in Section 7.3 of
      [RFC6887].

   o  Option Length: 0.  This means no data is included in the option.

   An application or user can take advantage of this PCP option to
   explicitly indicate which connections need to be check-pointed and
   should not be disrupted.  The processing of this option by the PCP
   server then results in the check-pointing of the corresponding
   states by the relevant devices or functions dynamically controlled
   by the PCP server.

   Communication between the application/user and the PCP client is
   implementation-specific.

3.2.  Operation

   Support of the CHECKPOINT_REQUIRED option by PCP servers and PCP
   clients is optional.  This option (Code TBA; see Figure 1) may be
   included in a PCP MAP/PEER request to indicate that a connection is
   to be protected against network failures.

   There is a risk that every PCP client may wish to check-point every
   connection, which can potentially overload the system.
   Administrators SHOULD restrict, per network attachment point (e.g.,
   CPE, host), the number of connections that can be elected to be
   backed up and the rate of check-pointing.  To that aim, the PCP
   server should unambiguously identify the network attachment point a
   PCP client belongs to.
   For example, the PCP server may rely on the PCP identity [RFC7652],
   the prefix assigned to a CPE/host, the subscriber-mask
   [I-D.vinapamula-softwire-dslite-prefix-binding], or other
   identification means.

   The PCP client includes a CHECKPOINT_REQUIRED option in a MAP or PEER
   request to signal that the corresponding mapping is to be protected.

   If the PCP client does not receive a CHECKPOINT_REQUIRED option in
   response to a PCP request that enclosed the CHECKPOINT_REQUIRED
   option, this means that the PCP server does not support the option,
   is configured to ignore the option, or cannot satisfy the request
   expressed in this option (e.g., because of a lack of resources).

   If the CHECKPOINT_REQUIRED option is not included in the PCP client
   request, the PCP server MUST NOT include the CHECKPOINT_REQUIRED
   option in the associated response.

   When the PCP server receives a CHECKPOINT_REQUIRED option, the PCP
   server checks if it can honor this request depending on whether
   resources are available for check-pointing.  If there are no
   resources available for check-pointing, but there are resources
   available to honor the MAP/PEER request, a response is sent back to
   the PCP client without including the CHECKPOINT_REQUIRED option
   (i.e., the request is processed as any MAP/PEER request that does not
   convey a CHECKPOINT_REQUIRED option).  If check-pointing resources
   are still available and the quota for this PCP client is not reached,
   the PCP server tags the corresponding entry as eligible for the HA
   mechanism and sends back the CHECKPOINT_REQUIRED option in the
   positive answer to the PCP client.

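   The server-side processing described above can be sketched as
   follows.  This is a minimal, non-normative Python illustration and
   is not part of the specification.  The PcpServer class, its handle()
   method, and the checkpoint_capacity and per_client_quota limits are
   hypothetical names chosen for the example.  The option is modeled by
   its name because its numeric code is still TBA, and the MAP/PEER
   part of the request is assumed to succeed.

```python
from dataclasses import dataclass, field

# Option modeled by name only; the numeric option code is still TBA.
CHECKPOINT_REQUIRED = "CHECKPOINT_REQUIRED"


@dataclass
class PcpServer:
    """Toy model of the check-pointing decision (all names are assumptions)."""

    checkpoint_capacity: int   # assumed global limit on check-pointed entries
    per_client_quota: int      # assumed limit per network attachment point
    used: dict = field(default_factory=dict)      # attachment point -> count
    flags: dict = field(default_factory=dict)     # (client, mapping) -> flag

    def handle(self, client_id, mapping, request_options):
        """Process a MAP/PEER request; return the options echoed back.

        The mapping itself is assumed to be granted; only the
        CHECKPOINT_REQUIRED decision is modeled here.
        """
        key = (client_id, mapping)
        # A renewal re-evaluates the flag, so release any slot held so far.
        if self.flags.get(key, False):
            self.used[client_id] -= 1

        wanted = CHECKPOINT_REQUIRED in request_options
        in_use_total = sum(self.used.values())
        in_use_client = self.used.get(client_id, 0)
        granted = (
            wanted
            and in_use_total < self.checkpoint_capacity
            and in_use_client < self.per_client_quota
        )
        if granted:
            self.used[client_id] = in_use_client + 1
        self.flags[key] = granted

        # The option is never echoed unless it was requested and honored.
        return {CHECKPOINT_REQUIRED} if granted else set()
```

   In this sketch, a client learns that check-pointing was refused
   simply because the option is absent from the response, which mirrors
   the behavior described above; no dedicated error code is modeled.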
   To update the check-pointing behavior of a mapping maintained by the
   PCP server, the PCP client generates a PCP MAP/PEER renewal request.
   The request includes a CHECKPOINT_REQUIRED option to indicate that
   the mapping has to be check-pointed, or omits the option to indicate
   that the mapping no longer needs to be check-pointed.  Upon receipt
   of the PCP request, the PCP server applies the same validation
   operations as for any MAP/PEER request that updates an existing
   mapping.  If the validation checks pass, the PCP server updates the
   check-point flag associated with that mapping accordingly (i.e., it
   is set if a CHECKPOINT_REQUIRED option was included in the update
   request, or it is cleared if no CHECKPOINT_REQUIRED option was
   included), and the PCP server returns the response to the PCP client
   accordingly.

   What information to check-point and how to check-point it are out of
   scope of this document and are left to implementations.  Also, the
   decision by users/applications to request check-pointing in a PCP
   request may be automatic, semi-automatic, or manual.  This behavior
   is also left to application implementations.  For managed CPEs, a
   service provider may influence which connections are check-pointed.

   It is RECOMMENDED to check-point state on the backup for honored
   requests before a response is sent to the PCP client.

4.  Sample Use cases

   Some examples are provided below for illustration purposes:

   Example 1:  Consider a streaming service, such as live TV
      broadcasting or any other media streaming, that supports the
      check-pointing signalling functionality.  Suppose this application
      is installed on three hosts A, B, and C.  For A, the service is
      critical and must not be interrupted, while for B it is not
      critical.  For C, only some programs are of interest.
      At the time of installing this application's software, the
      corresponding preferences can be provisioned.  When the
      application starts streaming:

      *  All the flows associated with the streaming application are
         critical for A.  Limiting the number of flows to be backed up
         ensures that the host does not exceed the user's limit.

      *  In the case of B, none of these flows needs to be
         check-pointed.  The CHECKPOINT_REQUIRED option is not included
         in the PCP requests.

      *  In the case of C, the user is invited to interact with the
         application by means of a configuration option that is
         provided to dynamically select which streams to check-point,
         based on the user's interest.

   Example 2:  Consider a streaming service offered by a provider.
      Suppose three levels of subscription are offered by that
      provider: e.g., gold, silver, bronze.  To guarantee a certain
      level of quality of service for each subscription, policies are
      configured such that:

      *  All flows associated with a gold subscription should be
         check-pointed.

      *  Only some flows associated with a silver subscription are
         check-pointed.

      *  None of the flows associated with a bronze subscription are
         check-pointed.

      When a user invokes the streaming service, he/she falls into one
      of those buckets and, according to the configured policy, his/her
      associated streaming flows are automatically check-pointed.
      Login credentials can be used as a trigger to determine the
      subscription level (and therefore the associated check-pointing
      behavior).

   Example 3:  Consider a VoIP application that is able to request its
      flows to be check-pointed.  No matter what is configured by the
      user, some calls such as emergency calls should be check-pointed.
      The application has to identify such calls.

   Example 4:  In the context of an enterprise network, applications are
      customized by the administrator.
      Whether a CHECKPOINT_REQUIRED option is to be included is
      determined by the administrator.  Only the subset of applications
      identified by the administrator will make use of this option, in
      conformance with the enterprise network management policies.  Any
      misbehavior can be considered as an abuse.

   To prevent every application from including a CHECKPOINT_REQUIRED
   option in its PCP requests, the following items are assumed:

   o  Applications may be delivered with some default settings for
      check-pointing, and these settings should be programmable by the
      end user.

   o  Exposing and enforcing these settings is application-specific.

   o  The end user may customize these settings as needed, based on
      his/her preferences.

5.  Security Considerations

   PCP-related security considerations are discussed in [RFC6887].

   The CHECKPOINT_REQUIRED option can be used by an attacker to identify
   critical flows, which is sensitive from a privacy standpoint.  Also,
   an attacker can cause critical flows to not be check-pointed by
   stripping the CHECKPOINT_REQUIRED option or by consuming the quota by
   adding the option to other flows.

   These two issues can be mitigated if the network on which the PCP
   messages are to be sent is fully trusted.  Means to defend against
   attackers who can intercept packets between the PCP server and the
   PCP client should be enabled.  In some deployments, access control
   lists (ACLs) can be installed on the PCP client, the PCP server, and
   the network between them, so that those ACLs allow only
   communications between trusted PCP elements.  If the networking
   environment between the PCP client and the PCP server is not secure,
   PCP authentication [RFC7652] MUST be enabled.

   A network device can always override the end-user signalling, i.e.,
   what is signaled by the PCP client, if the instructions conflict
   with the network policies.

6.  IANA Considerations

   The following PCP Option Code is to be allocated in the
   "Specification Required" range (192-223; optional-to-process range).
   The registry is maintained at
   http://www.iana.org/assignments/pcp-parameters.

      CHECKPOINT_REQUIRED set to TBA (see Section 3.1)

7.  References

7.1.  Normative references

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC6887]  Wing, D., Ed., Cheshire, S., Boucadair, M., Penno, R., and
              P. Selkirk, "Port Control Protocol (PCP)", RFC 6887,
              DOI 10.17487/RFC6887, April 2013.

   [RFC7652]  Cullen, M., Hartman, S., Zhang, D., and T. Reddy, "Port
              Control Protocol (PCP) Authentication Mechanism",
              RFC 7652, DOI 10.17487/RFC7652, September 2015.

7.2.  Informative References

   [I-D.vinapamula-softwire-dslite-prefix-binding]
              Vinapamula, S. and M. Boucadair, "Recommendations for
              Prefix Binding in the Softwire DS-Lite Context",
              draft-vinapamula-softwire-dslite-prefix-binding-12 (work
              in progress), October 2015.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              DOI 10.17487/RFC5226, May 2008.

   [RFC7149]  Boucadair, M. and C. Jacquenet, "Software-Defined
              Networking: A Perspective from within a Service Provider
              Environment", RFC 7149, DOI 10.17487/RFC7149, March 2014.

Appendix A.  Appendix

   It was tempting to include additional fields in the option, but this
   would lead to a more complex design that is not justified, e.g.:

   o  Define a dedicated field to indicate a priority level.  This
      priority is intended to be used by the PCP server as a hint when
      processing a request with a CHECKPOINT_REQUIRED option.
      Nevertheless, an application may systematically choose to set the
      priority level to the highest value in order to increase its
      chance of being serviced.

   o  Return a more granular failure error code to the requesting PCP
      client.  Nevertheless, this would require extra processing at both
      the PCP client and server sides to handle the various error
      codes, without any guarantee for the PCP client to have its
      mappings check-pointed.

Acknowledgments

   Thanks to Reinaldo Penno, Stuart Cheshire, Dave Thaler, Prashanth
   Patil, and Christian Jacquenet for their comments.

Authors' Addresses

   Suresh Vinapamula
   Juniper Networks
   1194 North Mathilda Avenue
   Sunnyvale, CA 94089
   USA

   Phone: +1 408 936 5441
   EMail: sureshk@juniper.net


   Senthil Sivakumar
   Cisco Systems
   7100-8 Kit Creek Road
   Research Triangle Park, NC 27760
   USA

   Phone: +1 919 392 5158
   EMail: ssenthil@cisco.com


   Mohamed Boucadair
   Orange
   Rennes 35000
   France

   EMail: mohamed.boucadair@orange.com


   Tirumaleswar Reddy
   Cisco Systems, Inc.
   Cessna Business Park, Varthur Hobli
   Sarjapur Marathalli Outer Ring Road
   Bangalore, Karnataka 560103
   India

   EMail: tireddy@cisco.com