idnits 2.17.1 

draft-vinapamula-flow-ha-14.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 1, 2015) is 3097 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Obsolete informational reference (is this intentional?): RFC 5226
     (Obsoleted by RFC 8126)


     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                      S. Vinapamula
3	Internet-Draft                                          Juniper Networks
4	Intended status: Informational                              S. Sivakumar
5	Expires: May 4, 2016                                       Cisco Systems
6	                                                            M. Boucadair
7	                                                                  Orange
8	                                                                T. Reddy
9	                                                                   Cisco
10	                                                        November 1, 2015

12	  Application-Initiated Flow High Availability Awareness through Port
13	                         Control Protocol (PCP)
14	                      draft-vinapamula-flow-ha-14

16	Abstract

18	   This document specifies a mechanism for a host to signal via Port
19	   Control Protocol (PCP) which connections should be protected against
20	   network failures.  These connections will be elected to be subject to
21	   high availability mechanisms enabled at the network side.

23	   This approach assumes that applications/users have more visibility
24	   about sensitive connections rather than any heuristic that can be
25	   enabled at the network side to guess which connections should be
26	   check-pointed.

28	Requirements Language

30	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
31	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
32	   document are to be interpreted as described in RFC 2119 [RFC2119].

34	Status of This Memo

36	   This Internet-Draft is submitted in full conformance with the
37	   provisions of BCP 78 and BCP 79.

39	   Internet-Drafts are working documents of the Internet Engineering
40	   Task Force (IETF).  Note that other groups may also distribute
41	   working documents as Internet-Drafts.  The list of current Internet-
42	   Drafts is at http://datatracker.ietf.org/drafts/current/.

44	   Internet-Drafts are draft documents valid for a maximum of six months
45	   and may be updated, replaced, or obsoleted by other documents at any
46	   time.  It is inappropriate to use Internet-Drafts as reference
47	   material or to cite them other than as "work in progress."

48	   This Internet-Draft will expire on May 4, 2016.

50	Copyright Notice

52	   Copyright (c) 2015 IETF Trust and the persons identified as the
53	   document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents
57	   (http://trustee.ietf.org/license-info) in effect on the date of
58	   publication of this document.  Please review these documents
59	   carefully, as they describe your rights and restrictions with respect
60	   to this document.  Code Components extracted from this document must
61	   include Simplified BSD License text as described in Section 4.e of
62	   the Trust Legal Provisions and are provided without warranty as
63	   described in the Simplified BSD License.

65	Table of Contents

67	   1.  Introduction
68	     1.1.  Note
69	   2.  Issues with the Existing Implementations
70	   3.  CHECKPOINT_REQUIRED PCP Option
71	     3.1.  Format
72	     3.2.  Operation
73	   4.  Sample Use cases
74	   5.  Security Considerations
75	   6.  IANA Considerations
76	   7.  References
77	     7.1.  Normative references
78	     7.2.  Informative References
79	   Appendix A.  Appendix
80	   Acknowledgments
81	   Authors' Addresses

83	1.  Introduction

85	   The risk of Internet service disruption is critical in service
86	   provider and enterprise networking environments.  Such a risk is
87	   often mitigated with the introduction of active/backup systems.  Such
88	   designs not only help minimize the risk of service disruption, but
89	   also facilitate maintenance operations (e.g., hitless H/W or S/W
90	   upgrades).

92	   In addition, some connections lead the network functions invoked at
93	   connection establishment to create and maintain connection-specific
94	   states.  For such a connection to survive an active/backup failover
95	   in case of a network failure, these states need to be check-pointed
96	   to the backup system.  Additional issues are further discussed in
97	   Section 2.

99	   Heuristics based on the protocol, mapping lifetime, etc., are used in
100	   the network to elect which connections need to be check-pointed
101	   (e.g., by means of high availability techniques).  This document
102	   advocates for an application-initiated approach that would allow
103	   applications/users to signal to the network which of their
104	   connections are critical.

106	   This document specifies how PCP [RFC6887] can be extended to signal
107	   which connections should be check-pointed for high availability
108	   (Section 3).  A set of use cases is provided for illustration
109	   purposes in Section 4.  This document does not make any assumption
110	   about the PCP-controlled device that will process the PCP-formatted
111	   signaling information from PCP clients.  These devices are likely to
112	   be flow-aware.

114	   The approach in this document is aligned with the networking trends
115	   advocating for open network APIs to interact with applications/
116	   services (e.g., [RFC7149]).  The policy decision-making process at
117	   the network side can be enriched with information signaled by
118	   applications, for instance using PCP.

120	1.1.  Note

122	   The CHECKPOINT_REQUIRED PCP option (Section 3) is defined in the
123	   Specification Required range (see Section 6).  In order to be
124	   assigned a code point in that range, a permanent publication is
125	   required as per Section 4.1 of [RFC5226].  Publication of an RFC is
126	   an ideal means of meeting this requirement and also eases
127	   interoperability.

129	   Note that this work was presented to the Port Control Protocol (pcp)
130	   WG, but there was no consensus to define this option in the
131	   "Standards Action" range, although positive feedback was received
132	   from the working group.  Technical comments received during pcp
133	   meetings and on the mailing list were addressed.

135	2.  Issues with the Existing Implementations

137	   Regardless of the selected technology or design (e.g., HA-based
138	   designs), reliably securing connections is expensive in terms of
139	   memory, CPU, and other resources.  Also, check-pointing may not be
140	   required for all connections, as not all connections are critical.
141	   This leaves the challenge of identifying which connections to check-
142	   point.

144	   Typically, long-lived connections are identified, and only the states
145	   of connections that have lived long enough are check-pointed to the
146	   backup for service continuity.

151	   However, check-pointing long-lived connections raises the following
152	   issues:

154	   1.  It is hard for a network to identify/guess which connection is
155	       (business) critical.  This characterization is often customer-
156	       specific: a flow can be sensitive for User#1 while it is not
157	       for User#2.  Furthermore, this characterization can vary over
158	       time: a flow can be sensitive during hour X, while it is not at
159	       other times.

161	   2.  Heuristics are not deterministic.

163	   3.  A potentially long-lived connection may experience disruption
164	       upon failure of the active system, but before it is check-
165	       pointed.

167	   4.  A connection may be short-lived yet critical (e.g., a Voice over
168	       IP (VoIP) conversation).

170	   5.  Likewise, not all long-lived connections are deemed critical: for
171	       example, connections that pertain to free Internet services are
172	       usually considered not critical compared to the equivalent
173	       connections for paid services.  Only the latter need to be check-
174	       pointed.

176	3.  CHECKPOINT_REQUIRED PCP Option

178	3.1.  Format

180	   The solution is based on the assumption that an application or user
181	   is the best judge to decide which of its connections are critical.

183	   An application or user may explicitly identify the connections that
184	   need to be check-pointed by means of a PCP client, using the
185	   CHECKPOINT_REQUIRED option as described in Figure 1.

187	   The entry to be backed up is indicated by the content of a MAP or
188	   PEER message.

190	    0                   1                   2                   3
191	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
192	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
193	   |Option Code=TBA|  Reserved     |        Option Length          |
194	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

196	            Option Name: CHECKPOINT_REQUIRED
197	            Number: <TBA>
198	            Purpose:  Indicate if an entry needs to be check-pointed.
199	            Valid for Opcodes: MAP, PEER
200	            Length: 0.
201	            May appear in: request, response.
202	            Maximum occurrences: 1.

204	                 Figure 1: CHECKPOINT_REQUIRED PCP Option

206	   The description of the fields is as follows:

208	   o  Option Code: To be assigned by IANA (see Section 6).

210	   o  Reserved: This field is initialized as specified in Section 7.3 of
211	      [RFC6887].

213	   o  Option Length: 0.  This means no data is included in the option.
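
   As an illustration of the format in Figure 1, the following minimal
   Python sketch builds the 4-octet option (one octet of Option Code,
   one Reserved octet set to zero, and a 16-bit Option Length of 0).
   The constant name and the value 192 are assumptions made only for
   this example: the actual code point is still "TBA" in this document.

```python
import struct

# Placeholder only: the real code point is "TBA" in this document.
# 192 is simply the first value of the 192-223 range named in Section 6.
CHECKPOINT_REQUIRED_CODE = 192


def encode_checkpoint_required(code: int = CHECKPOINT_REQUIRED_CODE) -> bytes:
    """Build the option: code, Reserved (zero), 16-bit length of 0."""
    if not 0 <= code <= 255:
        raise ValueError("option code must fit in one octet")
    # "!" selects network byte order; B = 1 octet, H = 2 octets.
    return struct.pack("!BBH", code, 0, 0)
```

   Because the Option Length is 0, the option carries no data and needs
   no padding; the four octets above are the whole option.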

215	   An application or user can take advantage of this PCP option to
216	   explicitly indicate which of the connections need to be check-pointed
217	   and should not be disrupted.  The processing of this option by the
218	   PCP server will then yield the check-pointing of the corresponding
219	   states by the relevant devices or functions dynamically controlled by
220	   the PCP server.

222	   Communication between application/user and PCP client is
223	   implementation-specific.

225	3.2.  Operation

227	   Support of the CHECKPOINT_REQUIRED option by PCP servers and PCP
228	   clients is optional.  This option (Code TBA; see Figure 1) may be
229	   included in a PCP MAP/PEER request to indicate a connection is to be
230	   protected against network failures.

232	   There is a risk that every PCP client may wish to check-point every
233	   connection, which can potentially overload the system.
234	   Administrators SHOULD restrict the number of connections that can be
235	   elected to be backed up and the rate of check-pointing on a per
236	   network attachment point basis (e.g., CPE, host).  To that aim, the
237	   PCP server should unambiguously identify the network attachment point
238	   a PCP client belongs to.  For example, the PCP server may rely on the
239	   PCP identity [RFC7652], the prefix assigned to a CPE/host, the
240	   subscriber-mask [I-D.vinapamula-softwire-dslite-prefix-binding], or
241	   other identification means.
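
   One possible way to enforce such limits is sketched below in Python.
   It is not part of this specification: the class name, its
   parameters, and the sliding-window rate check are all assumptions
   chosen for illustration, and the attachment point is represented by
   an opaque string (e.g., a prefix or a PCP identity).

```python
from collections import defaultdict


class CheckpointLimiter:
    """Hypothetical per attachment point quota and rate limiter."""

    def __init__(self, max_mappings: int, max_per_window: int, window_s: float):
        self.max_mappings = max_mappings      # cap on check-pointed mappings
        self.max_per_window = max_per_window  # admissions allowed per window
        self.window_s = window_s              # window length in seconds
        self._active = defaultdict(int)       # attachment point -> count
        self._recent = defaultdict(list)      # attachment point -> timestamps

    def try_admit(self, attachment_point: str, now: float) -> bool:
        """Return True and record the admission if both limits allow it."""
        # Keep only admissions that are still inside the rate window.
        recent = [t for t in self._recent[attachment_point]
                  if now - t < self.window_s]
        self._recent[attachment_point] = recent
        if self._active[attachment_point] >= self.max_mappings:
            return False
        if len(recent) >= self.max_per_window:
            return False
        recent.append(now)
        self._active[attachment_point] += 1
        return True

    def release(self, attachment_point: str) -> None:
        """Call when a check-pointed mapping expires or is deleted."""
        if self._active[attachment_point] > 0:
            self._active[attachment_point] -= 1
```

   A PCP server could call try_admit() when it receives a request that
   carries the option, and release() when the mapping goes away.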

243	   The PCP client includes a CHECKPOINT_REQUIRED option in a MAP or PEER
244	   request to signal that the corresponding mapping is to be protected.

246	   If the PCP client does not receive a CHECKPOINT_REQUIRED option in
247	   response to a PCP request that enclosed the CHECKPOINT_REQUIRED
248	   option, this means that the PCP server does not support the option,
249	   is configured to ignore the option, or cannot satisfy the request
250	   expressed in this option (e.g., because of a lack of resources).
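
   A PCP client can detect this by looking for the option in the
   response.  The Python sketch below walks the options region of a
   response using the generic option header of [RFC6887] (code,
   reserved octet, 16-bit length, data padded to a multiple of 4
   octets).  The function names and the code point 192 are assumptions
   for illustration; the caller is assumed to have already isolated
   the options that follow the opcode-specific data.

```python
import struct

CHECKPOINT_REQUIRED_CODE = 192  # placeholder; the code point is still TBA


def option_codes(options: bytes) -> list:
    """List the option codes found in the options region of a message."""
    codes = []
    offset = 0
    while offset + 4 <= len(options):
        code, _reserved, length = struct.unpack_from("!BBH", options, offset)
        codes.append(code)
        # Skip the header plus the data, rounded up to a multiple of 4.
        offset += 4 + ((length + 3) // 4) * 4
    return codes


def checkpoint_granted(response_options: bytes) -> bool:
    """True if the server echoed CHECKPOINT_REQUIRED in its response."""
    return CHECKPOINT_REQUIRED_CODE in option_codes(response_options)
```

   When checkpoint_granted() returns False, the client cannot tell
   which of the three causes above applies; it only knows that the
   mapping is not check-pointed.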

253	   If the CHECKPOINT_REQUIRED option is not included in the PCP client
254	   request, the PCP server MUST NOT include the CHECKPOINT_REQUIRED
255	   option in the associated response.

257	   When the PCP server receives a CHECKPOINT_REQUIRED option, the PCP
258	   server checks if it can honor this request depending on whether
259	   resources are available for check-pointing.  If there are no
260	   resources available for check-pointing, but there are resources
261	   available to honor the MAP/PEER request, a response is sent back to
262	   the PCP client without including the CHECKPOINT_REQUIRED option
263	   (i.e., the request is processed as any MAP/PEER request that does not
264	   convey a CHECKPOINT_REQUIRED option).  If check-pointing resources
265	   are still available and the quota for this PCP client is not reached,
266	   the PCP server tags the corresponding entry as eligible for the HA
267	   mechanism and sends back the CHECKPOINT_REQUIRED option in the
268	   positive response to the PCP client.
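
   The server-side decision described above can be summarized with the
   following Python sketch.  The names are hypothetical, and treating
   an exhausted quota the same way as a lack of check-pointing
   resources is an assumption of this example (the text only states
   that the option is absent when the server cannot satisfy the
   request).

```python
from dataclasses import dataclass


@dataclass
class Decision:
    create_mapping: bool   # the MAP/PEER request itself is honored
    checkpoint: bool       # the entry is tagged as eligible for HA
    echo_option: bool      # CHECKPOINT_REQUIRED is put in the response


def decide(option_in_request: bool, mapping_resources: bool,
           checkpoint_resources: bool, quota_left: bool) -> Decision:
    """Decide how to answer a MAP/PEER request (illustrative only)."""
    if not mapping_resources:
        # Regular PCP error handling applies; nothing is check-pointed.
        return Decision(False, False, False)
    # The option is echoed only when it was requested and can be honored,
    # so it is never returned for a request that did not carry it.
    honored = option_in_request and checkpoint_resources and quota_left
    return Decision(True, honored, honored)
```

   In every case where the mapping can be created but check-pointing
   cannot be granted, the request is answered as a plain MAP/PEER
   request.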

270	   To update the check-pointing behavior of a mapping maintained by the
271	   PCP server, the PCP client generates a PCP MAP/PEER renewal request
272	   that either includes a CHECKPOINT_REQUIRED option, to indicate that
273	   this mapping has to be check-pointed, or omits it, to indicate that
274	   this mapping no longer needs to be check-pointed.  Upon receipt of
275	   the PCP request, the PCP server proceeds with the same operations
276	   used to validate a MAP/PEER request updating an existing mapping.
277	   If validation checks are passed, the PCP server updates the check-
278	   point flag associated with that mapping accordingly (i.e., it is set
279	   if a CHECKPOINT_REQUIRED option was included in the update request or
280	   it is cleared if no CHECKPOINT_REQUIRED option was included), and the
281	   PCP server returns the response to the PCP client accordingly.
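
   The renewal behavior can be pictured with the short Python sketch
   below.  The mapping table, its key, and the function name are
   assumptions for illustration; validation of the renewal and the
   resource/quota checks are assumed to have passed already.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    lifetime: int             # remaining lifetime in seconds
    checkpoint: bool = False  # check-point flag of this mapping


def renew(table: dict, key: tuple, lifetime: int,
          option_in_request: bool) -> bool:
    """Apply a validated renewal; return whether to echo the option."""
    mapping = table[key]  # raises KeyError if the mapping does not exist
    mapping.lifetime = lifetime
    # Set the flag if the option was present, clear it otherwise.
    mapping.checkpoint = option_in_request
    return mapping.checkpoint
```

   A client that wants to stop protecting a mapping therefore only has
   to renew it without the option.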

284	   What information to check-point and how to check-point it are out of
285	   scope of this document and are left to implementations.  Also, the
286	   decision by users/applications to request check-pointing in a PCP
287	   request may be automatic, semi-automatic, or manual.  This behavior
288	   is also left to application implementations.  For managed CPEs, a
289	   service provider may influence which connections are check-pointed.

292	   For honored requests, it is RECOMMENDED to check-point the state on
293	   the backup before a response is sent to the PCP client.

295	4.  Sample Use cases

297	   The following examples are provided for illustration purposes:

299	   Example 1:  Consider a streaming service such as live TV
300	      broadcasting, or any other media streaming, that supports check-
301	      pointing signalling functionality.  Suppose this application is
302	      installed on three hosts: A, B, and C.  For A, the service is
303	      critical and must not be interrupted, while for B it is not
304	      critical.  For C, only some programs are of interest.  At the
305	      time of installing this application's software, corresponding
306	      preferences can be provisioned.  When the application starts
307	      streaming:

308	      *  All the flows associated with the streaming application are
309	         critical for A.  Limiting the number of flows to be backed up
310	         ensures that the host does not exceed the user's limit.

312	      *  For B, none of these flows are critical.  The
313	         CHECKPOINT_REQUIRED option is not included in the PCP
314	         requests.

316	      *  For C, the user is invited to interact with the application by
317	         means of a configuration option that dynamically selects which
318	         streams to check-point, based on the user's interest.

321	   Example 2:  Consider a streaming service offered by a provider.
322	      Suppose three levels of subscription are offered by that
323	      provider (e.g., gold, silver, bronze).  To guarantee a certain
324	      level of quality of service for each subscription, policies are
325	      configured such that:

327	      *  All flows associated with a gold subscription should be check-
328	         pointed.

330	      *  Only some flows associated with a silver subscription are
331	         check-pointed.

333	      *  None of the flows associated with a bronze subscription are
334	         check-pointed.

336	      When a user invokes the streaming service, the user falls into
337	      one of those buckets and, according to the configured policy,
338	      the associated streaming flows are check-pointed accordingly.
339	      Login credentials can be used as a trigger to determine the
340	      subscription level (and therefore the associated check-pointing
341	      behavior).

343	   Example 3:  Consider a VoIP application that is able to request its
344	      flows to be check-pointed.  No matter what is configured by the
345	      user, some calls such as emergency calls should be check-pointed.
346	      The application has to identify such calls.

348	   Example 4:  In the context of an enterprise network, applications are
349	      customized by the administrator.  Whether a CHECKPOINT_REQUIRED
350	      option is to be included is determined by the administrator.
351	      Only the subset of applications identified by the administrator
352	      will make use of this option in conformance with the enterprise
353	      network management policies.  Any misbehavior can be considered
354	      as an abuse.
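
   As an illustration of Example 2, the subscription policy could be
   expressed on the client side as in the Python sketch below.  The
   function name and the "flow_selected" input are assumptions: the
   example above does not say how the "some flows" of a silver
   subscription are chosen, so that choice is left to the caller.

```python
def wants_checkpoint(tier: str, flow_selected: bool = False) -> bool:
    """Decide whether to add CHECKPOINT_REQUIRED to a flow's request."""
    if tier == "gold":
        return True           # all flows are check-pointed
    if tier == "silver":
        return flow_selected  # only some flows; selection rule is assumed
    return False              # bronze, or any unknown tier: none
```

   The PCP server remains free to decline the request, for instance
   when the quota of the attachment point is exhausted.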

356	   To prevent every application from including a CHECKPOINT_REQUIRED
357	   option in its PCP requests, the following items are assumed:

360	   o  Applications may be delivered with some default settings for
361	      check-pointing, and these settings should be configurable by the
362	      end user.

364	   o  Exposing and enforcing these settings is application-specific.

366	   o  The end user may customize these settings as needed, based on
367	      their preferences.

369	5.  Security Considerations

371	   PCP-related security considerations are discussed in [RFC6887].

373	   The CHECKPOINT_REQUIRED option can be used by an attacker to
374	   identify critical flows, which is sensitive from a privacy
375	   standpoint.  Also, an attacker can cause critical flows to not be
376	   check-pointed by stripping the CHECKPOINT_REQUIRED option or by
377	   consuming the quota by adding the option to other flows.

379	   These two issues can be mitigated if the network on which the PCP
380	   messages are to be sent is fully trusted.  Means to defend against
381	   attackers who can intercept packets between the PCP server and the
382	   PCP client should be enabled.  In some deployments, access control
383	   lists (ACLs) can be installed on the PCP client, PCP server, and the
384	   network between them, so those ACLs allow only communications between
385	   trusted PCP elements.  If the networking environment between the PCP
386	   client and the PCP server is not secure, PCP authentication [RFC7652]
387	   MUST be enabled.

389	   A network device can always override the end-user signalling, i.e.,
390	   what is signaled by the PCP client, if the instructions are
391	   conflicting with the network policies.

393	6.  IANA Considerations

395	   The following PCP Option Code is to be allocated in the
396	   "Specification Required" range (192-223; optional to process range)
397	   (the registry is maintained at
398	   http://www.iana.org/assignments/pcp-parameters):

400	      CHECKPOINT_REQUIRED set to TBA (see Section 3.1)

402	7.  References

404	7.1.  Normative references

406	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
407	              Requirement Levels", BCP 14, RFC 2119,
408	              DOI 10.17487/RFC2119, March 1997,
409	              <http://www.rfc-editor.org/info/rfc2119>.

411	   [RFC6887]  Wing, D., Ed., Cheshire, S., Boucadair, M., Penno, R., and
412	              P. Selkirk, "Port Control Protocol (PCP)", RFC 6887,
413	              DOI 10.17487/RFC6887, April 2013,
414	              <http://www.rfc-editor.org/info/rfc6887>.

416	   [RFC7652]  Cullen, M., Hartman, S., Zhang, D., and T. Reddy, "Port
417	              Control Protocol (PCP) Authentication Mechanism",
418	              RFC 7652, DOI 10.17487/RFC7652, September 2015,
419	              <http://www.rfc-editor.org/info/rfc7652>.

421	7.2.  Informative References

423	   [I-D.vinapamula-softwire-dslite-prefix-binding]
424	              Vinapamula, S. and M. Boucadair, "Recommendations for
425	              Prefix Binding in the Softwire DS-Lite Context", draft-
426	              vinapamula-softwire-dslite-prefix-binding-12 (work in
427	              progress), October 2015.

429	   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
430	              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
431	              DOI 10.17487/RFC5226, May 2008,
432	              <http://www.rfc-editor.org/info/rfc5226>.

434	   [RFC7149]  Boucadair, M. and C. Jacquenet, "Software-Defined
435	              Networking: A Perspective from within a Service Provider
436	              Environment", RFC 7149, DOI 10.17487/RFC7149, March 2014,
437	              <http://www.rfc-editor.org/info/rfc7149>.

439	Appendix A.  Appendix

441	   It was tempting to include additional fields in the option, but this
442	   would lead to a more complex design that is not justified, e.g.:

444	   o  Define a dedicated field to indicate a priority level.  This
445	      priority is intended to be used by the PCP server as a hint when
446	      processing a request with a CHECKPOINT_REQUIRED option.
447	      Nevertheless, an application may systematically choose to set the
448	      priority level to the highest value so that it increases its
449	      chance to be serviced.

451	   o  Return a more granular failure error code to the requesting PCP
452	      client.  Nevertheless this would require extra processing at both
453	      the PCP client and server sides for handling the various error
454	      codes without any guarantee for the PCP client to have its
455	      mappings check-pointed.

457	Acknowledgments

459	   Thanks to Reinaldo Penno, Stuart Cheshire, Dave Thaler, Prashanth
460	   Patil, and Christian Jacquenet for their comments.

462	Authors' Addresses

464	   Suresh Vinapamula
465	   Juniper Networks
466	   1194 North Mathilda Avenue
467	   Sunnyvale, CA  94089
468	   USA

470	   Phone: +1 408 936 5441
471	   EMail: sureshk@juniper.net

473	   Senthil Sivakumar
474	   Cisco Systems
475	   7100-8 Kit Creek Road
476	   Research Triangle Park, NC  27760
477	   USA

479	   Phone: +1 919 392 5158
480	   EMail: ssenthil@cisco.com

481	   Mohamed Boucadair
482	   Orange
483	   Rennes  35000
484	   France

486	   EMail: mohamed.boucadair@orange.com

488	   Tirumaleswar Reddy
489	   Cisco Systems, Inc.
490	   Cessna Business Park, Varthur Hobli
491	   Sarjapur Marathalli Outer Ring Road
492	   Bangalore, Karnataka  560103
493	   India

495	   EMail: tireddy@cisco.com