idnits 2.17.1 

draft-purkayastha-sfc-service-indirection-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 1, 2018) is 2248 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-12) exists of
     draft-ietf-bier-use-cases-06


     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                     D. Purkayastha
3	Internet-Draft                                                 A. Rahman
4	Intended status: Informational                                D. Trossen
5	Expires: September 2, 2018              InterDigital Communications, LLC
6	                                                           Z. Despotovic
7	                                                              R. Khalili
8	                                                                  Huawei
9	                                                           March 1, 2018

11	    Alternative Handling of Dynamic Chaining and Service Indirection
12	              draft-purkayastha-sfc-service-indirection-02

14	Abstract

16	   Many stringent requirements are imposed on today's network, such as
17	   low latency, high availability and reliability in order to support
18	   several use cases such as IoT, Gaming, Content distribution, Robotics
19	   etc.  Networks need to be flexible and dynamic in terms of allocation
20	   of services and resources.  Network Operators should be able to
21	   reconfigure the composition of a service and steer users towards new
22	   service end points as users move or resource availability changes.
23	   SFC allows network operators to easily create and reconfigure service
24	   function chains dynamically in response to changing network
25	   requirements.  We discuss a use case where a Service Function Chain
26	   can adapt or self-organize as demanded by network conditions without
27	   requiring SPI re-classification.  This can be achieved, for example,
28	   by decoupling the service consumer and service endpoint by a new
29	   service function proposed in this draft.  We describe a few
30	   requirements for this service function to enable dynamic switching
31	   between consumer and end point.

33	Status of This Memo

35	   This Internet-Draft is submitted in full conformance with the
36	   provisions of BCP 78 and BCP 79.

38	   Internet-Drafts are working documents of the Internet Engineering
39	   Task Force (IETF).  Note that other groups may also distribute
40	   working documents as Internet-Drafts.  The list of current Internet-
41	   Drafts is at https://datatracker.ietf.org/drafts/current/.

43	   Internet-Drafts are draft documents valid for a maximum of six months
44	   and may be updated, replaced, or obsoleted by other documents at any
45	   time.  It is inappropriate to use Internet-Drafts as reference
46	   material or to cite them other than as "work in progress."

48	   This Internet-Draft will expire on September 2, 2018.

50	Copyright Notice

52	   Copyright (c) 2018 IETF Trust and the persons identified as the
53	   document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents
57	   (https://trustee.ietf.org/license-info) in effect on the date of
58	   publication of this document.  Please review these documents
59	   carefully, as they describe your rights and restrictions with respect
60	   to this document.  Code Components extracted from this document must
61	   include Simplified BSD License text as described in Section 4.e of
62	   the Trust Legal Provisions and are provided without warranty as
63	   described in the Simplified BSD License.

65	Table of Contents

67	   1.  Introduction
68	   2.  Use Case Description
69	     2.1.  Data Center
70	     2.2.  Third party cloud service provider
71	     2.3.  ETSI MEC USE CASE
72	     2.4.  3GPP
73	     2.5.  Use Case Analysis
74	   3.  NSH and Re-classification
75	     3.1.  Dynamic service chain creation using NSH
76	   4.  Challenges with dynamic indirection
77	   5.  HTTP as a transport
78	   6.  Service Request Routing (SRR) Service Function
79	     6.1.  Overview
80	     6.2.  Details of SRR Function
81	   7.  Protocol Consideration
82	   8.  Next Steps
83	   9.  IANA Considerations
84	   10. Security Considerations
85	   11. Informative References
86	   Authors' Addresses

88	1.  Introduction

90	   The requirements on today's networks are very diverse, enabling
91	   multiple use cases such as IoT, Content Distribution, Gaming, and
92	   network functions such as Cloud RAN.  Every use case imposes certain
93	   requirements on the network.  These requirements vary from one
94	   extreme to the other and often pull in divergent directions.
95	   Network operators and service providers are pushing many functions
96	   towards the edge of the network in order to be closer to the users.

98	   This reduces latency and backhaul traffic, as user requests can be
99	   processed locally.

101	   Service delivery becomes more challenging when network congestion,
102	   user mobility and non-deterministic availability of compute and
103	   storage resources are considered.  The impact is felt most at the
104	   edge of the network because, as users move, their point of
105	   attachment changes frequently, which results in (at least partially)
106	   relocating the service as well as the service endpoint.  Furthermore,
107	   network functions are pushed more and more towards the edge, where
108	   network, compute and storage resources are constrained and
109	   availability is non-deterministic.  Constrained network resources
110	   may lead to congestion in the network.  Also, for content delivery
111	   applications, storage resources may need to be moved to where users
112	   are concentrated.

114	   We describe a few use cases in the next section and derive the
115	   requirements for composing new services and service paths in a
116	   dynamic edge network.  We address this dynamicity by introducing a
117	   special Service Function, called SRR (Service Request Routing).  We
118	   describe the problems associated with handling dynamicity in today's
119	   networks using Layer 3 based approaches.  We then discuss how such a
120	   new Service Function with certain capabilities can handle the
121	   dynamicity better than these conventional methods.

123	2.  Use Case Description

125	2.1.  Data Center

127	   The data center use case draft [I-D.ietf-sfc-dc-use-cases] describes
128	   an East West traffic use case.  This is the predominant traffic in
129	   data centers today.  Server virtualization has led to the new
130	   paradigm where virtual machines can migrate from one server to
131	   another across the data center.  This explosion in east-west traffic
132	   is leading to newer data center network fabric architectures that
133	   provide consistent latencies from one point in the fabric to another.

135	   SFCs applied in an enterprise or service provider data center can be
136	   broadly categorized into two types:

138	   o  Access SFCs

140	   o  Application SFCs

142	   Access SFCs are focused on servicing traffic entering and leaving the
143	   data center while Application SFCs are focused on servicing traffic
144	   destined to applications.  Service providers deploy a single "Access
145	   SFC" and multiple "Application SFCs" for each tenant.  Enterprise
146	   data center operators on the other hand may not have a need for
147	   Access SFCs depending on the size and requirements of the enterprise.

149	   In carrier networks, operators may deploy multiple data centers
150	   dispersed geographically.  Each data center may host different types
151	   of service functions.  For example, latency sensitive or high usage
152	   service functions are deployed in regional data centers while other
153	   latency tolerant, low usage service functions are deployed in global
154	   or central data centers.  In such deployments, SFCs may span multiple
155	   data centers and enable operators to deploy services in a flexible
156	   and inexpensive way.

158	   Within a data center as well as in inter data center scenarios,
159	   users are serviced by multiple SFs distributed inside as well as
160	   outside a location.  In this scenario, service function chains
161	   should be able to reselect SFs and redirect traffic very fast.
162	   The draft identifies that static service chains do not allow the
163	   SFCs to be modified, whereas scaling service capacity up and down
164	   requires the ability to add or remove SNs.  Likewise, the ability
165	   to dynamically pick one among many SN instances is not
166	   available.

168	2.2.  Third party cloud service provider

170	   This use case is related to an emerging business model, where
171	   computational resources for edge cloud service are provided by
172	   alternative facility providers that are non-traditional network
173	   operators.  For many specific localized use cases, network
174	   operators may not have the necessary real estate available.  They
175	   may also be unwilling to incur CAPEX and OPEX for such a
176	   point-of-presence, because there is no clear path for sustainable
177	   cost recovery [UKNIC].

179	   The industry is witnessing the emergence of real estate owners such
180	   as building asset or management companies, cell tower owners, railway
181	   companies or other facility owners willing to deploy edge cloud
182	   resources.  The facility provider, e.g., a cell tower owner or
183	   building management company, deploys edge computing resources
184	   throughout its installations in the country.  It has its own
185	   operation and management software, which can deploy resources, scale
186	   them up or down, and deploy edge applications from third party
187	   service providers.  It can offer service to more than one network
188	   operator at a specific location, thus acting as a "neutral host".
189	   The facility provider, which owns cloud resources and provides
190	   application services, is referred to as a "Third party Edge Owner
191	   (TEO)".

193	   There is more than one stakeholder in this ecosystem, e.g., Network
194	   Service Provider, Real estate owner, Cloud capability (compute and
195	   storage resource) provider, Application/service provider.  An entity
196	   can assume more than one role.  From the network operator's point of
197	   view, there may be a "Cloud provider" or a "Cloud service provider",
198	   depending on the roles assumed by the external entity.

200	   "Cloud Providers" provide cloud resources (compute and storage) to
201	   network operators.  Network operators rent those resources and manage
202	   the MEC host themselves.  A network operator can set up application
203	   traffic rules so that traffic can be processed by that host.

205	   "Cloud Service Providers" not only make resources available to
206	   network operators or service providers, but also provide management
207	   and hosting services.  They can host edge applications on behalf of
208	   application service providers and set up user plane traffic to be
209	   steered towards the edge application.

211	   Cloud Service Providers, as well as many organizations that need to
212	   share and analyze a quickly growing amount of data, such as
213	   retailers, manufacturers, telcos, financial services firms, and many
214	   more, are turning to localized Micro Data Centers (MDCs) installed on
215	   the factory floor, in the telco central office, the back of a retail
216	   outlet, etc.  The solution applies to a broad base of applications
217	   that require low latency, high bandwidth, or both.

219	   As Micro Data Centers are deployed at the edge of the network, common
220	   deployment options are:

222	   o  Micro Data Centers are deployed on L2 in the edge of the network

224	   o  Instead of a single Internet Point Of Presence (POP) deployment,
225	      a multiple Internet POP deployment is desirable to localize data

227	   o  A service is composed out of these multiple POP deployments of
228	      MDCs, where data exchange and collaboration are expected among
229	      the MDCs

230	   o  Due to mobility, changes in network condition (e.g. congestion,
231	      load), service composition may change frequently to support
232	      promised quality of experience

234	2.3.  ETSI MEC USE CASE

236	   Take the following video orchestration service example from the ETSI
237	   MEC Requirements document [ETSI_MEC].  The proposed use case of edge
238	   video orchestration suggests a scenario where visual content can be
239	   produced and consumed at the same location close to consumers in a
240	   densely populated and clearly limited area.  Such a case could be a
241	   sports event or concert where a large number of consumers are using
242	   their handheld devices to access user-selected, tailored content.
243	   The overall video experience is combined from multiple sources, such
244	   as local recording devices, which may be fixed as well as mobile, and
245	   master video from a central production server.  The user is given an
246	   opportunity to select tailored views from a set of local video
247	   sources.

249	2.4.  3GPP

251	   3GPP Rel. 15 introduces the notion of the service-based interface
252	   (SBI) as an alternative to the traditional call pattern invocation of
253	   network functions.  This introduction targets the support for
254	   replication, e.g., driven by virtualized functions, as well as
255	   supporting alternative interactions, e.g., for different vertical
256	   market specific control planes, by making the discovery as well as
257	   composition of new interactions more flexible.

259	   We believe that SFC is a suitable framework for the interconnection
260	   of such network functions through the new SBI.  One of the
261	   aforementioned driving forces, namely the replication of functions,
262	   aligns with our thinking in this draft in that indirections to new
263	   vertical instances need to be dynamic in reacting to the appearance
264	   of new virtual instances or to changes in policies for the selection
265	   of specific instances by specific calling entities.

267	2.5.  Use Case Analysis

269	   SFC allows network operators as well as service providers to compose
270	   new services by chaining individual service functions.

272	   In a dynamic network environment, like the edge of a network, the
273	   capability to dynamically compose new services from available
274	   services as well as move a service instance is desirable.  Dynamic
275	   composition and relocation of services may be attributed to:

277	   o  Congestion in the network: Due to constrained network resources,
278	      increase in the network load may create congestion in the network,
279	      resulting in a congested Service Function Path.  Service functions
280	      may detect congestion and reconfigure the Service Function Path to
281	      avoid it.

283	   o  In response to latency: in a dynamic network environment and with
284	      the need for ultra-low latency communication, instantiation of new
285	      service function endpoints might be the only remedy to combat the
286	      increase of latency caused, e.g., by increased load on a previous
287	      endpoint or mobility of the user and therefore increasing the
288	      'distance' to the service function endpoint.  Keeping the service
289	      function endpoint 'close' to the user allows for reducing latency,
290	      segregating communication in localized islands of service
291	      interaction.

293	   o  In response to user mobility: In a dynamic network environment
294	      where service functions move frequently because of user movement,
295	      load balancing or resource modification, service function chains
296	      and the service end points need to be created and recreated
297	      frequently.

299	   o  Resource availability: Availability of compute and storage
300	      resources varies with network load, number and type of
301	      applications running, etc.  At the edge of the network, a sudden
302	      increase of users may increase the compute load.  In this
303	      situation, applications running on the compute resources may be
304	      moved to another location where more resources are available.

306	   In SFC, there is a notion of logical chaining of SFs and chaining of
307	   actual physical locations, known as the Rendered Service Path (RSP).
308	   An RSP provides a static binding of SFs to their physical locations.
309	   In order to create a chain in a dynamic fashion, late binding of SFs
310	   to physical locations may be desired.  SFC is capable of modifying
311	   the service chain to a certain extent in response to network
312	   conditions, but a complete solution has not been described.
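
   The difference between static and late binding can be illustrated
   with a minimal sketch.  It is not part of the SFC architecture or of
   this draft's proposal: the registry layout, the locator names
   (edge-1, edge-2, edge-3), the load values and the "least loaded
   instance" selection rule are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List

# Logical chain: ordered SF names with no locations attached.
CHAIN: List[str] = ["firewall", "video-cache"]

# Hypothetical registry of live instances per SF: locator -> current load (0..1).
instances: Dict[str, Dict[str, float]] = {
    "firewall": {"edge-1": 0.2},
    "video-cache": {"edge-1": 0.9, "edge-2": 0.3},
}


def least_loaded(sf: str) -> str:
    """Pick the locator with the lowest load for one SF (assumed selection rule)."""
    return min(instances[sf], key=instances[sf].get)


def bind_early(chain: List[str]) -> List[str]:
    """Static binding: choose a location for every SF once, up front."""
    return [least_loaded(sf) for sf in chain]


def bind_late(chain: List[str]) -> Iterator[str]:
    """Late binding: choose a location for each SF only when that hop is reached."""
    for sf in chain:
        yield least_loaded(sf)
```

   With early binding the whole path is fixed when the chain is
   rendered.  With late binding, an instance that appears after the
   first hop has been traversed can still be selected for a later hop.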

314	   In order to route the service requests to service end points in a
315	   dynamic manner, we identify the following desirable features in a
316	   service function chain:

318	   o  Capability to trigger service chain reconfiguration based on
319	      network information such as congestion indication, mobility,
320	      degradation of user experience etc.  Service Functions should be
321	      able to process such network information, identify which section
322	      of the chain needs to be reconfigured and take action.

324	   o  Fast switching from one service instance to another by not relying
325	      on the DNS for service location resolution.  Instead of DNS, the
326	      function should be able to identify the path that allows the
327	      service end point to be reached.

329	   o  Direct path mobility, where the path between the requester and the
330	      responding service can be determined as being optimal (e.g.,
331	      shortest path or direct path to a selected instance), is needed to
332	      avoid the use of anchor points and further reduce service-level
333	      latency.

335	   o  Indirect service requests at the network level, transparent to the
336	      requesting client and without the involvement of the DNS.  The
337	      end user is not aware of the decision made by the SF.

339	   o  New methods for forwarding, such as path-based forwarding, direct
340	      path routing in mobility cases, path pinning for traffic steering
341	      and simplified service-specific peering towards the Internet.

343	3.  NSH and Re-classification

345	   [RFC7498] captures the problems associated with existing service
346	   deployments.  The problems are described below at a high level:

349	   o  Network topology: Network service deployment is tightly coupled
350	      with network topology thus reducing the flexibility in service
351	      delivery.  It adds complexity in deploying network service when
352	      certain traffic types may need some service and other traffic
353	      types do not need the same service.

355	   o  Configuration complexity is the direct result of dependency on
356	      network topology.

358	   o  Limited availability of services

360	   o  Altering the order of a deployed chain is complex and cumbersome

362	   o  Coupling of service functions to topology may require service
363	      functions to support many transport encapsulations or for a
364	      transport gateway function to be present.

366	   o  In a dynamic environment such as the edge of a network, service
367	      delivery and routing change fast.  It may be difficult to deliver
368	      service dynamically due to the risk and complexity of VLANs and/or
369	      routing modifications.

371	   These factors provide motivation for a simplified and flexible
372	   service insertion model that addresses many of the current
373	   shortcomings and provides new, much needed functionality to enable
374	   service deployments in modern network environments.  Service chaining
375	   accomplishes this by considering service functions as resources, with
376	   associated attributes, available for scheduled consumption.
377	   Selective traffic, subject to policy, may then be "steered" to the
378	   requisite service resources, along with any "extra" information
379	   referred to as metadata.  This metadata is used for policy
380	   enforcement.

382	   A basic form of service chaining may be realized using existing
383	   transport encapsulations.  This method of chaining relies upon the
384	   tunneling of selected data between service functions.  Although this
385	   form of service chaining achieves some level of abstraction from the
386	   underlying topology, it does not truly create a service plane.  NSH
387	   [RFC8300] provides a distinct, identifiable plane that can be used
388	   across all transports to create a service chain and exchange
389	   metadata along the chain.

391	   Fundamentally, however, the notion of "services" in SFC is tied into
392	   specific service function endpoints, which lie along a well-defined
393	   service function path (SFP) where the path is defined through lower
394	   layer transport encapsulations.  If any such service function
395	   endpoint changes, the service chain needs to be adjusted; a procedure
396	   we outline in the following sub-section.

398	3.1.  Dynamic service chain creation using NSH

400	   We revisit the dynamic service chain creation capability of NSH.  NSH
401	   defines a new service plane protocol [RFC8300].  A Network Service
402	   Header (NSH) contains service path information and optionally
403	   metadata that are added to a packet or frame and used to create a
404	   service plane.  A control plane is required in order to exchange NSH
405	   values with participating nodes, and to provision the same nodes with
406	   requisite information such as service path ID to overlay mapping.

408	   The Network Service Header has three parts: Base Header, Service Path
409	   Header and Context Header.  The Service Path Header is a 4-byte
410	   header that follows the Base Header and defines two fields used to
411	   construct a service path:

413	   o  Service path identifier (SPI)

415	   o  Service index (SI)

417	   The following figure depicts the service path header.

419	      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
420	     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
421	     | Service Path ID                               | Service Index |
422	     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

424	                     Figure 1: NSH Service Path Header

426	   The service path identifier (SPI) is used to identify the service
427	   path that interconnects the needed service functions.  It allows
428	   nodes to utilize the identifier to select the appropriate network
429	   transport protocol and forwarding techniques.  The service index (SI)
430	   identifies the location of a packet within a service path.  As
431	   packets traverse a service path, the SI is decremented post-service.
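
   As an illustration of the field layout in Figure 1, the following is
   a minimal sketch that packs and unpacks the 4-byte Service Path
   Header (24-bit SPI followed by 8-bit SI, network byte order) and
   decrements the SI.  The function names are hypothetical and the
   handling of an SI of zero is a simplification; [RFC8300] remains the
   authoritative description.

```python
import struct
from typing import Tuple

SPI_MAX = (1 << 24) - 1  # SPI is a 24-bit field
SI_MAX = (1 << 8) - 1    # SI is an 8-bit field


def pack_service_path_header(spi: int, si: int) -> bytes:
    """Encode SPI (upper 24 bits) and SI (lower 8 bits) as 4 bytes, network order."""
    if not 0 <= spi <= SPI_MAX:
        raise ValueError("SPI must fit in 24 bits")
    if not 0 <= si <= SI_MAX:
        raise ValueError("SI must fit in 8 bits")
    return struct.pack("!I", (spi << 8) | si)


def unpack_service_path_header(data: bytes) -> Tuple[int, int]:
    """Decode the first 4 bytes of `data` into (SPI, SI)."""
    (word,) = struct.unpack("!I", data[:4])
    return word >> 8, word & 0xFF


def decrement_si(data: bytes) -> bytes:
    """Return a header whose SI is one lower, as done after a service is applied."""
    spi, si = unpack_service_path_header(data)
    if si == 0:
        # Simplification: refuse to go below zero instead of modelling packet drop.
        raise ValueError("SI is already 0")
    return pack_service_path_header(spi, si - 1)
```

   For example, pack_service_path_header(0x123, 255) yields the bytes
   00 01 23 ff, and applying decrement_si to that header leaves the SPI
   unchanged while the SI becomes 254.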

433	   SPI represents the service path and altering the path identifier
434	   results in a change of a service path.  A change in SPI value is a
435	   result of re-classification.  It means a node in the service path
436	   determined, based on policy, that the initial classification was
437	   incorrect or incomplete.  If the updated classification results in
438	   the necessity of a new service path, the node updates the SPI and SI
439	   fields accordingly.  The new identifier is then used to select the
440	   appropriate overlay topology.  This allows service functions to alter
441	   the path of a packet without having to participate in the network
442	   topology and its associated control plane(s).  The method to
443	   determine that an existing classification is incorrect and how to
444	   determine the new classification is not defined.

446	4.  Challenges with dynamic indirection

448	   The emerging trend in today's networks is to deploy network
449	   functions, services and applications at the edge of the network to
450	   support latency requirements, computational offload, traffic
451	   optimization, etc.  As users move, the applications or services they
452	   use may need to be moved closer to the user's new location.  This
453	   implies another instance of the service function may need to be
454	   instantiated close to the user's new location.  It may result in
455	   re-establishing the service path from the newly instantiated service
456	   function to other service instances.  It is also possible that the
457	   newly instantiated service function may be redirected to a new
458	   service end point (e.g., Application Server) for various reasons,
459	   such as incomplete content, proximity to data store, load balancing,
460	   etc.  In another scenario, a single instance of the service function
461	   may not handle all users due to latency or load constraints.  A
462	   single service function may be instantiated more than once to balance
463	   user load.  As the number of instances increases, along with
464	   mobility, the complexity of service routing increases.  It is
465	   anticipated that function chaining and re-chaining may occur
466	   constantly in the network.

468	   The challenge of dynamic indirection may be better described by
469	   analyzing the working of CDNs, which dynamically (re-)direct user-
470	   initiated requests towards the most appropriate content instance.
471	   This task becomes more difficult if granularity of the instance
472	   placement increases.  For instance, in case of a CDN being realized
473	   close to end users, specifically at the edge of the network, the
474	   content instance might need to be selected dynamically.  After
475	   initial selection, the instance may change during service execution.

477	   In a conventional network, an instance of a service is found and
478	   selected using DNS.  The subsequent service request is then routed
479	   through the network between the client and the service.  If the user
480	   performs a DNS lookup to access content served by a CDN, the DNS
481	   service maintains a list of IP addresses that can be returned for a
482	   given domain name and tries to return the IP address of a node
483	   geographically close to the client.  Should the service provider
484	   want to replace an instance of their service with another one at a
485	   different IP address (and potentially a different physical location)
486	   for reasons such as load balancing or reliability, the DNS tables
487	   must be updated, i.e., the service needs to be (re-)registered
488	   quickly.  This is done by updating the local authoritative DNS
489	   server, which then propagates the new mapping to DNS services across
490	   the world.  DNS propagation can take up to 48 hours, so fast and
491	   dynamic switching from one service instance to another is not
492	   possible in conventional networks; even in more localized scenarios,
493	   the propagation of DNS updates might still be too slow.  When many
494	   surrogate service endpoints exist in the edge network, certain
495	   resources may be unavailable at one surrogate instance while
496	   available at another, so that changes in redirection might be
497	   desirable; changes in local load also drive the need for such
498	   changes in redirection.  With the emergence of container-based
499	   virtualization platforms, service function endpoints can be
500	   established in a matter of seconds.  We therefore believe that the
501	   'reachability' of such a service instance, i.e., the possibility of
502	   routing service requests to it from a client that was previously
503	   served elsewhere, must follow a similar timeline, i.e., a few seconds
504	   or even less.
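
   One way to see why DNS-based switching lags behind instance changes
   is a toy model of a caching resolver.  This sketch is an assumption
   made for illustration, not a description of any real resolver: the
   class, the name svc.example, the documentation addresses and the TTL
   of 300 seconds are all hypothetical, and time is passed in
   explicitly instead of being read from a clock.

```python
class CachingResolver:
    """Toy resolver cache: an answer stays valid until its TTL expires."""

    def __init__(self, authoritative):
        self.authoritative = authoritative  # name -> (address, ttl_seconds)
        self.cache = {}                     # name -> (address, expires_at)

    def resolve(self, name, now):
        hit = self.cache.get(name)
        if hit is not None and now < hit[1]:
            return hit[0]                   # cached answer, possibly stale
        address, ttl = self.authoritative[name]
        self.cache[name] = (address, now + ttl)
        return address
```

   If the service is re-registered at a new address shortly after a
   client has resolved the name, the client keeps receiving the old
   address until the cached entry expires, however quickly the new
   instance itself was started.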

507	   The other issue in conventional networks lies with mobility
508	   management procedures.  These procedures use an anchor point, which
509	   terminates a session at the network edge.  As the user moves around,
510	   traffic is redirected from the anchor point to the new point of
511	   attachment.  Relying on typical mobility management approaches found
512	   in IP networks usually leads to inefficient 'triangular' routing of
513	   requests through this common 'anchor' point.  This triangular routing
514	   increases the latency in reaching the new service function or service
515	   end points as users move.
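
   The cost of triangular routing can be shown with simple arithmetic.
   The link latencies below are invented for illustration only and are
   not measurements; the variable names are hypothetical.

```python
def path_latency_ms(links):
    """Sum the one-way latencies (in ms) of the links along a path."""
    return sum(links)


# Assumed one-way link latencies in milliseconds (illustrative values).
new_attachment_to_anchor = 12.0
anchor_to_service = 8.0
new_attachment_to_service = 3.0

# Anchor-based mobility: every request detours through the anchor point.
via_anchor = path_latency_ms([new_attachment_to_anchor, anchor_to_service])

# Direct path from the new point of attachment to a nearby service instance.
direct = path_latency_ms([new_attachment_to_service])
```

   Under these assumed values the anchored path costs 20 ms against
   3 ms for the direct path.  The gap grows as the user moves further
   from the anchor point.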

517	   Traffic steering is a common procedure in managed networks,
518	   particularly at the edge, due to desired subscriber-centric traffic
519	   policies (e.g., related to pricing structures), resource requirements
520	   (e.g., related to using particular paths in the network) or mobility
521	   (e.g., users moving in a cellular network).  Today's methods for
522	   traffic steering include anchor-based mobility management as well as
523	   traffic classification, for instance, in packet gateways of cellular
524	   systems (using, e.g., deep packet inspection as well as port and
525	   address classification).  While the former leads to inefficient
526	   'triangular' traffic forwarding, the latter often requires additional
527	   state in the forwarders to differentiate traffic from one user to
528	   another.

530	   The analysis of CDNs shows that dynamic indirection is a necessary
531	   requirement, which needs to be supported by the networks.  The goal
532	   of this indirection is to provide user applications with the lowest
533	   possible latency.  But as discussed above, relying on today's
534	   techniques does not help in guaranteeing the same latency to user
535	   applications.  On the contrary, there is a high possibility that
536	   latency may increase if we rely on Layer 3 based service redirection
537	   techniques.

539	   SFC handles indirection through the use of the SPI.  A packet needs
540	   to be reclassified, and the intermediate node changes the SPI.  The
541	   following are the typical steps taken to implement the
542	   indirection:

544	   o  A packet arrives at a particular node.

546	   o  The node contacts the policy manager.

548	   o  The node identifies that the current classification is incorrect.

550	   o  The node reclassifies the packet, i.e., changes the SPI.

552	   o  The node inserts the packet in the pipe, possibly towards the SFF.
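
The reclassification steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the SFC specifications: the Packet class, the dictionary standing in for the policy manager, and the choice to restart the Service Index at 255 after a path change are all hypothetical.

```python
from dataclasses import dataclass

INITIAL_SI = 255  # assumed starting Service Index for a freshly selected path


@dataclass
class Packet:
    flow: str  # hypothetical flow key used for the policy lookup
    spi: int   # Service Path Identifier currently carried by the packet
    si: int    # Service Index currently carried by the packet


def reclassify(packet: Packet, policy: dict) -> Packet:
    """Apply the reclassification steps to one packet arriving at a node."""
    # Contact the policy manager (here: a plain dict of flow -> wanted SPI).
    wanted_spi = policy.get(packet.flow, packet.spi)
    # Identify whether the current classification is incorrect.
    if wanted_spi != packet.spi:
        # Reclassify: change the SPI and restart the Service Index.
        packet.spi = wanted_spi
        packet.si = INITIAL_SI
    # The caller would now hand the packet back towards the SFF.
    return packet


moved = reclassify(Packet("flow-a", spi=10, si=250), {"flow-a": 20})
kept = reclassify(Packet("flow-b", spi=10, si=250), {"flow-a": 20})
```

Every packet passes through the policy lookup, even when no change is needed. That per-packet cost is what the alternative proposed in this document aims to avoid.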

554	   The indirection mechanism in SFC involves certain steps to process
555	   policy information and change the SPI in the packet header, making it
556	   suitable to handle dynamic indirection requirements.  The SF proposed
557	   in this document provides an additional method to handle dynamic
558	   indirection of service requests that does not rely on the
559	   reclassification mechanism.  Combining the two techniques may provide
560	   flexibility and an improvement over either method alone.

562	5.  HTTP as a transport

564	   With the extensive use of "web technology" and "distributed
565	   services" and the availability of heterogeneous networks, HTTP has
566	   effectively transitioned into the common transport for name-based E2E
567	   communication across the web.  In the context of SFC and SF, HTTP
568	   requests and responses are considered as the "Service Request (SR)".
569	   This use case describes how these SRs are directed towards the
570	   correct SF in a fast and dynamic way.  The routing and indirection of
571	   SRs are abstracted at the HTTP level, instead of the traditional
572	   approach where the routing decision for a service request is made at
573	   Layer 3.

574	   If we abstract HTTP as a transport, HTTP requests such as GET, PUT
575	   and POST can be routed based on the URI associated with the request,
576	   with the URI being simply the name of a resource or the invocation
577	   point for a service transaction.  Based on the name of the resource
578	   requested, the HTTP request can be routed to a suitable service
579	   endpoint.  If Service Functions (SFs) are identified by a URI or
580	   name, HTTP requests to an SF can be routed or directed using
581	   name-based routing.  With that, the redirection to the most suitable
582	   service instance is done purely on the basis of named services, with
583	   HTTP being a specific (application layer) transport service.
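
As a rough illustration of name-based routing of HTTP requests, the sketch below resolves the name in a request URI to one of several registered instances. The registry contents, the concrete addresses, the cost values and the "lowest cost wins" rule are assumptions made for the example; the draft does not define how the most suitable instance is chosen.

```python
from urllib.parse import urlsplit

# Hypothetical registry: service name -> list of (instance address, cost).
REGISTRY = {
    "www.example.com": [("192.168.1.10", 12.0), ("192.168.2.10", 4.0)],
    "www.example2.com": [("192.168.3.10", 7.0)],
}


def route_request(url: str, registry: dict = REGISTRY) -> str:
    """Pick a service instance for an HTTP request based on its name."""
    # The routing key is the name in the URI, not an IP address.
    name = urlsplit(url).hostname
    instances = registry.get(name)
    if not instances:
        raise LookupError(f"no service instance registered for {name!r}")
    # Assumed policy: choose the instance with the lowest cost.
    address, _cost = min(instances, key=lambda inst: inst[1])
    return address


chosen = route_request("http://www.example.com/video/segment1.mp4")
```

Because the lookup is keyed on the name, adding or removing an instance only changes the registry entry. The client keeps using the same URI.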

585	   Ongoing EU H2020 efforts such as FLAME [H2020FLAME] are driven by
586	   city-scale, many-POP deployments of compute infrastructure, all SDN-
587	   connected and OpenStack-managed.  Localized media use cases drive the
588	   need for chaining name-based service instances (with HTTP as the main
589	   transport protocol here), with the relationship between specific
590	   virtual instances being controlled at the underlying
591	   routing/switching level.

593	   The notion of 'HTTP as a transport', utilizing URLs as the addressing
594	   scheme, can be used to create an SFP as shown in Figure 2, i.e.,
595	   192.168.x.x -> www.example.com -> 192.168.x.x -> www.example2.com ->
596	   192.168.x.x -> ... -> www.exampleN.com.  It is this 'name-based'
597	   relationship that we see possibly realized through specific
598	   replicated instances, where in turn the routing towards those
599	   specific instances is realized by the SRR.

601	                                                         +--------+
602	                                                         |        |
603	            |-------------------------|------------------+  SRR   +
604	            |                         |                  |        |
605	            |                         |                  +---/|\--+
606	            |                         |                       |
607	       +---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
608	       |        |   |         |   |       |   |      |   |        |
609	       + Client +-->+  SRR    +-->+ Media +-->+ SRR  +-->+ Media  +
610	       |        |   |         |   |  Fn1  |   |      |   |  Fn2   |
611	       +--------+   +---------+   +-------+   +------+   +--------+

613	       SFP:192.168.x.x-->www.example.com-->192.168.x.x
614	       -->www.example2.com-->192.168.x.x-->www.exampleN.com

616	            Figure 2: SFP with new HTTP-based Transport option

618	   In a pure SFC architectural framework, the Classifier function may
619	   interact with the SRR to obtain an SE (Service Encapsulation).  For
620	   example, the Classifier function may look into the network locator
621	   map in Figure 2 and determine that the next SF is www.example.com.
622	   It provides this information to the SRR to obtain the next hop
623	   information.  The SRR returns the SE for the next hop, which can be
624	   "bitfield" information that is used in the overlay routing for this
625	   part of the SFP.  The Classifier function uses this SE to route the
626	   incoming packet directly at the transport network level.

628	6.  Service Request Routing (SRR) Service Function

630	6.1.  Overview

632	   The following diagram shows the application of the newly proposed SRR
633	   service function in an example of media clients connecting to media
634	   servers.  There may be more than one media function, to support a
635	   CDN-like architecture, as well as surrogate servers to handle
636	   mobility and load balancing.

638	                                                         +--------+
639	                                                         |        |
640	            |-------------------------|------------------+  SRR   +
641	            |                         |                  |        |
642	            |                         |                  +---/|\--+
643	            |                         |                       |
644	       +---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
645	       |        |   |         |   |       |   |      |   |        |
646	       + Client +-->+  IP     +-->+ Media +-->+ SRR  +-->+ Media  +
647	       |        |   | Routing |   |  Fn1  |   |      |   |  Fn2   |
648	       +--------+   +---------+   +-------+   +------+   +--------+

650	    Figure 3: General SFC with SRR Flexible Chaining, initiated via IP
651	                         Routed Client Connection

653	   The clients are connected to media functions through a frontend
654	   routed network, e.g., relying on standard IP routing, while media
655	   functions are chained via the newly proposed service request routing
656	   (SRR) function.  Alternatively, we also envision utilizing the SRR
657	   function directly between the client SF and the media function SF, as
658	   outlined in the figure below.

659	                                                         +--------+
660	                                                         |        |
661	            |-------------------------|------------------+  SRR   +
662	            |                         |                  |        |
663	            |                         |                  +---/|\--+
664	            |                         |                       |
665	       +---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
666	       |        |   |         |   |       |   |      |   |        |
667	       + Client +-->+  SRR    +-->+ Media +-->+ SRR  +-->+ Media  +
668	       |        |   |         |   |  Fn1  |   |      |   |  Fn2   |
669	       +--------+   +---------+   +-------+   +------+   +--------+

671	    Figure 4: General SFC with SRR Flexible Chaining, initiated via SRR
672	                              Chained Client

674	   For our considerations, we assume that each SF is realized by one or
675	   more service function endpoints (SFEs).  Hence, instead
676	   of looking at "chaining" as a concept that connects specific SFEs
677	   along a well-defined SFP, we propose to look at "chaining" at the
678	   level of "named" service functions rather than their specific
679	   endpoint instances.  With this in mind, the SRR service function
680	   lifts the relationship between the connecting SFs to the level of
681	   "logical" service functions rather than their specific realizing
682	   endpoints.  Instead of relying on dynamic re-chaining in case of any
683	   dynamically changing relationship between specific SFEs, the SRR
684	   provides the selection of suitable SFEs while maintaining the logical
685	   relationship between the SFs.  In Section 6.3, we will present the
686	   necessary extensions to the SFP concept to support this higher
687	   abstraction of "chaining" via "named" logical SFs.

697	   The SRR introduces flexibility in routing service requests from a
698	   client to specific SFEs in response to conditions such as congestion
699	   in the network, user mobility, etc.  In the edge network, where users
700	   are moving and service end points may also change, having the
701	   flexibility to decide and steer service requests directly helps in
702	   guaranteeing the same latency to user applications.  The edge of the
703	   network may be congested due to limited network resources.  The SRR
704	   may be able to detect network congestion and quickly route service
705	   requests to another service end point that is not experiencing
706	   congestion.  In addition, application-layer control functions might
707	   utilize latency measurements to ensure that suitable service
708	   instances are created at runtime, so that service function endpoints
709	   are available 'nearby' (possibly moving) users and latency is kept
710	   under a desired value.

712	   Clearly, that is achieved by reducing the switching time from one SF
713	   endpoint to another.  As the service end point changes, the routing
714	   function makes an instantaneous decision to route the request to the
715	   appropriate media server.

717	   The possible improvements of using SRR within an SFC framework are
718	   listed below:

720	   o  Fast (between 10 and 20 ms) switching times from one service
721	      instance to another by not relying on the DNS for service
722	      discovery and directly routing service requests at the level of
723	      the transport network.

725	   o  The capability to indirect service requests at the network level
726	      will help in reducing latency when service end points change.
727	      For example, when a service request is sent to one surrogate
728	      instance but results in an HTTP 404 or 5xx error response, the
729	      original request is redirected to an alternative surrogate
730	      with minimal latency, i.e., right at the destination of said
731	      failed service request.  Nesting these operations effectively
732	      leads to a net-level 'search' among all available surrogate
733	      instances until the search is exhausted (with a negative result)
734	      or the resource is found.

736	   o  New methods for forwarding, such as path-based forwarding, will
737	      enable direct path routing in mobility cases, path pinning for
738	      traffic steering and simplified service-specific peering towards
739	      the Internet.  Such a capability would allow for localizing
740	      traffic and reducing latency and costs.
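
The net-level 'search' among surrogate instances described in the second item above can be sketched as a loop over candidate surrogates. The function names, the tuple returned by the send callback, and the stub used to simulate responses are hypothetical; a real SRR would perform this redirection in the network rather than in application code.

```python
def is_failure(status: int) -> bool:
    """Treat HTTP 404 and any 5xx status as a reason to try elsewhere."""
    return status == 404 or 500 <= status <= 599


def search_surrogates(request: str, surrogates: list, send):
    """Redirect a request from surrogate to surrogate until one succeeds.

    `send(surrogate, request)` is an assumed callback returning
    (status, body).  Returns (surrogate, status, body), or
    (None, 404, None) when the search is exhausted.
    """
    for surrogate in surrogates:
        status, body = send(surrogate, request)
        if not is_failure(status):
            return surrogate, status, body
    return None, 404, None


# Stub standing in for the network: only "surrogate-c" holds the resource.
def fake_send(surrogate: str, request: str):
    responses = {"surrogate-a": (404, None), "surrogate-b": (503, None)}
    return responses.get(surrogate, (200, f"payload for {request}"))


found = search_surrogates(
    "/video/seg1", ["surrogate-a", "surrogate-b", "surrogate-c"], fake_send
)
missing = search_surrogates("/video/seg1", ["surrogate-a"], fake_send)
```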

742	6.2.  Details of SRR Function

744	   Assuming the introduction of such an HTTP-level transport notion, the
745	   SRR function can be decomposed further as shown in Figure 5.

747	                                                         +--------+
748	                                                         |        |
749	            |-------------------------|------------------+  SRR   +
750	            |                         |                  |        |
751	            |                         |                  +---/|\--+
752	            |                         |                       |
753	       +---\|/--+   +---------+   +--\|/--+   +------+   +----+----+
754	       |        |   |         |   |       |   |      |   |         |
755	       + Client +-->+  SRR    +-->+Service+-->+ SRR  +-->+ Service +
756	       |        |   |         |   |  Fn1  |   |      |   |  Fn2    |
757	       +--------+   +---------+   +-------+   +------+   +---------+
758	                  /             \
759	                /                 \
760	              /                     \
761	        +--------------------------------------+
762	        |   +------------------+               |
763	        |   |  +-----+  +----+ |     +-----+   |
764	        |--->  | SFC |  | SR | |     | SR  |----->
765	        |   |  |Proxy|  |    | |     |     |   |
766	        |   |  +-----+  +----+ |     +-/|\-+   |
767	        |   |  Use Proxy if NAP|        |      |
768	        |   |  is not SFC      |        |      |
769	        |   |  enabled         |        |      |
770	        |   +-------/|\--------+        |      |
771	        |            |                  |      |
772	        |            |                  |      |
773	        |            |  +----------+    |      |
774	        |            |->| tSFF1    |-----      |
775	        |               +---/|\----+           |
776	        |                    |                 |
777	        |                    |                 |
778	        |     +----------+   |                 |
779	        |     |          |   |                 |
780	        |     +   PCE    +----    +-----+      |
781	        |     |          |--------| RT  |      |
782	        |     +----------+        +-----+      |
783	        |                                      |
784	        +--------------------------------------+

786	                        Figure 5: SRR decomposition

788	   Another option is for the two functions that route via the SRR to
789	   communicate with it entirely link-locally, i.e., there is another
790	   simple tSFF2 between the client and the SRR, as well as between SF1
791	   and the SRR, that is simply a link-local transport.  The following
792	   figure describes this alternative option.

793	                                                         +--------+
794	                                                         |        |
795	            |-------------------------|------------------+  SRR   +
796	            |                         |                  |        |
797	            |                         |                  +---/|\--+
798	            |                         |                       |
799	       +---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
800	       |        |   |         |   |       |   |      |   |        |
801	       + Client +-->+  SRR    +-->+Service+-->+ SRR  +-->+Service +
802	       |        |   |         |   |  Fn1  |   |      |   |  Fn2   |
803	       +--------+   +---------+   +-------+   +------+   +--------+
804	                   /              \
805	                  /                  \
806	                 /                      \
807	       +-----+    +---------------------------------+
808	       |tSFF2|--------->+----+           +-----+    | +--------+
809	       +-----+    |     | SR |           | SR  |----->| tSFF2  |-->
810	                  |     |    |           |     |    | +--------+
811	                  |     +----+           +-/|\-+    |
812	                  |       |                 |       |
813	                  |       |                 |       |
814	                  |       |                 |       |
815	                  |       |                 |       |
816	                  |       |     +-------+   |       |
817	                  |       |---->| tSFF1 |---        |
818	                  |             +--/|\--+           |
819	                  |                 |               |
820	                  |                 |               |
821	                  |      +-------+  |               |
822	                  |      |       |  |               |
823	                  |      + PCE   +---     +----+    |
824	                  |      |       |--------| RT |    |
825	                  |      +-------+        +----+    |
826	                  |                                 |
827	                  +---------------------------------+

829	       Figure 6: SRR decomposition using link-local client/function
830	                               communication

832	   The SRR function may be composed of the following functions:

834	   o  Service Router (SR) at the ingress: terminates the Layer 3 and
835	      above protocols, such as TCP, on the client side.

837	   o  Service Router (SR) at the egress: terminates any transport
838	      protocol on the outgoing (server) side.

840	   o  PCE: The Path Computation Element function is responsible for
841	      selecting the correct next SF, possibly also realizing path policy
842	      enforcement.  The result of the selection is a path identifier,
843	      which is delivered to the ingress SR upon the initial path
844	      computation request (i.e., when sending a request to a specific
845	      URL on the SFP for the first time).  The path identifier is used
846	      for any future request to the given URL-based SF.  When another SF
847	      instance becomes available, which is indicated to the PCE through
848	      a registration procedure, the PCE instructs all ingress SRs to
849	      invalidate their path identifiers for the specific URL of the SF.
850	      This results in a new initial path computation request the next
851	      time a request is forwarded to that SF.  Through this, the newly
852	      registered SF instance may be utilized if the policy-governed path
853	      computation selects it.

855	   o  Reclassification Trigger Handler (RT): Network measurement
856	      information, such as latency, packet loss or network congestion,
857	      could be processed by the handler.  This may trigger
858	      reconfiguration of the specific service function endpoint chain
859	      over which the SFC is being executed.  The handler forwards the
860	      information about the chain reconfiguration to the PCE.

862	   o  Transport-derived SFF (tSFF1): The communication between ingress
863	      and egress SRs, as well as between SRs and the PCE, is realized
864	      via a transport-derived SFF.  We outline three possible tSFFs:

866	      *  SDN-based: This option utilizes path-based forwarding through
867	         SDN-based wildcard matching fields, supported with OF1.2+
868	         [Reed2016].  It can be embedded into a slicing approach of the
869	         underlying transport infrastructure by leaving typical slicing
870	         fields available (e.g., VLAN tags).  The forwarding utilizes
871	         the Ethernet frame format at Layer 2, representing the
872	         topological links of a specific forwarding path in the
873	         transport network as unique bits in a fixed-size bit array.
874	         For the latter, the approach utilizes the IPv6 source and
875	         destination fields for storing the bit array information (in a
876	         simple version of this forwarding, this limits the topology to
877	         256 links, but extension schemes are possible, which are left
878	         out of this document at this stage).  As mentioned, the SDN
879	         forwarding decision action is a simple wildcard match,
880	         supported with OF1.2+, with the wildcard representing the
881	         unique bit of a switch-specific output port.  With that, the
882	         switch needs only as many forwarding rules as it has local
883	         output ports; see [Reed2016] for more information.  Fig.
884	         xx illustrates this forwarding solution, including the ability
885	         to create ad-hoc multicast relations by simply ORing individual
886	         bitarrays representing unicast paths.

888	      *  Another approach is outlined in [I-D.ietf-bier-use-cases] where
889	         the SFF is suggested to be realized via a BIER overlay, in turn
890	         realized over a BIER-compliant underlay, such as MPLS.  BIER
891	         utilizes a similar bit array approach for representing a
892	         forwarding path in the overlay network but unlike [Reed2016],
893	         the bit fields indicate the egress BIER-compliant router that
894	         the packet is supposed to reach.

896	      *  As yet another alternative, the tSFF may utilize a flow
897	         aggregation approach, outlined in [Khalili2016], called edge
898	         switch classification (ESC).  In this approach, a path from an
899	         ingress to egress SR is described as a so-called edge
900	         classification vector (ECV), which combines information on the
901	         aggregated flow (following [Khalili2016]) and the switch-local
902	         endpoint.  The representation has similar bitarray
903	         characteristics to the previous two approaches.

905	   o  NOTE: with the ingress and egress SRs terminating SF Layer 3
906	      connections and the utilization of bitarray-based tSFFs, the
907	      transmission of packets can effectively take place as an ad-hoc
908	      Layer multicast while the SFC itself is denoted as an n-times
909	      unicast SFC.  As an example, consider the chaining of a set of n
910	      clients to a single video server.  Each sub-SFC from an individual
911	      client to the video server will semantically result in a unicast
912	      response from the server back to the client (e.g., carrying the
913	      video chunk for a MPEG DASH-based video stream).  When combining
914	      the sub-SFCs to the single SFC with n times unicast relations to
915	      the server, the SRR will deliver the responses from the server via
916	      one or more multicast responses to one or more clients.  The size
917	      of the individual multicast groups will depend on the
918	      synchronicity of the client requests (and therefore on the
919	      synchronicity of the server responses).  Note that the multicast
920	      relations here are ad-hoc created by ORing the bitarrays
921	      representing the specific clients to which the responses are meant
922	      to be sent.  This is illustrated in the figure below.  The HTTP
923	      multicast use case is presented in the BIER use case draft
924	      [I-D.ietf-bier-use-cases], albeit without a specific SFC relation.

926	          +---------+   +---------+
927	          |         |   |         |                  +--------+
928	          +IP only  +---+ ICN     +         00000010 | ICN    |
929	          |receiver |   | SR1     |         |--------| SR3    |
930	          |UE       |   +----|----+         |        +---||---+
931	          +---------+        | 10010011     |            ||
932	                       +-----|----+   +----------+ |-----||-----|
933	                       |          |   |          |  |   Cloud  |
934	                       |SDN Switch|---|SDN Switch|   |        |
935	                       |          |   |          |    |--||--|
936	                       +----|-----+   +----------+       ||
937	                            | 10100011                   ||
938	          +---------+   +---|-----+                 +----||----+
939	          |         |   |         |                 |          |
940	          +IP only  +---+ ICN     +                 + IP only  +
941	          |sender UE|   | SR2     |                 | Server   |
942	          +---------+   +---------+                 +----------+

944	       Figure 7: Illustration of Bitfield-based Forwarding using SDN
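
The bit array forwarding described for the SDN-based tSFF, including the ad-hoc multicast created by ORing unicast paths, can be sketched as follows. The link numbering, the port-to-link table and the way the 256 bits are split across the two IPv6 address fields are assumptions made for illustration; see [Reed2016] for the actual scheme.

```python
BITS = 256  # one bit per link: 128-bit IPv6 source + 128-bit destination


def link_bit(link_id: int) -> int:
    """Return the bit array with only the bit of the given link set."""
    if not 0 <= link_id < BITS:
        raise ValueError("the simple scheme supports link ids 0..255")
    return 1 << link_id


def path_id(link_ids) -> int:
    """Encode a forwarding path as the OR of the bits of its links."""
    path = 0
    for link_id in link_ids:
        path |= link_bit(link_id)
    return path


def output_ports(path: int, port_links: dict) -> list:
    """Per-switch decision: one rule per local port, matching one bit."""
    return sorted(
        port for port, link_id in port_links.items() if path & link_bit(link_id)
    )


def to_ipv6_fields(path: int):
    """Split the bit array into two 16-byte fields (assumed layout)."""
    raw = path.to_bytes(BITS // 8, "big")
    return raw[:16], raw[16:]


# A switch whose ports 1, 2 and 3 attach to links 4, 7 and 9 (hypothetical).
switch_ports = {1: 4, 2: 7, 3: 9}
to_client_a = path_id([1, 4])          # unicast path towards client A
to_client_b = path_id([1, 7])          # unicast path towards client B
multicast = to_client_a | to_client_b  # ad-hoc multicast to both clients

unicast_out = output_ports(to_client_a, switch_ports)
multicast_out = output_ports(multicast, switch_ports)
```

The switch holds no per-flow or per-group state. The number of rules equals the number of local output ports, regardless of how many paths or multicast groups cross the switch.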

946	7.  Protocol Consideration

948	   For the operations outlined in the previous section, we foresee that
949	   the following protocol changes are required:

951	   o  SR-to-SR protocol for HTTP: HTTP-based message exchange between
952	      client and server SRs.

954	   o  SR-PCE protocol: Used for path computation and for obtaining
955	      routing information, as well as for providing path updates.

957	   o  Registration protocol: Used to register FQDN service endpoints.
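
To make the interplay of the SR-PCE and registration protocols more concrete, the sketch below models the behaviour described for the PCE in Section 6.2: an ingress SR caches the path identifier per FQDN, and a new registration causes the PCE to invalidate that cache. The class and method names, the string used as a path identifier and the lowest-cost selection policy are all hypothetical; no message formats are implied.

```python
class PCE:
    """Toy path computation element keeping registered SF instances."""

    def __init__(self):
        self.instances = {}    # FQDN -> list of (instance name, cost)
        self.ingress_srs = []  # ingress SRs to notify on changes

    def attach(self, sr):
        self.ingress_srs.append(sr)

    def register(self, fqdn, instance, cost):
        """Registration protocol: a new instance becomes available."""
        self.instances.setdefault(fqdn, []).append((instance, cost))
        # Instruct all ingress SRs to invalidate paths for this FQDN.
        for sr in self.ingress_srs:
            sr.invalidate(fqdn)

    def compute_path(self, fqdn):
        """SR-PCE protocol: return a path identifier (placeholder string)."""
        instance, _cost = min(self.instances[fqdn], key=lambda e: e[1])
        return f"path-to-{instance}"


class IngressSR:
    """Toy ingress service router caching path identifiers per FQDN."""

    def __init__(self, pce):
        self.pce = pce
        self.cache = {}
        self.pce_requests = 0
        pce.attach(self)

    def invalidate(self, fqdn):
        self.cache.pop(fqdn, None)

    def path_for(self, fqdn):
        if fqdn not in self.cache:
            # Initial path computation request for this FQDN.
            self.pce_requests += 1
            self.cache[fqdn] = self.pce.compute_path(fqdn)
        return self.cache[fqdn]


pce = PCE()
sr = IngressSR(pce)
pce.register("www.example.com", "sfe-a", cost=10)
first = sr.path_for("www.example.com")
again = sr.path_for("www.example.com")   # served from the cache
pce.register("www.example.com", "sfe-b", cost=3)
after = sr.path_for("www.example.com")   # recomputed after invalidation
```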

959	8.  Next Steps

961	   We seek feedback from the SFC WG on the validity of this solution and
962	   its scope within the SFC WG.  If such an alternative to
963	   reclassification for service indirection is seen as beneficial and as
964	   fitting the charter of the WG, the next step would be to update
965	   the draft to outline potential protocol solutions required for the
966	   realization of such an SRR SF.

968	9.  IANA Considerations

970	   This document requests no IANA actions.

972	10.  Security Considerations

974	   TBD.

976	11.  Informative References

978	   [ETSI_MEC]
979	              ETSI, "Mobile Edge Computing (MEC), Technical
980	              Requirements", GS MEC 002 1.1.1, March 2016,
981	              <http://www.etsi.org/deliver/etsi_gs/
982	              MEC/001_099/002/01.01.01_60/gs_MEC002v010101p.pdf>.

984	   [H2020FLAME]
985	              EU, "EU H2020 FLAME PROJECT", March 2016,
986	              <https://www.ict-flame.eu/>.

988	   [I-D.ietf-bier-use-cases]
989	              Kumar, N., Asati, R., Chen, M., Xu, X., Dolganow, A.,
990	              Przygienda, T., Gulko, A., Robinson, D., Arya, V., and C.
991	              Bestler, "BIER Use Cases", draft-ietf-bier-use-cases-06
992	              (work in progress), January 2018.

994	   [I-D.ietf-sfc-dc-use-cases]
995	              Kumar, S., Tufail, M., Majee, S., Captari, C., and S.
996	              Homma, "Service Function Chaining Use Cases In Data
997	              Centers", draft-ietf-sfc-dc-use-cases-06 (work in
998	              progress), February 2017.

1000	   [Khalili2016]
1001	              Khalili, R., Poe, W., Despotovic, Z., and A. Hecker,
1002	              "Reducing State of SDN Switches in Mobile Core Networks by
1003	              Flow Rule Aggregation", ICCCN, August, 2016.

1005	   [Reed2016]
1006	              Reed, M., Al-Naday, M., Thomas, N., Trossen, D., and S.
1007	              Spirou, "Stateless Multicast Switching in Software Defined
1008	              Networks", ICC 2016, 2016.

1010	   [RFC7498]  Quinn, P., Ed. and T. Nadeau, Ed., "Problem Statement for
1011	              Service Function Chaining", RFC 7498,
1012	              DOI 10.17487/RFC7498, April 2015,
1013	              <https://www.rfc-editor.org/info/rfc7498>.

1015	   [RFC8300]  Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed.,
1016	              "Network Service Header (NSH)", RFC 8300,
1017	              DOI 10.17487/RFC8300, January 2018,
1018	              <https://www.rfc-editor.org/info/rfc8300>.

1020	   [UKNIC]    UK NIC, "5G Infrastructure Requirements in the UK", Final
1021	              Report 3.0, December 2016,
1022	              <https://www.gov.uk/government/uploads/system/uploads/
1023	              attachment_data/
1024	              file/577940/5G_Infrastructure_requirements_for_the_UK_-
1025	              _LS_Telcom_report_for_the_NIC.pdf>.

1027	Authors' Addresses

1029	   Debashish Purkayastha
1030	   InterDigital Communications, LLC
1031	   Conshohocken
1032	   USA

1034	   Email: Debashish.Purkayastha@InterDigital.com

1036	   Akbar Rahman
1037	   InterDigital Communications, LLC
1038	   Montreal
1039	   Canada

1041	   Email: Akbar.Rahman@InterDigital.com

1043	   Dirk Trossen
1044	   InterDigital Communications, LLC
1045	   64 Great Eastern Street, 1st Floor
1046	   London  EC2A 3QR
1047	   United Kingdom

1049	   Email: Dirk.Trossen@InterDigital.com
1050	   URI:   http://www.InterDigital.com/

1052	   Zoran Despotovic
1053	   Huawei

1055	   Email: Zoran.Despotovic@huawei.com
1056	   URI:   http://www.huawei.com/

1058	   Ramin Khalili
1059	   Huawei

1061	   Email: Ramin.khalili@huawei.com
1062	   URI:   http://www.huawei.com/