Network Working Group                                     D. Purkayastha
Internet-Draft                                                 A. Rahman
Intended status: Informational                                D. Trossen
Expires: September 2, 2018              InterDigital Communications, LLC
                                                           Z. Despotovic
                                                              R. Khalili
                                                                  Huawei
                                                           March 1, 2018

    Alternative Handling of Dynamic Chaining and Service Indirection
              draft-purkayastha-sfc-service-indirection-02

Abstract

   Many stringent requirements are imposed on today's network, such as
   low latency, high availability and reliability in order to support
   several use cases such as IoT, Gaming, Content distribution, Robotics
   etc.
   Networks need to be flexible and dynamic in terms of allocation of
   services and resources.  Network Operators should be able to
   reconfigure the composition of a service and steer users towards new
   service end points as users move or resource availability changes.
   SFC allows network operators to easily create and reconfigure service
   function chains dynamically in response to changing network
   requirements.  We discuss a use case where a Service Function Chain
   can adapt or self-organize as demanded by the network condition
   without requiring SPI re-classification.  This can be achieved, for
   example, by decoupling the service consumer and service endpoint
   through a new service function proposed in this draft.  We describe a
   few requirements for this service function to enable dynamic
   switching between consumer and end point.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 2, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Use Case Description
     2.1.  Data Center
     2.2.  Third party cloud service provider
     2.3.  ETSI MEC USE CASE
     2.4.  3GPP
     2.5.  Use Case Analysis
   3.  NSH and Re-classification
     3.1.  Dynamic service chain creation using NSH
   4.  Challenges with dynamic indirection
   5.  HTTP as a transport
   6.  Service Request Routing (SRR) Service Function
     6.1.  Overview
     6.2.  Details of SRR Function
   7.  Protocol Consideration
   8.  Next Steps
   9.  IANA Considerations
   10. Security Considerations
   11. Informative References
   Authors' Addresses

1.  Introduction

   The requirements on today's networks are very diverse, enabling
   multiple use cases such as IoT, Content Distribution, Gaming, and
   network functions such as Cloud RAN.
   Every use case imposes certain requirements on the network.  These
   requirements vary from one extreme to the other and often pull in
   divergent directions.  Network operators and service providers are
   pushing many functions towards the edge of the network in order to be
   closer to the users.  This reduces latency and backhaul traffic, as
   user requests can be processed locally.

   It becomes more challenging when network congestion, user mobility as
   well as non-deterministic availability of compute and storage
   resources are considered.  The impact is felt most in the edge of the
   network because as the users move, their point of attachment changes
   frequently, which results in (at least partially) relocating the
   service as well as the service endpoint.  Furthermore, network
   functions are pushed more and more towards the edge, where network,
   compute and storage resources are constrained and availability is
   non-deterministic.  Constrained network resources may lead to
   congestion in the network.  Also, in the case of content delivery
   applications, storage resources may need to be moved to where user
   concentration is higher.

   We describe a few use cases in the next section and derive the
   requirements for composing new services and service paths in a
   dynamic edge network.  We address this dynamicity by introducing a
   special Service Function, called SRR (service request routing).  We
   describe the problems associated with today's network and the Layer 3
   based approach to handling dynamicity in the network.  We then
   discuss how such a new Service Function with certain capabilities can
   handle the dynamicity better than these conventional methods.

2.  Use Case Description

2.1.  Data Center

   The data center use case draft [I-D.ietf-sfc-dc-use-cases] describes
   an East West traffic use case.  This is the predominant traffic in
   data centers today.
   Server virtualization has led to the new paradigm where virtual
   machines can migrate from one server to another across the data
   center.  This explosion in east-west traffic is leading to newer data
   center network fabric architectures that provide consistent latencies
   from one point in the fabric to another.

   SFCs applied in an enterprise or service provider data center can be
   broadly categorized into two types:

   o  Access SFCs

   o  Application SFCs

   Access SFCs are focused on servicing traffic entering and leaving the
   data center while Application SFCs are focused on servicing traffic
   destined to applications.  Service providers deploy a single "Access
   SFC" and multiple "Application SFCs" for each tenant.  Enterprise
   data center operators on the other hand may not have a need for
   Access SFCs depending on the size and requirements of the enterprise.

   In carrier networks, operators may deploy multiple data centers
   dispersed geographically.  Each data center may host different types
   of service functions.  For example, latency sensitive or high usage
   service functions are deployed in regional data centers while other
   latency tolerant, low usage service functions are deployed in global
   or central data centers.  In such deployments, SFCs may span multiple
   data centers and enable operators to deploy services in a flexible
   and inexpensive way.

   Within a data center as well as in inter data center scenarios, users
   are serviced by multiple SFs distributed inside as well as outside a
   location.  In this scenario, service function chains should be able
   to reselect instances and redirect traffic very fast.  The draft
   identifies that static service chains do not allow for modifying the
   SFCs as they require the ability to add SNs or remove SNs to scale up
   and down the service capacity.
   Likewise, the ability to dynamically pick one among the many SN
   instances is not available.

2.2.  Third party cloud service provider

   This use case is related to an emerging business model, where
   computational resources for edge cloud service are provided by
   alternative facility providers that are non-traditional network
   operators.  This is due to the situation for many specific localized
   use cases, where network operators may not have the necessary real
   estate available.  They may even not be willing to spend on CAPEX and
   OPEX for said point-of-presence, because there is no clear path for
   sustainable cost recovery [UKNIC].

   The industry is witnessing the emergence of real estate owners such
   as building asset or management companies, cell tower owners, railway
   companies or other facility owners willing to deploy edge cloud
   resources.  The facility provider, e.g. cell tower owner or building
   management company, deploys edge computing resources throughout their
   installation in the country.  They have their own operation and
   management software, which is capable of deploying resources, scaling
   resources up or down, and deploying edge applications from third
   party service providers.  They are capable of offering service to
   more than one network operator at a specific location, thus acting as
   a "neutral host".  The facility provider, which owns cloud resources
   and provides application services, is referred to as "Third party
   Edge Owner (TEO)".

   There is more than one stakeholder in this ecosystem, e.g. Network
   Service Provider, Real estate owner, Cloud capability (compute and
   storage resource) provider, Application/service provider.  An entity
   can assume more than one role.  From a network operator's point of
   view there may be a "Cloud provider" or a "Cloud service provider"
   depending on the roles assumed by the external entity.

   "Cloud Providers" provide cloud resources (compute and storage) to
   network operators.  Network operators rent those resources and manage
   the MEC host by themselves.  The network operator can set up
   application traffic rules so that traffic can be processed by that
   host.

   "Cloud Service Providers" not only make resources available to
   network operators or service providers, but also provide management
   and hosting services.  They can host edge applications on behalf of
   application service providers and set up user plane traffic to be
   steered towards the edge application.

   Cloud Service Providers, as well as many organizations that need to
   share and analyze a quickly growing amount of data, such as
   retailers, manufacturers, telcos, financial services firms, and many
   more, are turning to localized Micro Data Centers (MDC) installed on
   the factory floor, in the telco central office, the back of a retail
   outlet, etc.  The solution applies to a broad base of applications
   that require low latency, high bandwidth, or both.

   As Micro Data Centers are deployed at the edge of the network, common
   deployment options are:

   o  Micro Data Centers are deployed on L2 in the edge of the network

   o  Instead of single internet Point Of Presence (POP) deployment,
      multiple internet POP deployment is desirable to localize data

   o  Service is composed out of these multiple POP deployments of MDC,
      where data exchange and collaboration is expected among these MDCs

   o  Due to mobility and changes in network condition (e.g. congestion,
      load), service composition may change frequently to support the
      promised quality of experience

2.3.  ETSI MEC USE CASE

   Take the following video orchestration service example from the ETSI
   MEC Requirements document [ETSI_MEC].
   The proposed use case of edge video orchestration suggests a scenario
   where visual content can be produced and consumed at the same
   location close to consumers in a densely populated and clearly
   limited area.  Such a case could be a sports event or concert where a
   remarkable number of consumers are using their handheld devices to
   access user-selected tailored content.  The overall video experience
   is combined from multiple sources, such as local recording devices,
   which may be fixed as well as mobile, and master video from a central
   production server.  The user is given an opportunity to select
   tailored views from a set of local video sources.

2.4.  3GPP

   3GPP Rel. 15 introduces the notion of the service-based interface
   (SBI) as an alternative to the traditional call pattern invocation of
   network functions.  This introduction targets the support for
   replication, e.g., driven by virtualized functions, as well as
   supporting alternative interactions, e.g., for different vertical
   market specific control planes, by making the discovery as well as
   composition of new interactions more flexible.

   We believe that SFC is a suitable framework for the interconnection
   of such network functions through the new SBI.  One of the
   aforementioned driving forces, namely the replication of functions,
   aligns with our thinking in this draft in that indirections to new
   vertical instances need to be dynamic in reacting to the appearance
   of new virtual instances or to changes in policies for the selection
   of specific instances by specific calling entities.

2.5.  Use Case Analysis

   SFC allows network operators as well as service providers to compose
   new services by chaining individual service functions.

   In a dynamic network environment, like the edge of a network, the
   capability to dynamically compose new services from available
   services as well as move a service instance is desirable.  Dynamic
   composition and relocation of services may be attributed to:

   o  Congestion in the network: Due to constrained network resources,
      an increase in the network load may create congestion in the
      network, resulting in a congested Service Function Path.  Service
      functions may detect congestion and reconfigure the Service
      Function Path to avoid it.

   o  Response to latency: In a dynamic network environment and with the
      need for ultra-low latency communication, instantiation of new
      service function endpoints might be the only remedy to combat the
      increase of latency caused, e.g., by increased load on a previous
      endpoint or mobility of the user and therefore increasing the
      'distance' to the service function endpoint.  Keeping the service
      function endpoint 'close' to the user allows for reducing latency,
      segregating communication in localized islands of service
      interaction.

   o  Response to user mobility: In a dynamic network environment where
      service functions move frequently because of user movement, load
      balancing or resource modification, service function chains and
      the service end points need to be created and recreated
      frequently.

   o  Resource availability: Availability of compute and storage
      resources varies with network load, number and type of
      applications running etc.  In the edge of the network, a sudden
      increase of users may increase compute load.  In this situation,
      applications running on the compute resources may be moved to
      another location where more resources are available.

   In SFC, there is a notion of logical chaining of SFs and chaining of
   actual physical locations, known as Rendered Service Path (RSP).
   RSP provides a static binding of SFs to their physical location.  In
   order to create a chain in dynamic fashion, late binding of SFs and
   physical location may be desired.  SFC is capable of modifying the
   service chain to a certain extent in response to network conditions,
   but a complete solution has not been described.

   In order to route the service requests to service end points in a
   dynamic manner, we identify the following desirable features in a
   service function chain:

   o  Capability to trigger service chain reconfiguration based on
      network information such as congestion indication, mobility,
      degradation of user experience etc.  Service Functions should be
      able to process such network information, identify which section
      of the chain needs to be reconfigured and take action.

   o  Fast switching from one service instance to another by not relying
      on the DNS for service location resolution.  Instead of using DNS,
      the function should be able to identify the path that reaches the
      service end point.

   o  Direct path mobility, where the path between the requester and the
      responding service can be determined as being optimal (e.g.,
      shortest path or direct path to a selected instance), is needed to
      avoid the use of anchor points and further reduce service-level
      latency.

   o  Indirect service requests at the network level, transparent to the
      requesting client and without the involvement of the DNS.  The end
      user is not aware of the decision made by the SF.

   o  New methods for forwarding, such as path-based forwarding, direct
      path routing in mobility cases, path pinning for traffic steering
      and simplified service-specific peering towards the Internet.

3.  NSH and Re-classification

   [RFC7498] captures the problems associated with existing service
   deployments.
   The problems are described below at a high level:

   o  Network topology: Network service deployment is tightly coupled
      with network topology, thus reducing the flexibility in service
      delivery.  It adds complexity in deploying network service when
      certain traffic types may need some service and other traffic
      types do not need the same service.

   o  Configuration complexity is the direct result of dependency on
      network topology.

   o  Limited availability of services

   o  Altering the order of a deployed chain is complex and cumbersome

   o  Coupling of service functions to topology may require service
      functions to support many transport encapsulations or for a
      transport gateway function to be present.

   o  In a dynamic environment such as the edge of a network, service
      delivery and routing change fast.  It may be difficult to deliver
      service dynamically due to the risk and complexity of VLANs and/or
      routing modifications.

   These factors provide motivation for a simplified and flexible
   service insertion model that addresses many of the current
   shortcomings and provides new, much needed functionality to enable
   service deployments in modern network environments.  Service chaining
   accomplishes this by considering service functions as resources, with
   associated attributes, available for scheduled consumption.
   Selective traffic, subject to policy, may then be "steered" to the
   requisite service resources, along with any "extra" information
   referred to as metadata.  This metadata is used for policy
   enforcement.

   A basic form of service chaining may be realized using existing
   transport encapsulations.  This method of chaining relies upon the
   tunneling of selected data between service functions.  Although this
   form of service chaining achieves some level of abstraction from the
   underlying topology, it does not truly create a service plane.
   NSH [RFC8300] provides a distinct, identifiable plane that can be
   used across all transports to create a service chain and exchange
   metadata along the chain.

   Fundamentally, however, the notion of "services" in SFC is tied to
   specific service function endpoints, which lie along a well-defined
   service function path (SFP) where the path is defined through lower
   layer transport encapsulations.  If any such service function
   endpoint changes, the service chain needs to be adjusted; a procedure
   we outline in the following sub-section.

3.1.  Dynamic service chain creation using NSH

   We revisit the dynamic service chain creation capability of NSH.  NSH
   defines a new service plane protocol [RFC8300].  A Network Service
   Header (NSH) contains service path information and optionally
   metadata that are added to a packet or frame and used to create a
   service plane.  A control plane is required in order to exchange NSH
   values with participating nodes, and to provision the same nodes with
   requisite information such as service path ID to overlay mapping.

   The Network Service Header has three parts: Base Header, Service Path
   Header and Context Header.  The 4-byte Service Path Header follows
   the Base Header and defines two fields used to construct a service
   path:

   o  Service path identifier (SPI)

   o  Service index (SI)

   The following figure depicts the service path header.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Service Path ID                | Service Index |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                        Figure 1: NSH Path Header

   The service path identifier (SPI) is used to identify the service
   path that interconnects the needed service functions.
   It allows nodes to utilize the identifier to select the appropriate
   network transport protocol and forwarding techniques.  The service
   index (SI) identifies the location of a packet within a service path.
   As packets traverse a service path, the SI is decremented
   post-service.

   SPI represents the service path and altering the path identifier
   results in a change of a service path.  A change in SPI value is a
   result of re-classification.  It means a node in the service path
   determined, based on policy, that the initial classification was
   incorrect or incomplete.  If the updated classification results in
   the necessity of a new service path, the node updates the SPI and SI
   fields accordingly.  The new identifier is then used to select the
   appropriate overlay topology.  This allows service functions to alter
   the path of a packet without having to participate in the network
   topology and its associated control plane(s).  The method to
   determine that an existing classification is incorrect and how to
   determine the new classification is not defined.

4.  Challenges with dynamic indirection

   The emerging trend in today's network is to deploy network functions,
   services and applications at the edge of the network to support
   latency requirements, computational offload, traffic optimization
   etc.  As users move, the applications or services they use may need
   to be moved closer to the user's new location.  This implies another
   instance of the service function may need to be instantiated close to
   the user's new location.  It may result in re-establishing the
   service path from the newly instantiated service function to other
   service instances.  It is also possible that the newly instantiated
   service function may be redirected to a new service end point (e.g.
   Application Server) for various reasons, such as incomplete content,
   proximity to data store, load balancing etc.
   In another scenario, a single instance of the service function may
   not handle all users due to latency or load constraints.  A single
   service function may be instantiated more than once to balance user
   load.  As the number of instances increases, and with mobility added,
   the complexity of service routing increases.  It is anticipated that
   function chaining and re-chaining may occur constantly in the
   network.

   The challenge of dynamic indirection may be better described by
   analyzing the working of CDNs, which dynamically (re-)direct
   user-initiated requests towards the most appropriate content
   instance.  This task becomes more difficult if the granularity of the
   instance placement increases.  For instance, in case of a CDN being
   realized close to end users, specifically in the edge of the network,
   the specific content instance might need to be selected dynamically.
   After initial selection, the instance may change during service
   execution.

   In a conventional network, an instance of a service is found and
   selected using DNS.  The subsequent service request is then routed
   through the network between the client and the service.  If the user
   is doing a DNS lookup to access content served by a CDN then the DNS
   service will maintain a list of IP addresses that can be returned for
   a given domain name and will try to return an IP address of a node
   geographically close to the client.  Should the service provider want
   to replace an instance of their service with another one at a
   different IP address (and potentially a different physical location
   for various reasons such as load balancing, reliability etc.) then
   the DNS tables must be updated, i.e., the service needs to be
   (re-)registered quickly.  This is done by updating the local
   authoritative DNS server which then propagates the new mapping to DNS
   services across the world.
   DNS propagation can take up to 48 hours so fast and dynamic switching
   from one service instance to another is not possible in conventional
   networks; even in more localized scenarios, the propagation of DNS
   updates might still be insufficient.  When relying on many surrogate
   service endpoints in the edge network, certain resources may be
   unavailable in one surrogate instance while existing in another, so
   that changes in redirection might be desirable; changes in local load
   also drive the need for such changes in redirection.  With the
   emergence of container-based virtualization platforms, service
   function endpoints can be established in a matter of seconds and we
   therefore believe that the 'reachability' of such a service instance,
   i.e., the possibility to route service requests to it from a client
   that was previously served elsewhere, must follow a similar timeline,
   i.e., a few seconds or even less.

   The other issue in conventional networks lies with mobility
   management procedures.  These procedures use an anchor point, which
   terminates a session at the network edge.  As the user moves around,
   traffic is redirected from the anchor point to the new point of
   attachment.  Relying on typical mobility management approaches found
   in IP networks usually leads to inefficient 'triangular' routing of
   requests through this common 'anchor' point.  This triangular routing
   increases the latency in reaching the new service function or service
   end points as users move.

   Traffic steering is a common procedure in managed networks,
   particularly at the edge, due to desired subscriber-centric traffic
   policies (e.g., related to pricing structures), resource requirements
   (e.g., related to using particular paths in the network) or mobility
   (e.g., users moving in a cellular network).
   Today's methods for traffic steering include anchor-based mobility
   management as well as traffic classification, for instance, in packet
   gateways of cellular systems (using, e.g., deep packet inspection as
   well as port and address classification).  While the former leads to
   inefficient 'triangular' traffic forwarding, the latter often
   requires additional state in the forwarders to differentiate traffic
   from one user to another.

   The analysis of CDNs shows that dynamic indirection is a necessary
   requirement, which needs to be supported by the networks.  The goal
   for this indirection is to provide user applications the lowest
   possible latency.  But as discussed above, relying on today's
   techniques does not help in guaranteeing the same latency to user
   applications.  On the other hand, there is a high possibility that
   latency may increase if we rely on Layer 3 based service redirection
   techniques.

   SFC handles indirection through the use of SPI.  A packet needs to be
   reclassified and the intermediate node changes the SPI.  The
   following are the typical steps that happen in order to implement the
   indirection.

   o  A packet arrives at a particular node

   o  The node contacts the policy manager

   o  The node identifies that the current classification is incorrect

   o  The node reclassifies the packet, i.e. changes the SPI

   o  The node inserts the packet in the pipe, possibly towards the SFF

   The indirection mechanism in SFC involves certain steps to process
   policy information and change the SPI in the packet header, making it
   suitable to handle dynamic indirection requirements.  Our proposed SF
   in this document provides an additional method to handle dynamic
   indirection of service requests, not relying on the reclassification
   mechanism.  Combining these two techniques may provide flexibility
   and improvement over a single method.

5.  HTTP as a transport

   With the extensive use of "web technology", "distributed services"
   and availability of heterogeneous networks, HTTP has effectively
   transitioned into the common transport for name-based E2E
   communication across the web.  In the context of SFC and SF, HTTP
   requests and responses are considered as the "Service Request (SR)".
   This use case describes how these SRs are directed towards the
   correct SF in a fast and dynamic way.  The routing and indirection of
   SRs are abstracted at HTTP level, instead of the traditional approach
   where the routing decision for a service request is made at Layer 3.

   If we abstract HTTP as a transport, HTTP requests, such as GET, PUT
   and POST, can be routed based on the URI associated with the request,
   with the URI being simply the name of a resource or the invocation
   point for a service transaction.  Based on the name of the resource
   requested, the appropriate HTTP request can be routed to the suitable
   service endpoint.  If Service Functions (SF) could be identified
   using a URI or name, HTTP requests to an SF would be routed or
   directed using name based routing.  With that, the redirection to the
   most suitable service instance is purely done based on named services
   with HTTP being a specific (application layer) transport service.

   The ongoing EU H2020 efforts like FLAME [H2020FLAME] are driven by
   city-scale many-POP deployments of compute infrastructure, all
   SDN-connected and OpenStack managed.  Localized media use cases drive
   the need for name-based (HTTP as the main transport protocol here)
   service instances being chained with the relationship between
   specific virtual instances being controlled at the underlying
   routing/switching level.
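
   The name-based routing idea described in this section can be
   illustrated with a minimal sketch.  It is not the mechanism specified
   by this draft: the NameRouter class, its methods, the "newest
   instance first" selection policy and the concrete 192.168.x.x
   locators are assumptions made only for illustration.  The sketch
   shows a routing decision taken from the name in the request URI
   rather than from a destination address obtained through DNS.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class NameRouter:
    """Hypothetical name-based router: maps a service name (the host part
    of the request URI) to the instances currently able to serve it."""
    instances: dict[str, list[str]] = field(default_factory=dict)

    def register(self, name: str, locator: str) -> None:
        # Assumed policy: the most recently registered instance is preferred.
        self.instances.setdefault(name.lower(), []).insert(0, locator)

    def unregister(self, name: str, locator: str) -> None:
        # Remove an instance, e.g. because it moved or became overloaded.
        self.instances[name.lower()].remove(locator)

    def route(self, uri: str) -> str:
        # The decision uses only the name carried in the URI.
        name = (urlsplit(uri).hostname or "").lower()
        candidates = self.instances.get(name)
        if not candidates:
            raise LookupError(f"no instance registered for {name!r}")
        return candidates[0]


router = NameRouter()
router.register("www.example.com", "192.168.1.10")
first = router.route("http://www.example.com/video/1")

# A new instance appears closer to the user; no DNS update is involved.
router.register("www.example.com", "192.168.2.20")
second = router.route("http://www.example.com/video/1")

# The new instance disappears again; requests fall back to the old one.
router.unregister("www.example.com", "192.168.2.20")
third = router.route("http://www.example.com/video/1")
```

   In this sketch the same request URI is served by a different instance
   as soon as the mapping changes, which is the behaviour the text
   contrasts with DNS-based resolution.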

   The notion of 'HTTP as a transport', utilizing URLs as the addressing
   scheme, can be used to create an SFP as shown in Figure 2, i.e.,
   192.168.x.x -> www.example.com -> 192.168.x.x -> www.example2.com ->
   192.168.x.x -> ... -> www.exampleN.com. It is this 'name-based'
   relationship that we see possibly realized through specific
   replicated instances, where in turn the routing towards those
   specific instances is realized by the SRR.

                                                      +--------+
                                                      |        |
         |-------------------------|------------------+  SRR   +
         |                         |                  |        |
         |                         |                  +---/|\--+
         |                         |                       |
    +---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
    |        |   |         |   |       |   |      |   |        |
    + Client +-->+   SRR   +-->+ Media +-->+ SRR  +-->+ Media  +
    |        |   |         |   |  Fn1  |   |      |   |  Fn2   |
    +--------+   +---------+   +-------+   +------+   +--------+

      SFP:192.168.x.x-->www.example.com-->192.168.x.x
          -->www.example2.com-->192.168.x.x-->www.exampleN.com

            Figure 2: SFP with new HTTP-based Transport option

   In a pure SFC architectural framework, the Classifier function may
   interact with the SRR to obtain an SE (Service Encapsulation). For
   example, the Classifier function may look into the network locator
   map in Figure 2 and determine that the next SF is www.example.com.
   It provides this information to the SRR to obtain the next hop
   information. The SRR returns the SE for the next hop, which can be
   "bitfield" information that is used in the overlay routing for this
   part of the SFP. The Classifier function uses this SE to route the
   incoming packet directly at the transport network level.

6. Service Request Routing (SRR) Service Function

6.1. Overview

   The following diagram shows the application of the newly proposed
   SRR service function in an example of media clients connecting to
   media servers. There may be more than one media function to support
   a CDN-like architecture, with surrogate servers to handle mobility
   and load balancing.

                                                      +--------+
                                                      |        |
         |-------------------------|------------------+  SRR   +
         |                         |                  |        |
         |                         |                  +---/|\--+
         |                         |                       |
    +---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
    |        |   |         |   |       |   |      |   |        |
    + Client +-->+   IP    +-->+ Media +-->+ SRR  +-->+ Media  +
    |        |   | Routing |   |  Fn1  |   |      |   |  Fn2   |
    +--------+   +---------+   +-------+   +------+   +--------+

    Figure 3: General SFC with SRR Flexible Chaining, initiated via IP
                         Routed Client Connection

   The clients are connected to media functions through a frontend
   routed network, e.g., relying on standard IP routing, while the
   media functions are chained via the newly proposed service request
   routing (SRR) function. Alternatively, we also envision utilizing
   the SRR function directly between the client SF and the media
   function SF, as outlined in Figure 4.

                                                      +--------+
                                                      |        |
         |-------------------------|------------------+  SRR   +
         |                         |                  |        |
         |                         |                  +---/|\--+
         |                         |                       |
    +---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
    |        |   |         |   |       |   |      |   |        |
    + Client +-->+   SRR   +-->+ Media +-->+ SRR  +-->+ Media  +
    |        |   |         |   |  Fn1  |   |      |   |  Fn2   |
    +--------+   +---------+   +-------+   +------+   +--------+

    Figure 4: General SFC with SRR Flexible Chaining, initiated via SRR
                              Chained Client

   For our considerations, we assume that each SF is realized by one or
   more service function endpoints (SFEs). Hence, instead of looking at
   "chaining" as a concept that connects specific SFEs along a
   well-defined SFP, we propose to look at "chaining" at the level of
   "named" service functions rather than their specific endpoint
   instances. With this in mind, the SRR service function lifts the
   relationship between the connecting SFs to the level of "logical"
   service functions rather than their specific realizing endpoints.
   Instead of relying on dynamic re-chaining whenever the relationship
   between specific SFEs changes, the SRR provides the selection of
   suitable SFEs while maintaining the logical relationship between the
   SFs. In Section 6.3, we will present the necessary extensions to the
   SFP concept to support this higher abstraction of "chaining" via
   "named" logical SFs.

   The SRR introduces flexibility in routing service requests from
   clients to specific SFEs in response to conditions such as network
   congestion or user mobility. In the edge network, where users are
   moving and service endpoints may also change, the flexibility to
   decide and steer service requests directly helps in guaranteeing the
   same latency to user applications. The edge of the network may be
   congested due to limited network resources. The SRR may be able to
   detect network congestion and quickly route service requests to
   another service endpoint that is not experiencing congestion. In
   addition, application-layer control functions might utilize latency
   measurements to ensure that suitable service instances are created
   at runtime, so that service function endpoints are available
   'nearby' (possibly moving) users and the latency stays below a
   desired value.
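
   The selection behaviour described above can be sketched as follows.
   This is a minimal Python illustration under stated assumptions: the
   Endpoint record, its fields and the function select_sfe are
   hypothetical names introduced for this example, and the policy
   (lowest measured latency among non-congested endpoints) is only one
   plausible choice, not a mechanism specified by this document.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    locator: str       # hypothetical network locator of the SFE
    latency_ms: float  # most recent latency measurement towards this SFE
    congested: bool    # whether the path to this SFE is reported congested


def select_sfe(endpoints):
    """Return the lowest-latency endpoint that is not congested.

    Falls back to the lowest-latency endpoint overall when every
    endpoint is reported as congested.
    """
    if not endpoints:
        raise ValueError("no endpoints registered for this service function")
    usable = [e for e in endpoints if not e.congested] or list(endpoints)
    return min(usable, key=lambda e: e.latency_ms)


if __name__ == "__main__":
    sfes = [
        Endpoint("sfe-a", 12.0, True),
        Endpoint("sfe-b", 18.0, False),
        Endpoint("sfe-c", 25.0, False),
    ]
    print(select_sfe(sfes).locator)  # prints "sfe-b"
```

   Because the chain refers to the named SF rather than to sfe-a or
   sfe-b, switching the endpoint in this way leaves the logical chain
   unchanged.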

   This is achieved by reducing the switching time from one SF endpoint
   to another. As the service endpoint changes, the routing function
   makes an immediate decision to route the request to the appropriate
   media server.

   The possible improvements of using the SRR within an SFC framework
   are listed below:

   o  Fast (between 10 and 20 ms) switching times from one service
      instance to another, by not relying on the DNS for service
      discovery and by routing service requests directly at the level
      of the transport network.

   o  The capability to indirect service requests at the network level
      helps in reducing latency when service endpoints change. For
      example, when a service request is sent to one surrogate instance
      but results in an HTTP 404 or 5xx error response, the original
      request is redirected to an alternative surrogate with minimal
      latency, i.e., right at the destination of the failed service
      request. Nesting these operations effectively leads to a
      net-level 'search' among all available surrogate instances until
      the search is exhausted (with a negative result) or the resource
      is found.

   o  New methods for forwarding, such as path-based forwarding, will
      enable direct path routing in mobility cases, path pinning for
      traffic steering and simplified service-specific peering towards
      the Internet. Such a capability would allow for localizing
      traffic and reducing latency and costs.

6.2. Details of SRR Function

   Assuming the introduction of such an HTTP-level transport notion,
   the SRR function can be decomposed further as shown in Figure 5.

                                                      +--------+
                                                      |        |
         |-------------------------|------------------+  SRR   +
         |                         |                  |        |
         |                         |                  +---/|\--+
         |                         |                       |
    +---\|/--+   +---------+   +--\|/--+   +------+   +----+----+
    |        |   |         |   |       |   |      |   |         |
    + Client +-->+   SRR   +-->+Service+-->+ SRR  +-->+ Service +
    |        |   |         |   |  Fn1  |   |      |   |   Fn2   |
    +--------+   +---------+   +-------+   +------+   +---------+
               /              \
           /                       \
       /                                \
    +--------------------------------------+
    | +------------------+                 |
    | |   +-----+ +----+ |      +-----+    |
    |---> | SFC | | SR | |      | SR  |----->
    | |   |Proxy| |    | |      |     |    |
    | |   +-----+ +----+ |      +-/|\-+    |
    | |  Use Proxy if NAP|         |       |
    | |  is not SFC      |         |       |
    | |  enabled         |         |       |
    | +-------/|\--------+         |       |
    |          |                   |       |
    |          |                   |       |
    |          |  +----------+     |       |
    |          |->|  tSFF1   |-----        |
    |             +---/|\----+             |
    |                  |                   |
    |                  |                   |
    |  +----------+    |                   |
    |  |          |    |                   |
    |  +   PCE    +----    +-----+         |
    |  |          |--------| RT  |         |
    |  +----------+        +-----+         |
    |                                      |
    +--------------------------------------+

                       Figure 5: SRR decomposition

   Alternatively, the communication between the two functions and the
   SRR could be entirely link-local, i.e., another simple tSFF (tSFF2),
   which is merely a link-local transport, sits between the client and
   the SRR as well as between SF1 and the SRR. Figure 6 shows this
   alternative option.

                                                      +--------+
                                                      |        |
         |-------------------------|------------------+  SRR   +
         |                         |                  |        |
         |                         |                  +---/|\--+
         |                         |                       |
    +---\|/--+   +---------+   +--\|/--+   +------+   +----+---+
    |        |   |         |   |       |   |      |   |        |
    + Client +-->+   SRR   +-->+Service+-->+ SRR  +-->+Service +
    |        |   |         |   |  Fn1  |   |      |   |  Fn2   |
    +--------+   +---------+   +-------+   +------+   +--------+
                 /           \
                /                   \
               /                           \
    +-----+   +---------------------------------+
    |tSFF2|--------->+----+           +-----+   |  +--------+
    +-----+   |      | SR |           | SR  |----->| tSFF2  |-->
              |      |    |           |     |   |  +--------+
              |      +----+           +-/|\-+   |
              |        |                 |      |
              |        |                 |      |
              |        |                 |      |
              |        |                 |      |
              |        |     +-------+   |      |
              |        |---->| tSFF1 |---       |
              |              +--/|\--+          |
              |                  |              |
              |                  |              |
              |      +-------+   |              |
              |      |       |   |              |
              |      +  PCE  +---     +----+    |
              |      |       |--------| RT |    |
              |      +-------+        +----+    |
              |                                 |
              +---------------------------------+

       Figure 6: SRR decomposition using link-local client/function
                              communication

   The SRR function may be composed of the following functions:

   o  Service Router (SR) at the ingress: terminates Layer 3 and above
      protocols, such as TCP, on the client side.

   o  Service Router (SR) at the egress: terminates any transport
      protocol on the outgoing (server) side.

   o  Path Computation Element (PCE): responsible for selecting the
      correct next SF, possibly also realizing path policy enforcement.
      The result of the selection is a path identifier, which is
      delivered to the ingress SR upon the initial path computation
      request (i.e., when sending a request to a specific URL on the
      SFP for the first time). The path identifier is used for any
      future request to the given URL-based SF. When another SF
      instance becomes available, as indicated to the PCE through a
      registration procedure, the PCE instructs all ingress SRs to
      invalidate their path identifiers for the specific URL of the SF,
      which results in a new initial path computation request the next
      time a request to that SF is forwarded.
      Through this, the newly registered SF instance might be utilized
      if the policy-governed path computation selects said SF instance.

   o  Reclassification Trigger Handler (RT): network measurement
      information, such as latency, packet loss or network congestion,
      could be processed by the handler. This may trigger a
      reconfiguration of the specific service function endpoint chain
      over which the SFC is being executed. The handler forwards the
      information about the chain reconfiguration to the PCE.

   o  Transport-derived SFF (tSFF1): the communication between the
      ingress and egress SRs, as well as between the SRs and the PCE,
      is realized via a transport-derived SFF. We outline here three
      possible tSFFs:

      *  SDN-based: This option utilizes path-based forwarding through
         SDN-based wildcard matching fields, supported with OF1.2+
         [Reed2016]. It can be embedded into the slicing approach of
         the underlying transport infrastructure by leaving typical
         slicing fields available (e.g., VLAN tags). The forwarding
         utilizes the Ethernet frame format at Layer 2, representing
         the topological links of a specific forwarding path in the
         transport network as unique bits in a fixed-size bit array.
         For the latter, the approach utilizes the IPv6 source and
         destination fields for storing the bit array information (in
         a simple version of this forwarding, this limits the topology
         to 256 links, but extension schemes are possible, which are
         left out of this document at this stage). As mentioned, the
         SDN forwarding decision action is a simple wildcard matching,
         supported with OF1.2+, with the wildcard representing the
         unique bit of a switch-specific output port. With that, the
         switch needs to consider only as many forwarding rules as it
         has switch-local output ports; see [Reed2016] for more
         information. Fig. xx illustrates this forwarding solution,
         including the ability to create ad-hoc multicast relations by
         simply ORing the individual bitarrays representing unicast
         paths.

      *  Another approach is outlined in [I-D.ietf-bier-use-cases],
         where the SFF is suggested to be realized via a BIER overlay,
         in turn realized over a BIER-compliant underlay, such as MPLS.
         BIER utilizes a similar bit array approach for representing a
         forwarding path in the overlay network but, unlike [Reed2016],
         the bit fields indicate the egress BIER-compliant router that
         the packet is supposed to reach.

      *  As yet another alternative, the tSFF may utilize a flow
         aggregation approach, outlined in [Khalili2016], called edge
         switch classification (ESC). In this approach, a path from an
         ingress to an egress SR is described as a so-called edge
         classification vector (ECV), which combines information on the
         aggregated flow (following [Khalili2016]) and the switch-local
         endpoint. The representation has bitarray characteristics
         similar to those of the previous two approaches.

   o  NOTE: with the ingress and egress SRs terminating SF Layer 3
      connections and the utilization of bitarray-based tSFFs, the
      transmission of packets can effectively take place as an ad-hoc
      Layer multicast while the SFC itself is denoted as an n-times
      unicast SFC. As an example, consider the chaining of a set of n
      clients to a single video server. Each sub-SFC from an individual
      client to the video server will semantically result in a unicast
      response from the server back to the client (e.g., carrying the
      video chunk for an MPEG DASH-based video stream). When combining
      the sub-SFCs into a single SFC with n unicast relations to the
      server, the SRR will deliver the responses from the server via
      one or more multicast responses to one or more clients.
      The size of the individual multicast groups will depend on the
      synchronicity of the client requests (and therefore on the
      synchronicity of the server responses). Note that the multicast
      relations here are created ad hoc by ORing the bitarrays
      representing the specific clients to which the responses are
      meant to be sent. This is illustrated in the figure below. The
      HTTP multicast use case is presented in the BIER use case draft
      [I-D.ietf-bier-use-cases], albeit without a specific SFC
      relation.

    +---------+   +---------+
    |         |   |         |                    +--------+
    +IP only  +---+   ICN   +           00000010 |  ICN   |
    |receiver |   |   SR1   |           |--------|  SR3   |
    |UE       |   +----|----+           |        +---||---+
    +---------+        | 10010011       |            ||
                 +-----|----+   +----------+   |-----||-----|
                 |          |   |          |   |   Cloud    |
                 |SDN Switch|---|SDN Switch|   |            |
                 |          |   |          |      |--||--|
                 +----|-----+   +----------+         ||
                      | 10100011                     ||
    +---------+   +---|-----+                   +----||----+
    |         |   |         |                   |          |
    +IP only  +---+   ICN   +                   + IP only  +
    |sender UE|   |   SR2   |                   |  Server  |
    +---------+   +---------+                   +----------+

      Figure 7: Illustration of Bitfield-based Forwarding using SDN

7. Protocol Consideration

   For the operations outlined in the previous section, we foresee that
   the following protocols are required:

   o  SR-to-SR protocol for HTTP: HTTP-based message exchange between
      client and server SRs

   o  SR-PCE protocol: used for path computation, for obtaining routing
      information and for providing path updates

   o  Registration protocol: used to register FQDN service endpoints

8. Next Steps

   We seek feedback from the SFC WG on the validity of this solution
   and its scope within the WG. If such an alternative to
   re-classification for service indirection is seen as beneficial and
   as fitting the charter of the WG, the next step would be to update
   the draft to outline potential protocol solutions required for the
   realization of such an SRR SF.

9. IANA Considerations

   This document requests no IANA actions.

10. Security Considerations

   TBD.

11. Informative References

   [ETSI_MEC]
              ETSI, "Mobile Edge Computing (MEC), Technical
              Requirements", GS MEC 002 1.1.1, March 2016.

   [H2020FLAME]
              EU, "EU H2020 FLAME PROJECT", March 2016.

   [I-D.ietf-bier-use-cases]
              Kumar, N., Asati, R., Chen, M., Xu, X., Dolganow, A.,
              Przygienda, T., Gulko, A., Robinson, D., Arya, V., and C.
              Bestler, "BIER Use Cases", draft-ietf-bier-use-cases-06
              (work in progress), January 2018.

   [I-D.ietf-sfc-dc-use-cases]
              Kumar, S., Tufail, M., Majee, S., Captari, C., and S.
              Homma, "Service Function Chaining Use Cases In Data
              Centers", draft-ietf-sfc-dc-use-cases-06 (work in
              progress), February 2017.

   [Khalili2016]
              Khalili, R., Poe, W., Despotovic, Z., and A. Hecker,
              "Reducing State of SDN Switches in Mobile Core Networks by
              Flow Rule Aggregation", ICCCN, August 2016.

   [Reed2016]
              Reed, M., Al-Naday, M., Thomas, N., Trossen, D., and S.
              Spirou, "Reducing State of SDN Switches in Mobile Core
              Networks by Flow Rule Aggregation", ICC 2016, 2016.

   [RFC7498]  Quinn, P., Ed. and T. Nadeau, Ed., "Problem Statement for
              Service Function Chaining", RFC 7498,
              DOI 10.17487/RFC7498, April 2015.

   [RFC8300]  Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed.,
              "Network Service Header (NSH)", RFC 8300,
              DOI 10.17487/RFC8300, January 2018.

   [UKNIC]    UK NIC, "5G Infrastructure Requirements in the UK", Final
              Report 3.0, December 2016.

Authors' Addresses

   Debashish Purkayastha
   InterDigital Communications, LLC
   Conshohocken
   USA

   Email: Debashish.Purkayastha@InterDigital.com

   Akbar Rahman
   InterDigital Communications, LLC
   Montreal
   Canada

   Email: Akbar.Rahman@InterDigital.com

   Dirk Trossen
   InterDigital Communications, LLC
   64 Great Eastern Street, 1st Floor
   London EC2A 3QR
   United Kingdom

   Email: Dirk.Trossen@InterDigital.com
   URI:   http://www.InterDigital.com/

   Zoran Despotovic
   Huawei

   Email: Zoran.Despotovic@huawei.com
   URI:   http://www.huawei.com/

   Ramin Khalili
   Huawei

   Email: Ramin.khalili@huawei.com
   URI:   http://www.huawei.com/