Network working group                                          L. Dunbar
Internet Draft                                                  A. Malis
Intended status: Standards Track                                  Huawei
Expires: October 2014
                                                          April 29, 2014

          Framework for Service Function Instances Restoration
           draft-dunbar-sfc-fun-instances-restoration-00.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79. This document may not be modified,
   and derivative works of it may not be created, except to publish it
   as an RFC and to translate it into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on October 30, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Abstract

   This draft describes a framework for the protection and restoration
   of a Service Chain Instance Path when some instances on the path
   fail or need to be replaced.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Definition of Terms
   4. Background
      4.1. Multiple Instances of one Service Function
      4.2. Multiple ways for expressing Service Chain Instance Path
      4.3. Virtualized Service Function Instances impact to Service
           Chain
   5. Local Restoration of Service Function Instances
   6. Global Restoration of Service Function Instances
      6.1. Encoding the Exact Instance Path in Data Packets
      6.2. In-Band Signaling of an Instance Path change
      6.3. Out-Of-Band Signaling of an Instance Path change
      6.4. Provisioning an Instance Path change
      6.5. Hybrid Method
   7. Regional Restoration of Service Function Instances
   8. Conclusion and Recommendation
   9. Manageability Considerations
   10. Security Considerations
   11. IANA Considerations
   12. References
      12.1. Normative References
      12.2. Informative References
   13. Acknowledgments

1. Introduction

   This draft describes the framework for protection and restoration
   of a Service Chain Instance Path when some instances on the path
   fail or need to be replaced.

   Protection and restoration become more crucial in virtualized
   environments (e.g.
   ETSI NFV), where there is a higher chance of service function
   instances failing, being decommissioned, or being over-utilized.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance.

3. Definition of Terms

   NFV: Network Function Virtualization [NFV-Terminology].

   SF: Service Function [SFC-Problem].

   SFF: Service Function Forwarder, also called a Service Function
   Forwarding Node.

   SFIC: Service Function Instance Component. One service function
   (e.g. NAT44) could have two different service function
   instantiations, one that applies policy-set-A (NAT44-A) and another
   that applies policy-set-B (NAT44-B). There could be multiple
   "entities" of NAT44-A (e.g. one "entity" only has 10G capability),
   and many "entities" of NAT44-B. Each entity has its own unique
   address. The "entity" in this context is called a "Service Function
   Instance Component" (SFIC).

   Service Chain: The sequence of service functions, e.g. Chain#1 {s1,
   s4, s6}, Chain#2 {s4, s7}, at the functional level. Also see the
   definition of "Service Function Chain" in [SFC-Problem].

   Service Chain Instance Path: The actual Service Function Instance
   Components selected for a service chain.

   VNF: Virtualized Network Function [NFV-Terminology].

4. Background

4.1. Multiple Instances of one Service Function

   One service function (say, NAT44) could have two different service
   function instantiations, one that applies policy-set-A (NAT44-A)
   and another that applies policy-set-B (NAT44-B). There could be
   multiple "entities" of NAT44-A (e.g.
   one "entity" only has 10G capability), and many "entities" of
   NAT44-B. Each entity has its own unique address (or Locator in
   [SFC-Reduction]). The "entity" in this context is called a "Service
   Function Instance Component" (SFIC).

   Identical SFICs could be attached to different Service Function
   Forwarder (SFF) nodes. It is also possible to have multiple
   identical SFICs attached to one SFF node, especially in a Network
   Function Virtualization (NFV) environment where each SFIC is a
   virtual service function instance with limited capacity.

   At the functional level, the order of service functions, e.g.
   Chain#1 {s1, s4, s6}, Chain#2 {s4, s7}, is important, but very often
   which SFIC of the Service Function "s1" is selected for Chain#1 is
   not. It is also possible that multiple SFICs of one service
   function can be reached by different network nodes. The actual set
   of SFICs selected for a service chain is called the "Service Chain
   Instance Path".

4.2. Multiple ways for expressing Service Chain Instance Path

   How SFICs are selected for a given Service Chain to form the actual
   Service Chain Instance Path is outside the scope of this draft. It
   is assumed that there is an entity (e.g. a service chain
   orchestration system) that is responsible for selecting the SFICs
   for a Service Chain.

   This document focuses on how Service Function Forwarder nodes or
   network nodes are informed of the selected SFICs for a particular
   Service Chain, especially when the SFICs on the Service Chain
   change.
   To simplify the description, the following Service Chain
   architecture reference is used:

                    |1  -----   |n          |21  ----  |2m
                +---+---+   +---+---+     +-+---+   +--+-----+
                | SF#1  |   |SF#n   |     |SF#i1|   |SF#im   |
                |       |   |       |     |     |   |        |
                +---+---+   +---+---+     +--+--+   +--+--+--+
                    :           :            :         :  :
                    :           :            :         :  :
                     \         /              \       /
   +--------------+   +--------+             +---------+
-->| Chain        |   |  SFF   | ----------- |  SFF    | ---->
   |classifier    |   |Node-1  |             | Node-i  |
   +--------------+   +----+---+             +----+--+-+
             \             |                     /
              \            | SFC Encapsulation  /
               \           |                   /
       ,. ......................................._
    ,-'                                            `-.
   /                                                  `.
  |                        Network                      |
   `.                                                  /
     `.__.................................. _,-'

                 Figure 1 Framework of Service Chain

   Some head end Service Chain Classifiers can be configured with (or
   have the ability to specify) the exact Service Chain Instance Path
   for a given service chain. Under this scenario, the exact Service
   Chain Instance Path can be expressed by:

   - Being encoded in every data packet;

   - Being signaled in-band via the data path from the head end
     Service Chain Classifier node to all the relevant nodes to
     install the appropriate flow steering policies (similar to MPLS
     traffic engineering signaling);

   - Being sent as out-of-band control messages to all the relevant
     nodes to install the appropriate flow steering policies (similar
     to GMPLS signaling); or

   - Being provisioned into each node by a centralized network
     controller (similar to SDN) or by a network management system.

   The benefit of encoding the exact path in every data packet is less
   contention when the Service Chain Instance Path changes. However,
   there are major drawbacks, such as:

   - extra packet header fields are needed to carry the exact
     instance path, which can increase the likelihood of packet
     fragmentation due to MTU size, and

   - extra encapsulation processing load at the head end Service
     Chain Classifier node.
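
   The MTU drawback listed above can be made concrete with a little
   arithmetic. The sketch below is illustrative only: the 8-byte base
   header, the 4-byte Service Chain ID, the 4-byte per-SFIC label, and
   the 1500-byte MTU are assumptions chosen for the example, and the
   function names are hypothetical. None of these values is defined by
   this draft.

```python
# Illustrative sizes only; none of these values is defined by this draft.
BASE_HEADER_BYTES = 8   # assumed fixed part of the SFC encapsulation
CHAIN_ID_BYTES = 4      # assumed size of a Service Chain ID field
LABEL_BYTES = 4         # assumed size of one per-SFIC label (MPLS-label sized)
DEFAULT_MTU = 1500      # assumed link MTU in bytes


def chain_id_overhead() -> int:
    """Bytes added per packet when only a Service Chain ID is carried."""
    return BASE_HEADER_BYTES + CHAIN_ID_BYTES


def explicit_path_overhead(num_sfics: int) -> int:
    """Bytes added per packet when one label per SFIC on the path is carried."""
    if num_sfics < 0:
        raise ValueError("num_sfics must be non-negative")
    return BASE_HEADER_BYTES + num_sfics * LABEL_BYTES


def needs_fragmentation(payload_bytes: int, overhead_bytes: int,
                        mtu: int = DEFAULT_MTU) -> bool:
    """True if the encapsulated packet would exceed the MTU."""
    return payload_bytes + overhead_bytes > mtu
```

   Under these assumed sizes, a 1480-byte packet fits within the MTU
   when it carries only a Service Chain ID, but exceeds it when it
   carries an explicit four-instance path.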

   Packet fragmentation and reassembly are very processor- and
   memory-intensive. Good practice is to avoid packet fragmentation
   and reassembly as much as possible. Carrying an exact instance path
   in every packet might be possible if service function instances can
   be represented by compact labels, similar to the MPLS label stack.

   When the in-band or out-of-band signaling methods are used, i.e.
   sending flow steering policies to relevant SFF nodes or network
   nodes, the packets associated with a specific flow can be
   classified with a simple identifier (or Service Chain ID). Packet
   size is smaller and processing at the SC Classifier can be simpler
   as well.

   The out-of-band method doesn't even require the head end Service
   Chain Classifier to be configured with, or to have the capability
   to specify, the exact Service Chain Instance Path. The out-of-band
   steering policies can be sent from an external entity, such as a
   centralized network controller or service chain orchestration
   system. Under this scenario, the head end Service Chain Classifier
   node doesn't need to be aware of any change to the instances on the
   chain.

   At times it might not be feasible for the head end Service Chain
   Classifier to be aware of the exact instances selected for a given
   Service Chain because they are managed by different administrative
   entities.

   If each Service Function has a large number of SFICs, it scales
   better if the Service Chain Classifier only identifies the service
   chain at the functional level, and there is another entity managing
   the detailed service instance path.

4.3. Virtualized Service Function Instances impact to Service Chain

   When a Service Chain Instance Path consists of virtualized service
   function instances, e.g.
   in an ETSI NFV environment, the likelihood of frequent changes to
   the Service Chain Instance Path might be higher due to:

   - A higher failure rate of virtualized service function instances,
     because most of them will not have built-in protection
     mechanisms.

   - The relative ease of replacing over-utilized instances with other
     instances, or of instantiating more instances to take over the
     work load.

5. Local Restoration of Service Function Instances

   When one SF Forwarder (SFF) node has multiple Service Function
   Instance Components (SFICs) of the same service function attached,
   the SFF can make a local decision on which instance is selected for
   a specific service chain.

   For example, in the diagram below, SF Forwarder (SFF) "A" has two
   instances of Service Function #7 (SF7-1 and SF7-2) and three
   instances of Service Function #2 (SF2-2, SF2-4, SF2-5).

                            +----+ +---+   +---+   +---+
                            | SF2| |SF2|   |SF2|   |SFx|
                            | -2 | |-4 |   |-5 |   |-1 |
                            +----+ +---+   +---+   +---+
                              |      |       |       |
                              +------+-------+-------+
                                     |
                     +----+  +---+   |     +---+ +---+
                     | SF7|  |SF7|   |     |SF5| |SF5|
                     | -1 |  |-2 |   |     |-2 | |-4 |
                     +----+  +---+   |     +---+ +---+
                       :      /     /              /
                       :     /     /        /-----/
                        \   /     /        /
   +--------------+    +----------+      +----+
-->| Chain        |----|   SFF    |------| SFF| ---->
   |classifier    |    |    A     |      |  C |
   +--------------+    +----------+      +----+

      Figure 2 Local Restoration among multiple service instances

   For a service chain that consists of "Service Function #7" followed
   by "Service Function #2", which is represented by SF7->SF2, the
   steering policy to SFF "A" could be:

      {SF7-1, SF7-2} -> {SF2-2, SF2-4, SF2-5}.

   The multiple components within each {} represent the equivalent
   function instances that SFF "A" can select locally.

   When one service function instance fails, SFF "A" can locally
   choose another instance without informing the SC Classifier node or
   other SFF or network nodes.
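
   The local decision described above can be sketched as follows. This
   is a minimal illustration, not a mechanism defined by this draft:
   the class name LocalSteeringPolicy, its methods, and the use of a
   flow hash modulo the number of live instances are all assumptions
   made for the example. The instance names are the SF7 and SF2
   instances attached to SFF "A" in Figure 2.

```python
from typing import List, Optional, Set


class LocalSteeringPolicy:
    """Hypothetical steering policy held by one SFF.

    The policy is an ordered list of groups; each group holds the
    locally attached SFICs that are equivalent for one service function.
    """

    def __init__(self, groups: List[List[str]]) -> None:
        self.groups = [list(group) for group in groups]
        self.failed: Set[str] = set()

    def mark_failed(self, sfic: str) -> None:
        """Record that a locally attached SFIC is no longer usable."""
        self.failed.add(sfic)

    def mark_restored(self, sfic: str) -> None:
        """Record that a previously failed SFIC is usable again."""
        self.failed.discard(sfic)

    def select_path(self, flow_hash: int) -> Optional[List[str]]:
        """Pick one live SFIC per group for the given flow.

        Returns None when some group has no live instance, i.e. when
        the failure cannot be repaired locally at this SFF.
        """
        path = []
        for group in self.groups:
            alive = [sfic for sfic in group if sfic not in self.failed]
            if not alive:
                return None
            # Assumed selection rule: spread flows by hash over live SFICs.
            path.append(alive[flow_hash % len(alive)])
        return path
```

   With the policy [["SF7-1", "SF7-2"], ["SF2-2", "SF2-4", "SF2-5"]],
   marking SF7-1 as failed makes select_path() fall back to SF7-2
   with no message to the SC Classifier. A return value of None means
   that no equivalent instance remains at this SFF.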

   Local protection and restoration is relatively simple and clean.
   ECMP can be used to balance traffic across all the available
   service function instances attached locally.

6. Global Restoration of Service Function Instances

   Sometimes changing the Service Chain Instance Path involves using
   service function instances at different SF Forwarder (SFF) nodes.

   For example, for a Chain #7 -> #2 -> #3 -> #5 in Figure 2:

   - Original instance path: #7 & #2 at SFF "A"; #3 & #5 at SFF "C".

   - New instance path: #7 at SFF "A" and #2 & #3 & #5 at SFF "C".

   This section examines possible ways to achieve the restoration when
   the change of instance path involves multiple nodes.

6.1. Encoding the Exact Instance Path in Data Packets

   If the detailed Service Chain Instance Path is encoded in data
   packets, the SC Classifier can be notified of the change and encode
   the new instance path in the data packets of the flow. This method
   won't cause any contention issue among the involved nodes.

   As mentioned in Section 4.2, encoding the exact instance path in
   every packet can cause packet fragmentation, which is very
   processing intensive. Therefore, it's not optimal to require every
   data packet to carry an exact instance path, especially when the
   Service Chain Instance Path changes infrequently, e.g. on a scale
   of minutes or hours.

6.2. In-Band Signaling of an Instance Path change

   A method similar to MPLS RSVP-TE [RSVP-TE] signaling can be
   considered for the head end node to signal a required service
   instance path, and then let the data packets traverse the
   established path.

   The drawback of this approach is that the head end node might
   receive packets belonging to the service chain before the instance
   path has been established. This is very similar to the issues
   encountered by MPLS Fast Reroute [FRR].
   MPLS FRR requires that packets be dropped if a restoration path is
   being dynamically signaled because there was no pre-established
   backup path.

6.3. Out-Of-Band Signaling of an Instance Path change

   If the out-of-band method is used, i.e. sending the updated flow
   steering policies to indicate the changes of the instance path,
   there could be issues of synchronization and race conditions. For
   example, if SFF "A" and SFF "C" get flow steering policies at
   slightly different times, some packets of the flow might miss some
   service functions on the chain.

6.4. Provisioning an Instance Path change

   In SDN or SDN-like environments, changes to the Instance Path can be
   provisioned or programmed into network nodes via a central
   controller or Network Management System (NMS). This simplifies the
   nodes, since they are not required to use a signaling protocol, but
   problems (such as loops or dropped packets) may be introduced if
   network nodes are not updated in the proper order or are not
   updated close enough in time to one another; the nodes should be
   updated on a time scale similar to that of a signaling protocol. In
   addition, the network may have a single point of failure if the
   controller or NMS is not itself redundant.

6.5. Hybrid Method

   For global restoration of service function instances, it is
   worthwhile to explore a hybrid mode: when a change involves service
   instances at different SFF nodes, the SC Classifier node is
   informed to encode the detailed instance path in data packets until
   all the involved SFF nodes complete the installation of the new
   steering policy for the flow.

7. Regional Restoration of Service Function Instances

   It might not always be feasible for the head end Service Chain
   Classifier to be aware of the exact instances selected for a given
   Service Chain, because they are managed by multiple administrative
   entities.
   In that case, regional restoration should be considered.

   Regional restoration can take an approach similar to global
   restoration: choosing a regional ingress node that can take over the
   responsibility of installing the new steering policies to the
   involved SFF nodes or network nodes.

   The regional ingress node should be:

   - on the data path of the flow of the given service chain;

   - in front of the relevant SFF nodes or network nodes that are
     impacted by the change of the Service Chain Instance Path;

   - capable of encoding the detailed Service Chain Instance Path in
     the data packets of the identified flow; and

   - capable of removing the detailed Service Chain Instance Path
     encoding from data packets after all the impacted SFF nodes and
     network nodes have completed the policy installation.

8. Conclusion and Recommendation

   TBD

9. Manageability Considerations

   TBD

10. Security Considerations

   TBD

11. IANA Considerations

   This document requires no IANA actions. RFC Editor: Please remove
   this section before publication.

12. References

12.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

12.2. Informative References

   [SFC-Problem] Quinn, P., et al., "Service Function Chaining Problem
             Statement", draft-ietf-sfc-problem-statement-02, work in
             progress, April 2014.

   [NFV-Terminology] ETSI NFV ISG, "Network Functions Virtualisation
             (NFV); Terminology for Main Concepts in NFV", ETSI GS NFV
             003 V1.1.1, Oct. 2013,
   http://www.etsi.org/deliver/etsi_gs/NFV/001_099/003/01.01.01_60/gs_NFV003v010101p.pdf

   [SFC-Reduction] Parker, R., "Service Function Chaining: Chain to
             Path Reduction", draft-parker-sfc-chain-to-path-00, work
             in progress, Nov. 2013.

   [RSVP-TE] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
             and G.
             Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels",
             RFC 3209, December 2001.

   [FRR]     Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
             Extensions to RSVP-TE for LSP Tunnels", RFC 4090, May
             2005.

13. Acknowledgments

   Many thanks to Ron Bonica for the discussion in formulating the
   content for the draft.

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Linda Dunbar
   Huawei Technologies
   5340 Legacy Drive, Suite 175
   Plano, TX 75024, USA
   Phone: (469) 277 5840
   Email: ldunbar@huawei.com

   USA
   Email: rbonica@juniper.net

   Andrew G. Malis
   Huawei Technologies
   USA
   Email: agmalis@gmail.com