idnits 2.17.1

draft-liu-dyncast-reqs-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references
     ([I-D.liu-dyncast-ps-usecases]), which it shouldn't.  Please replace
     those with straight textual mentions of the documents in question.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 140: '... o MUST provide a discovery and mapp...'
     RFC 2119 keyword, line 182: '... o MUST maintain "instance affinity"...'
     RFC 2119 keyword, line 183: '... packets from the same flow MUST go to...'
     RFC 2119 keyword, line 208: '... o MUST avoid keeping fine runtime-s...'
     RFC 2119 keyword, line 211: '... o MUST provide mechanism to free cl...'
     (14 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (7 March 2022) is 780 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-04) exists of
     draft-liu-dyncast-ps-usecases-02

     Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

rtgwg                                                             P. Liu
Internet-Draft                                                  T. Jiang
Intended status: Informational                              China Mobile
Expires: 8 September 2022                                     P. Eardley
                                                         British Telecom
                                                              D. Trossen
                                                                   C. Li
                                                     Huawei Technologies
                                                            7 March 2022


                 Dynamic-Anycast (Dyncast) Requirements
                       draft-liu-dyncast-reqs-02

Abstract

   This draft provides requirements for an architecture addressing the
   problems outlined in the use case and problem statement draft for
   Dyncast [I-D.liu-dyncast-ps-usecases].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 8 September 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Definition of Terms
   3.  Desirable System Characteristics and Requirements
     3.1.  Anycast-based Service Addressing Methodology
     3.2.  Instance Affinity
     3.3.  Proper Runtime-state Granularity and Keeping
     3.4.  Encoding Metrics
     3.5.  Signaling Metrics
     3.6.  Using Metrics in Routing Decisions
     3.7.  Supporting Service Dynamism
   4.  Conclusion
   5.  Security Considerations
   6.  IANA Considerations
   7.  Contributors
   8.  Informative References
   Acknowledgements
   Authors' Addresses

1.  Introduction

   Edge computing use cases, as shown in [I-D.liu-dyncast-ps-usecases],
   realize a service through computing service instances instantiated
   at multiple geographically distributed edge sites.  Optimally
   delivering each service request to the most appropriate service
   instance is the fundamental requirement in such deployments.  As
   shown in [I-D.liu-dyncast-ps-usecases], choosing the most appropriate
   service instance should take both the available computing resources
   and the network path quality into consideration.  "Optimal" here
   additionally means that the architecture and overall mechanism
   should be efficient and support high dynamism while maintaining
   instance affinity, as shown in [I-D.liu-dyncast-ps-usecases].

   This draft provides the requirements for realizing a potential
   dynamic anycast architecture that alleviates the problems of existing
   solutions outlined in [I-D.liu-dyncast-ps-usecases].

2.  Definition of Terms

   Service: A monolithic functionality that is provided by an endpoint
   according to the specification for said service.  A composite
   service can be built by orchestrating monolithic services.

   Service instance: Running environment (e.g., a node) that makes the
   functionality of a service available.  One service can have several
   instances running at different network locations.

   Service identifier: Used to uniquely identify a service, at the same
   time identifying the whole set of service instances that each
   represent the same service behaviour, no matter where those service
   instances are running.

   Anycast: An addressing and packet sending methodology that assigns
   an "anycast" identifier to one or more service instances, to which
   requests addressed to that identifier can be routed, following the
   definition in [RFC4786] of anycast as "the practice of making a
   particular Service Address available in multiple, discrete,
   autonomous locations, such that datagrams sent are routed to one of
   several available locations".

   Dyncast: Dynamic Anycast, taking the dynamic nature of computing
   resource metrics into account to steer an anycast-like decision in
   sending an incoming service request.

3.  Desirable System Characteristics and Requirements

   In the following, we outline the desirable characteristics of a
   system to overcome the observed problems in
   [I-D.liu-dyncast-ps-usecases] for the realization of the use cases
   described in that document.

3.1.  Anycast-based Service Addressing Methodology

   A unique service identifier is used by all the service instances of
   a specific service, no matter which edge they attach to.
   An anycast-like addressing and routing methodology among multiple
   edges makes sure that a data packet can potentially reach any of the
   edges with the service instance attached.  At the same time, each
   service instance has its own unicast address to be used by the
   attaching edge to access the service.  Since a client will use the
   service identifier as the destination address, mapping of the
   service identifier to the unicast address will need to happen
   in-band, considering the metrics for selection to make this
   selection service-specific.  From an addressing perspective, a
   desirable system

   o  MUST provide a discovery and mapping methodology for the in-band
      mapping of the service identifier (an anycast address) to a
      specific unicast address.

3.2.  Instance Affinity

   A routing relation between a client and a service exists not at the
   packet level but at the service request level, in the sense that one
   or more service requests, each possibly consisting of one or more
   routing-level packets, must be delivered to said service.  Each
   service may be provided by one or more service instances, each
   providing equivalent service functionality to their respective
   clients, while those service instances may be deployed at different
   locations in the network.  With that, the routing problem becomes one
   between the client and a selected service instance for at least the
   duration of the service-level request, but possibly more than just
   one request.

   This relationship between the client and the chosen service instance
   is described as "instance affinity" in the following, where the
   "affinity" spans across the aforementioned one or more service
   requests.
   This impacts the routing decision: normal packet-level
   communication, in which each packet is forwarded individually based
   on the forwarding table at the time, needs to be extended with the
   notion of instance affinity.  Otherwise, individual packets may be
   sent to different places when the network status changes, possibly
   segmenting individual requests and breaking service-level semantics.

   The nature of this affinity is highly dependent on the nature of the
   specific service.  The minimal affinity of a single request
   represents a stateless service, where each service request may be
   responded to without any state being held at the service instance for
   fulfilling the request.  Providing any necessary information/state
   in-band as part of the service request, e.g., in the form of a
   multi-form body in an HTTP request or through the URL provided as
   part of the request, is one way to achieve such stateless nature.
   Alternatively, the affinity to a particular service instance may span
   more than one request, as in our VR example in
   [I-D.liu-dyncast-ps-usecases], where previous client input is needed
   to render subsequent frames.  Therefore, a desirable system

   o  MUST maintain "instance affinity" which MAY span one or more
      service requests, i.e., all the packets from the same flow MUST go
      to the same service instance.

3.3.  Proper Runtime-state Granularity and Keeping

   Instance affinity, as outlined in Section 3.2, requires a client and
   the chosen service instance to keep a persistent relationship across
   one or more service requests.  For a multi-request session, this
   means that the mapping logic has to consistently pick the same
   service instance.  This type of affinity can normally be achieved by
   deploying a mapping device that keeps all the necessary state.
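   As a rough illustration of such a mapping device, the following
   Python sketch pins each flow to the instance chosen for its first
   request.  The class name, the flow key layout (client address,
   service identifier, flow id) and the instance names are assumptions
   made for this example only; the draft does not define them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical flow key: (client address, service identifier, flow id).
FlowKey = Tuple[str, str, int]


@dataclass
class AffinityTable:
    """Hypothetical mapping device that pins each flow to one instance."""

    # Selection logic used only when a flow has no binding yet.
    select: Callable[[str, List[str]], str]
    bindings: Dict[FlowKey, str] = field(default_factory=dict)

    def lookup(self, key: FlowKey, candidates: List[str]) -> str:
        """Return the bound instance, selecting one on first use."""
        bound = self.bindings.get(key)
        if bound is not None and bound in candidates:
            return bound  # instance affinity: reuse the earlier choice
        chosen = self.select(key[1], candidates)
        self.bindings[key] = chosen
        return chosen

    def release(self, key: FlowKey) -> None:
        """Drop the binding once the affinity is no longer needed."""
        self.bindings.pop(key, None)


table = AffinityTable(select=lambda service_id, candidates: candidates[0])
key = ("192.0.2.10", "svc-vr", 1)
first = table.lookup(key, ["edge-a", "edge-b"])
# Even if the selection logic would now prefer another instance,
# the existing binding keeps the flow where it started.
table.select = lambda service_id, candidates: candidates[-1]
second = table.lookup(key, ["edge-a", "edge-b"])
```

   Note that the table in this sketch holds one entry per flow key, so
   its size grows with the number of clients, applications and flows.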
   However, a client, e.g., a mobile UE, generally has many
   applications running.  If all, or the majority, of the applications
   request dyncast-like services, then the runtime states that need to
   be created and maintained would require high granularity.  In the
   extreme scenario, this granularity could reach the level of per-UE
   per-APP per-(sub)flow with regard to a service instance.

   Evidently, these fine-granular runtime states can become a heavy
   burden for network devices if they have to dynamically create and
   maintain them.  On the other hand, it is not appropriate either to
   place the state-keeping task on clients themselves.  Therefore, a
   desirable system

   o  MUST avoid keeping fine-grained runtime state in network nodes in
      order to achieve instance affinity.

   o  MUST provide a mechanism that frees clients from maintaining
      granular runtime state in order to achieve instance affinity.

3.4.  Encoding Metrics

   As outlined in the scenarios in [I-D.liu-dyncast-ps-usecases],
   metrics can have many different semantics, particularly if considered
   to be service-specific.  Even the notion of a "computing load"
   metric may be computed in many different ways.  What is crucial,
   however, is the representation and encoding of that metric when being
   conveyed to the routing fabric in order for the routing elements to
   act upon those metrics.  Such representation may entail information
   on the semantics of the metric or it may be purely one or more
   semantic-free numerals.  Agreement on the chosen representation among
   all service and network elements participating in the
   service-specific routing decision is important.  Specifically, a
   desirable system

   o  MUST agree on the service-specific metrics and their
      representation between service elements in the participating edges
      in the network and network elements acting upon them.

   o  MAY obfuscate the specific semantic of the metric to preserve
      privacy of the service provider information towards the network
      provider.

   o  MAY include routing protocol metrics.

3.5.  Signaling Metrics

   The aforementioned representation of metrics needs to be conveyed to
   the network elements that will act upon them.  Depending on the
   service-specific decision logic, one or more metrics will need to be
   conveyed.  Problems to be addressed here include loop avoidance for
   metric advertisements as well as the frequency of such conveyance,
   and therefore the load that the signaling may add to the overall
   network traffic.  While existing routing protocols may serve as a
   baseline for signaling metrics, other means to convey the metrics can
   equally be realized.  Specifically, a desirable system

   o  MUST provide mechanisms to signal the metrics for use in routing
      decisions.

   o  MUST realize means for rate control for signaling of metrics.

   o  MUST implement mechanisms for loop avoidance in signaling metrics,
      when necessary.

3.6.  Using Metrics in Routing Decisions

   Metrics conveyed as outlined in Section 3.5, in the agreed
   representation outlined in Section 3.4, will ultimately need suitable
   action in the routers of the network.  Routing decisions can be
   manifold, possibly including (i) min or max over all metrics, (ii)
   extending the previous action with a random or first choice when more
   than one min/max entry is found, and (iii) weighted round robin over
   all entries, among others.  For the service-specific routing decision
   to work properly, both the network and the service provider need to
   understand which action (out of a possible set of supported actions)
   is to be used for a particular set of metrics.
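   The three example actions listed above can be sketched in Python as
   follows.  This is an illustration under stated assumptions, not a
   prescribed behaviour: the function names, the instance names, the
   "lower is better" reading of the metric, and the use of the smooth
   variant of weighted round robin are all choices made for this
   example and are not taken from this draft.

```python
import random
from typing import Dict, List, Optional


def select_min(metrics: Dict[str, float], tie_break: str = "first",
               rng: Optional[random.Random] = None) -> str:
    """Actions (i) and (ii): lowest metric, then first or random tie-break."""
    best = min(metrics.values())
    tied = [name for name, value in metrics.items() if value == best]
    if tie_break == "random" and len(tied) > 1:
        return (rng or random).choice(tied)
    return tied[0]  # "first" follows the insertion order of the dict


def weighted_round_robin(weights: Dict[str, int], rounds: int) -> List[str]:
    """Action (iii): smooth weighted round robin over all entries."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    schedule = []
    for _ in range(rounds):
        # Every entry earns its weight, the largest credit is served,
        # and the served entry pays back the total weight.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        schedule.append(chosen)
    return schedule


# Hypothetical per-instance metrics (lower is better) and weights.
metrics = {"edge-a": 0.4, "edge-b": 0.2, "edge-c": 0.2}
by_first = select_min(metrics)
by_random = select_min(metrics, tie_break="random", rng=random.Random(0))
schedule = weighted_round_robin({"edge-a": 3, "edge-b": 1}, rounds=4)
```

   Which of these functions a router applies to a given service would
   be the "action" that network and service provider have to agree on.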

   Further, different network nodes, e.g., routers and switches, have
   diverse capabilities even within the same routing domain, let alone
   across different administrative domains.  Service-specific metrics
   that have been adopted by some nodes might therefore not be supported
   by others, for technical, administrative, or other reasons.  A node
   supporting service-specific metrics might also prefer some types of
   metrics to others [TR22.874] or, in another scenario, not utilize any
   at all.  Therefore, there must be flexibility in terms of metrics
   handling and routing decisions in a network.  Specifically, a
   desirable system

   o  MUST specify a default action to be taken, if more than one action
      is possible.

   o  MUST allow a network node not supporting service-specific metrics
      to interoperate with the supporting ones, i.e., providing backward
      compatibility.

   o  SHOULD allow service-specific metrics to be prioritized over the
      currently widely used networking metrics, such as bandwidth,
      delay, and loss.

   o  SHOULD enable other alternative actions to be taken.

      (1)  Any solution MUST provide appropriate signaling of the
           desired action to the router.  For this, the action MAY be
           signaled in combination with signaling the metric (see
           Section 3.5).

      (2)  Any solution SHOULD allow associating the desired action
           with a specific service identifier.

3.7.  Supporting Service Dynamism

   Network cost in the current routing system usually does not change
   very frequently.  However, computing load and service-specific
   metrics in general can be highly dynamic, e.g., changing rapidly with
   the number of sessions, CPU/GPU utilization and memory space.  It has
   to be determined at what interval or upon which events such
   information needs to be distributed among edges.
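   One possible way to combine an interval with an event trigger is
   sketched below in Python: a metric is advertised only when it has
   changed by at least a threshold and a minimum time has passed since
   the last advertisement.  The class name, both parameters and the
   example values are assumptions made for this illustration; the draft
   does not specify such a policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricAdvertiser:
    """Hypothetical edge-side policy deciding when to advertise a metric."""

    min_interval: float      # seconds that must pass between advertisements
    change_threshold: float  # minimum absolute change worth advertising
    last_value: Optional[float] = None
    last_sent_at: Optional[float] = None

    def should_advertise(self, value: float, now: float) -> bool:
        if self.last_value is None or self.last_sent_at is None:
            return True  # nothing has been advertised yet
        if now - self.last_sent_at < self.min_interval:
            return False  # rate control: too soon after the last update
        return abs(value - self.last_value) >= self.change_threshold

    def advertise(self, value: float, now: float) -> bool:
        """Record the advertisement if the policy allows it."""
        if not self.should_advertise(value, now):
            return False
        self.last_value = value
        self.last_sent_at = now
        return True


advertiser = MetricAdvertiser(min_interval=1.0, change_threshold=0.1)
sent = [
    advertiser.advertise(0.50, now=0.0),  # first sample
    advertiser.advertise(0.90, now=0.5),  # large change, but too soon
    advertiser.advertise(0.55, now=2.0),  # late enough, but change too small
    advertiser.advertise(0.90, now=2.0),  # late enough and large change
]
```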
   More frequent distribution gives more accurate synchronization but
   may result in more signaling overhead.

   Choosing the least path cost is the most common rule in routing.
   However, this logic does not work well when routing should be aware
   of service-specific metrics.  Choosing the least computing load may
   result in oscillation: the least loaded edge can quickly be flooded
   by a large number of new computing demands and soon become
   overloaded, with tidal effects possibly following.

   Generally, a single instance may have very dynamic resource
   availability over time in order to serve service requests.  This
   availability may be affected by computing resource capability and
   load, network path quality, and others.  The balancing mechanisms
   should adapt to the service dynamism quickly and seamlessly.  With
   this, the relationship between a single client and the set of
   possible service instances may be very dynamic in that one request
   that is dispatched to instance A may be followed by a request that is
   dispatched to instance B and so on, generally within the notion of
   the service-specific instance affinity discussed before in
   Section 3.2.  With this in mind, a desirable system

   o  MUST support metrics that change dynamically on, e.g., a per-flow
      basis, without violating the metrics defined for the selection of
      the specific service instance, while taking into account the
      requirements for the signaling of metrics and for routing
      decisions (see Sections 3.5 and 3.6).

4.  Conclusion

   This document presents high-level requirements for solutions to
   Dyncast, where the architecture should address how to distribute the
   resource information and how to assure instance affinity in an
   anycast-based service addressing environment, while realizing
   appropriate routing actions to satisfy the metrics provided.

5.  Security Considerations

   TBD

6.  IANA Considerations

   No IANA action is required so far.

7.  Contributors

   The following people have substantially contributed to this document:

      Peter Willis
      BT

8.  Informative References

   [RFC4786]  Abley, J. and K. Lindqvist, "Operation of Anycast
              Services", BCP 126, RFC 4786, DOI 10.17487/RFC4786,
              December 2006.

   [I-D.liu-dyncast-ps-usecases]
              Liu, P., Willis, P., Trossen, D., and C. Li,
              "Dynamic-Anycast (Dyncast) Use Cases & Problem Statement",
              Work in Progress, Internet-Draft,
              draft-liu-dyncast-ps-usecases-02, 17 January 2022.

   [TR22.874] 3GPP, "Study on traffic characteristics and performance
              requirements for AI/ML model transfer in 5GS (Release
              18)", 2020.

Acknowledgements

   The authors would like to thank Yizhou Li, Luigi IANNONE and Geng
   Liang for their valuable suggestions to this document.

Authors' Addresses

   Peng Liu
   China Mobile
   Email: liupengyjy@chinamobile.com

   Tianji Jiang
   China Mobile
   Email: jiangtianji@chinamobile.com

   Philip Eardley
   British Telecom
   Email: philip.eardley@bt.com

   Dirk Trossen
   Huawei Technologies
   Email: dirk.trossen@huawei.com

   Cheng Li
   Huawei Technologies
   Email: c.l@huawei.com