TAPS Working Group                                             P. Tiesel
Internet-Draft                                               T. Enghardt
Intended status: Informational                                 TU Berlin
Expires: April 29, 2018                                 October 26, 2017

  Communication Units Granularity Considerations for Multi-Path Aware
                          Transport Selection
                   draft-tiesel-taps-communitgrany-01

Abstract

   This document provides guidelines on how to reason about the
   composition of multi-path aware systems and how to compose the
   functionality needed by stacking existing protocols.  It discusses
   fundamental mechanisms that are used in multi-path systems and the
   consequences of applying them to different granularities of
   communication units.
   This document is intended as a basis for considerations on
   automating destination selection, path selection, and transport
   protocol selection.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 29, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Communication Units vs. Layering
   2.  Abstract Hierarchy of Communication Units
     2.1.  Message
     2.2.  Stream
     2.3.  Association, Flow
     2.4.
Association Set, Flow Set (Flow-Group)
   3.  Mechanisms Used in Multi-Path Systems
     3.1.  Destination Selection
     3.2.  Path Selection
     3.3.  Chunking
     3.4.  Scheduling
   4.  Cost of Transport Option Selection
   5.  Involvement of On-Path Elements
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  Informative References
   Appendix A.  Changes
     A.1.  Since -00
   Authors' Addresses

1.  Introduction

   Today's Internet architecture confronts a communication endpoint with
   a set of choices, including choosing a transport protocol and picking
   an IP protocol version.  In many cases, e.g., when fetching data from
   a CDN, an endpoint also has the choice of which endpoint instance to
   contact, as DNS can return multiple alternative addresses
   ([I-D.pauly-taps-guidelines] calls these instances "Derived
   Endpoints").

   If endpoints want to take advantage of multiple available paths,
   there is a further set of partially interdependent choices:

   o  Which path(s) between the endpoints could be used?

   o  Which path(s) between the endpoints should be used?

   o  Should the paths be used in an active/active way or only as
      active/fallback?

   o  Which protocols or sets of protocols should be used?

   o  Which role will other on-path elements, e.g., middle-boxes, take
      in servicing this flow?

   Having each application implement a heuristic or strategy for
   choosing from this overwhelming set of transport options puts a huge
   burden on the application developer.  Thus, the decisions regarding
   all transport options mentioned so far should be supported and, if
   requested by the application, automated within the transport layer.
   In order to build such automation, we need to be able to compare the
   product of all available transport options (destinations, paths,
   transport protocols, and protocol options) to choose the most
   appropriate one.

   As the protocols to be used are not known a priori and can differ
   depending on other transport options, this reasoning has to be
   independent of a specific protocol or implementation and has to allow
   comparing them even if they operate on different communication unit
   granularities.

1.1.  Communication Units vs. Layering

   When reasoning about network systems, layering traditionally has been
   the main guidance on where functionality is placed.  Looking at
   modern systems, the classical concept of layers and their mapping to
   protocols becomes blurry.  Protocols can operate on different
   granularities of communication units, i.e., the semantic units such
   as messages that the protocols distinguish.  These communication
   units often do not match the PDUs used by the protocols, e.g., TCP
   segments do not necessarily align with messages at the application
   layer.

   In this document, we do not take a protocol-centric perspective, but
   focus on the mechanisms a multi-access system is composed of and the
   communication units they operate on.  This has several advantages:

   o  We can more easily abstract from the protocols used and look at
      the composition itself.

   o  By identifying on which kind of communication unit these
      mechanisms can operate, we can reason about the overall design
      space.

   o  If the same mechanism appears multiple times within the same
      system composition, we can reason about possibly conflicting
      optimizations.

   Overall, this perspective allows us to compare mechanisms such as
   distributing the requests of an application among different paths,
   MPTCP, and bandwidth aggregation proxies (as discussed within the
   IETF in the BANANA working group), despite their different nature and
   layer of implementation.

2.  Abstract Hierarchy of Communication Units

   These communication unit definitions are primarily used for reasoning
   about automatic stack composition.  Therefore, depending on the
   protocol stack instance, a communication unit can span multiple
   protocol instances.

   Some of these hierarchy levels correspond to objects in
   [I-D.gjessing-taps-minset], but in the case of Association and
   Association Set, we have to split categories as they may indeed be
   separate on the transport.  Note the naming confusion concerning the
   term "flow", which derives from the different perspectives.

   We also annotate the corresponding terminology used in
   [I-D.trammell-taps-post-sockets] where applicable.

2.1.  Message

   A Message is a piece of data that has a meaning for the application.
   It is the smallest communication unit that we consider.

   [I-D.gjessing-taps-minset] correspondent: Message

   [I-D.trammell-taps-post-sockets] correspondent: Message

   Examples:

   o  An HTTP-Request/Response-Header/Body for HTTP/2

   o  An XML message in XMPP

2.2.  Stream

   A Stream is an ordered sequence of related Messages that should be
   treated the same by the transport system.

   [I-D.gjessing-taps-minset] correspondent: Flow

   [I-D.trammell-taps-post-sockets] correspondent: Stream

   Examples:

   o  A Stream in QUIC or SCTP

   o  A TCP connection used as transport for XMPP

2.3.
Association, Flow

   An Association multiplexes a set of Messages or Streams within the
   same Flow with common source and destination.  Therefore, these
   communication units become indistinguishable to the network.
   Association and Flow describe the same concept, the former from the
   perspective of the application, the latter from the perspective of
   the network.

   [I-D.gjessing-taps-minset] correspondent: Flow-Group

   [I-D.trammell-taps-post-sockets] correspondent: Association

   Examples:

   o  A TCP connection carrying HTTP/2 frames

   o  A set of IP packets that carry TCP or UDP segments and share the
      same 5-tuple of src-address, dst-address, protocol, src-port,
      dst-port.

2.4.  Association Set, Flow Set (Flow-Group)

   An Association Set or Flow Set is a set of Associations or Flows that
   belong together from an application point of view.

   [I-D.gjessing-taps-minset] correspondent: Flow-Group

   [I-D.trammell-taps-post-sockets] correspondent: Association

   Examples:

   o  Two flows, one carrying RTP payloads and one used for RTCP control
      messages.

3.  Mechanisms Used in Multi-Path Systems

3.1.  Destination Selection

   Destination Selection refers to selecting one of multiple different
   destinations.  This mechanism is applicable to any kind of
   communication unit and can occur on all layers.

   Typical cases for destination selection include:

   o  Choosing one address of a multi-homed server for an upcoming
      communication.

   o  Choosing a server among a list of servers returned by DNS, e.g.,
      for servers that host the same content as part of a CDN.

   o  Choosing a backend server within a load balancer.

   In practice, destination address selection is often tied to name
   resolution.
   As name resolution relies on both local decisions on the endpoint and
   decisions within the DNS infrastructure, this mechanism spans
   different administrative domains, each of which independently
   contributes to the overall selection result.

3.2.  Path Selection

   Path Selection refers to choosing which of the available paths to use
   and can occur on the network layer and any layer below.

   o  Within an end-host, path selection is usually realized by choosing
      the source IP address and thus choosing one of the local network
      interfaces for the communication to the remote endpoint.

   o  Within a path layer traffic system like an MPTCP-Proxy or a
      BANANA-Box, path selection is usually realized by choosing the
      outer source and destination address.

   o  In the case of an ECMP router, path selection is usually done
      based on a 3- or 5-tuple and just determines the interface to the
      next hop.

   o  Within MPTCP, each TCP segment has to be assigned to one or more
      subflows for transmission to the receiver.

   While path selection involves a choice of access network, it does not
   need knowledge of or changes to the routing choices within the core
   network.

   When doing path selection on small communication units like TCP
   segments, it is not uncommon to split path selection into two
   subproblems: _Candidate Path Selection_ determines feasible and
   preferred choices, e.g., in the case of MPTCP by establishing
   subflows.  Afterwards, _Per-Chunk Path Selection_ selects among these
   alternatives for each chunk.  Thus, the former can be more expensive,
   while the latter should be cheap to execute.

   TODO: Discuss the difference between Multiple Provisioning Domains
   [RFC7556] and multiple access networks within the same provisioning
   domain - especially when it comes to integrating 3GPP mechanisms like
   [RFC5555] or [RFC7864].

3.3.
Chunking

   Chunking refers to splitting a message, a stream, or a set of
   associations into one or more parts.  Typically, chunking splits only
   large messages or streams into multiple ones while keeping smaller
   entities untouched.  Associations or Flows are typically not split,
   but sets of Associations or Flows might be partitioned.  Once split
   into chunks, each chunk can be transferred individually over
   different transfer options.

   Chunking can and does occur at different layers within a system:

   o  A Web site consists of multiple objects or files.  Thus, the files
      can be seen as the natural chunks of a Web site.

   o  TCP takes a byte stream as input and chunks it into segments.  TCP
      chunking (segmentation) occurs at arbitrary byte ranges, thus it
      will most likely not align with the boundaries of Messages that
      were multiplexed within an application layer Association on top of
      a TCP connection.

   In practice, chunking is often constrained in order to maintain
   certain properties that are desirable for the overall system.
   Examples of such restrictions include the following:

   o  Segmentation in TCP restricts the chunk size, i.e., the TCP
      segment size, to the IP MTU or IP Path MTU to avoid fragmentation
      at the IP layer.

   o  Equal-cost multipath routing distributes Flows rather than
      individual packets to avoid reordering.

3.4.  Scheduling

   Scheduling refers to distributing chunks or sets of chunks across
   multiple pre-chosen paths.  Thus, depending on the objectives, it can
   make sense to see scheduling as nothing other than per-chunk path
   selection as defined above.  In other cases, e.g., when trying to
   balance traffic, it makes sense to look at scheduling as a concept of
   its own that uses chunking and per-chunk path selection as
   sub-mechanisms.

   Examples of scheduling strategies include:

   o  Schedule all chunks on one path as long as this path is available,
      otherwise fall back to another.

   o  Distribute chunks based on path capacity.

4.  Cost of Transport Option Selection

   Transport option selection mechanisms are often intertwined.  Which
   mechanism is used by which layer or which network component depends
   on the transfer objectives as well as the state of the network, e.g.,
   availability, path throughput, path RTT, server load.

   The cost and complexity of transport option selection depend on the
   network state used and the number of transfer options.  If the
   transfer option selection only uses local state, e.g., link
   availability, and the mechanism is predetermined and/or uses simple
   mechanisms, e.g., a simple hash function, the cost can even be
   negligible.  An example where transfer option selection is cheap is
   ECMP within a router.  In other cases, the cost can be non-trivial,
   e.g., when the selection involves queries to remote entities or even
   active network performance measurements.  Such examples include DNS
   or DHT lookups, as used by some file sharing protocols, or network
   measurements like RTT and bandwidth estimations used by many video
   streaming applications.  Indeed, costs may be prohibitive, e.g., when
   requiring multiple DNS lookups for every 1-second chunk of a
   20-minute video.

5.  Involvement of On-Path Elements

   It may become necessary to take into account path layer components
   (middle-boxes) that interfere with the transport layer.

   While the classical "End-To-End Arguments in System Design"
   [End-To-End] advocates for a dumb network and for placing
   functionality as close to the edge and as far up in the stack as
   possible, there are always tussles about moving functionality up or
   down the stack.  This document does not argue against pushing some
   multi-path functionality down the stack, but advocates maintaining
   control of the overall system composition at the end host.
   Functionality provided by a path can indeed be a reason to choose
   this path for a given communication unit.

   Some flow off-loading mechanisms come in the form of logical
   interfaces, e.g., [RFC7847].  These interfaces treat some association
   sets differently, which can be considered on-path functionality.

6.  Security Considerations

   Security-related transport service requests must take priority over
   performance.  Therefore, transport options or stack compositions that
   do not provide the requested transport service should be ignored for
   transport option selection.

   Note: This discussion is not exhaustive - more considerations will be
   added in later versions of this draft.

7.  IANA Considerations

   None

8.  Acknowledgements

   This work has been supported by Leibniz Prize project funds of DFG -
   German Research Foundation: Gottfried Wilhelm Leibniz-Preis 2011 (FKZ
   FE 570/4-1).

9.  Informative References

   [End-To-End]
              Saltzer, J., Reed, D., and D. Clark, "End-to-end arguments
              in system design", ACM Transactions on Computer
              Systems Vol. 2, pp. 277-288, DOI 10.1145/357401.357402,
              November 1984.

   [I-D.gjessing-taps-minset]
              Gjessing, S. and M. Welzl, "A Minimal Set of Transport
              Services for TAPS Systems", draft-gjessing-taps-minset-05
              (work in progress), June 2017.

   [I-D.pauly-taps-guidelines]
              Pauly, T., "Guidelines for Racing During Connection
              Establishment", draft-pauly-taps-guidelines-01 (work in
              progress), October 2017.

   [I-D.trammell-taps-post-sockets]
              Trammell, B., Perkins, C., Pauly, T., Kuehlewind, M., and
              C. Wood, "Post Sockets, An Abstract Programming Interface
              for the Transport Layer", draft-trammell-taps-post-
              sockets-01 (work in progress), September 2017.

   [RFC5555]  Soliman, H., Ed., "Mobile IPv6 Support for Dual Stack
              Hosts and Routers", RFC 5555, DOI 10.17487/RFC5555, June
              2009.

   [RFC7556]  Anipko, D., Ed., "Multiple Provisioning Domain
              Architecture", RFC 7556, DOI 10.17487/RFC7556, June 2015.

   [RFC7847]  Melia, T., Ed. and S. Gundavelli, Ed., "Logical-Interface
              Support for IP Hosts with Multi-Access Support", RFC 7847,
              DOI 10.17487/RFC7847, May 2016.

   [RFC7864]  Bernardos, CJ., Ed., "Proxy Mobile IPv6 Extensions to
              Support Flow Mobility", RFC 7864, DOI 10.17487/RFC7864,
              May 2016.

Appendix A.  Changes

A.1.  Since -00

   o  Replaced granularity "Object" with "Message" to align with other
      TAPS documents.

   o  Removed empty section on protocol instance selection - this topic
      will go into a separate document later.

   o  Minor clarifications.

   o  Removed definition of normative terms not needed for this document

   o  Added acknowledgments and updated authors' affiliation
      (compliance).

Authors' Addresses

   Philipp S. Tiesel
   TU Berlin
   Marchstr. 23
   Berlin
   Germany

   Email: philipp@inet.tu-berlin.de

   Theresa Enghardt
   TU Berlin
   Marchstr. 23
   Berlin
   Germany

   Email: theresa@inet.tu-berlin.de