TAPS                                                            M. Welzl
Internet-Draft                                               S. Gjessing
Intended status: Informational                        University of Oslo
Expires: March 9, 2019                                September 5, 2018

          A Minimal Set of Transport Services for End Systems
                       draft-ietf-taps-minset-08

Abstract

   This draft recommends a minimal set of Transport Services offered by
   end systems, and gives guidance on choosing among the available
   mechanisms and protocols. It is based on the set of transport
   features in RFC 8303.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 9, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Minimal Set of Transport Features
     3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION
     3.2.  MAINTENANCE
       3.2.1.  Connection groups
       3.2.2.  Individual connections
     3.3.  DATA Transfer
       3.3.1.  Sending Data
       3.3.2.  Receiving Data
   4.  Acknowledgements
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Deriving the minimal set
     A.1.  Step 1: Categorization -- The Superset of Transport Features
       A.1.1.  CONNECTION Related Transport Features
       A.1.2.  DATA Transfer Related Transport Features
     A.2.  Step 2: Reduction -- The Reduced Set of Transport Features
       A.2.1.  CONNECTION Related Transport Features
       A.2.2.  DATA Transfer Related Transport Features
     A.3.  Step 3: Discussion
       A.3.1.  Sending Messages, Receiving Bytes
       A.3.2.  Stream Schedulers Without Streams
       A.3.3.  Early Data Transmission
       A.3.4.  Sender Running Dry
       A.3.5.  Capacity Profile
       A.3.6.  Security
       A.3.7.  Packet Size
   Appendix B.  Revision information
   Authors' Addresses

1.  Introduction

   Currently, the set of transport services that most applications use
   is based on TCP and UDP (and protocols that are layered on top of
   them); this limits the ability for the network stack to make use of
   features of other transport protocols. For example, if a protocol
   supports out-of-order message delivery but applications always assume
   that the network provides an ordered bytestream, then the network
   stack can not immediately deliver a message that arrives
   out-of-order: doing so would break a fundamental assumption of the
   application. The net result is unnecessary head-of-line blocking
   delay.

   By exposing the transport services of multiple transport protocols, a
   transport system can make it possible for applications to use these
   services without being statically bound to a specific transport
   protocol. The first step towards the design of such a system was
   taken by [RFC8095], which surveys a large number of transports, and
   [RFC8303] as well as [RFC8304], which identify the specific transport
   features that are exposed to applications by the protocols TCP,
   MPTCP, UDP(-Lite) and SCTP as well as the LEDBAT congestion control
   mechanism. This memo is based on these documents and follows the
   same terminology (also listed below). Because the considered
   transport protocols conjointly cover a wide range of transport
   features, there is reason to hope that the resulting set (and the
   reasoning that led to it) will also apply to many aspects of other
   transport protocols that may be in use today, or may be designed in
   the future.

   By decoupling applications from transport protocols, a transport
   system provides a different abstraction level than the Berkeley
   sockets interface. As with high- vs.
   low-level programming languages, a higher abstraction level allows
   more freedom for automation below the interface, yet it takes some
   control away from the application programmer. This is the design
   trade-off that a transport system developer is facing, and this
   document provides guidance on the design of this abstraction level.
   Some transport features are currently rarely offered by APIs, yet
   they must be offered or they can never be used. Other transport
   features are offered by the APIs of the protocols covered here, but
   not exposing them in an API would allow for more freedom to automate
   protocol usage in a transport system. The minimal set presented in
   this document is an effort to find a middle ground that can be
   recommended for transport systems to implement, on the basis of the
   transport features discussed in [RFC8303].

   Applications use a wide variety of APIs today. The transport
   features in the minimal set in this document must be reflected in
   *all* network APIs in order for the underlying functionality to
   become usable everywhere. For example, it does not help an
   application that talks to a library which offers its own
   communication interface if the underlying Berkeley Sockets API is
   extended to offer "unordered message delivery", but the library only
   exposes an ordered bytestream. Both the Berkeley Sockets API and the
   library would have to expose the "unordered message delivery"
   transport feature (alternatively, there may be ways for certain types
   of libraries to use this transport feature without exposing it, based
   on knowledge about the applications -- but this is not the general
   case). In most situations, in the interest of being as flexible and
   efficient as possible, the best choice will be for a library to
   expose at least all of the transport features that are recommended as
   a "minimal set" here.

   This "minimal set" can be implemented "one-sided" over TCP. This
   means that a sender-side transport system can talk to a standard TCP
   receiver, and a receiver-side transport system can talk to a standard
   TCP sender. If certain limitations are put in place, the "minimal
   set" can also be implemented "one-sided" over UDP.

2.  Terminology

   Transport Feature: a specific end-to-end feature that the transport
      layer provides to an application. Examples include
      confidentiality, reliable delivery, ordered delivery,
      message-versus-stream orientation, etc.
   Transport Service: a set of Transport Features, without an
      association to any given framing protocol, which provides a
      complete service to an application.
   Transport Protocol: an implementation that provides one or more
      different transport services using a specific framing and header
      format on the wire.
   Transport Service Instance: an arrangement of transport protocols
      with a selected set of features and configuration parameters that
      implements a single transport service, e.g., a protocol stack (RTP
      over UDP).
   Application: an entity that uses the transport layer for end-to-end
      delivery of data across the network (this may also be an upper
      layer protocol or tunnel encapsulation).
   Application-specific knowledge: knowledge that only applications
      have.
   Endpoint: an entity that communicates with one or more other
      endpoints using a transport protocol.
   Connection: shared state of two or more endpoints that persists
      across messages that are transmitted between these endpoints.
   Connection Group: a set of connections which share the same
      configuration (configuring one of them causes all other
      connections in the same group to be configured in the same way).
      We call connections that belong to a connection group "grouped",
      while "ungrouped" connections are not a part of a connection
      group.
   Socket: the combination of a destination IP address and a
      destination port number.

   Moreover, throughout the document, the protocol name "UDP(-Lite)" is
   used when discussing transport features that are equivalent for UDP
   and UDP-Lite; similarly, the protocol name "TCP" refers to both TCP
   and MPTCP.

3.  The Minimal Set of Transport Features

   Based on the categorization, reduction, and discussion in Appendix A,
   this section describes a minimal set of transport features that end
   systems should offer. The described transport system can be
   implemented over TCP. Elements of the system that are not marked
   with "!UDP" can also be implemented over UDP.

   The arguments laid out in Appendix A.3 ("discussion") were used to
   make the final representation of the minimal set as short, simple and
   general as possible. There may be situations where these arguments
   do not apply -- e.g., implementers may have specific reasons to
   expose multi-streaming as a visible functionality to applications, or
   the restrictive open / close semantics may be problematic under some
   circumstances. In such cases, the representation in Appendix A.2
   ("reduction") should be considered.

   As in Appendix A, Appendix A.2 and [RFC8303], we categorize the
   minimal set of transport features as 1) CONNECTION related
   (ESTABLISHMENT, AVAILABILITY, MAINTENANCE, TERMINATION) and 2) DATA
   Transfer related (Sending Data, Receiving Data, Errors). Here, the
   focus is on connections that the transport system offers as an
   abstraction to the application, as opposed to connections of
   transport protocols that the transport system uses.

3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION

   A connection must first be "created" to allow for some initial
   configuration to be carried out before the transport system can
   actively or passively establish communication with a remote endpoint.
   All configuration parameters in Section 3.2 can be used initially,
   although some of them may only take effect when a connection has been
   established with a chosen transport protocol. Configuring a
   connection early helps a transport system make the right decisions.
   For example, grouping information can influence the transport system
   to implement a connection as a stream of a multi-streaming protocol's
   existing association or not.

   For ungrouped connections, early configuration is necessary because
   it allows the transport system to know which protocols it should try
   to use. In particular, a transport system that only makes a one-time
   choice for a particular protocol must know early about strict
   requirements that must be kept, or it can end up in a deadlock
   situation (e.g., having chosen UDP and later being asked to support
   reliable transfer). As an example description of how to correctly
   handle these cases, we provide the following decision tree (this is
   derived from Appendix A.2.1 excluding authentication, as explained in
   Section 6):

   - Will it ever be necessary to offer any of the following?
     * Reliably transfer data
     * Notify the peer of closing/aborting
     * Preserve data ordering

     Yes: SCTP or TCP can be used.
       - Is any of the following useful to the application?
         * Choosing a scheduler to operate between connections
           in a group, with the possibility to configure a priority
           or weight per connection
         * Configurable message reliability
         * Unordered message delivery
         * Request not to delay the acknowledgement (SACK) of a message

         Yes: SCTP is preferred.
         No:
           - Is any of the following useful to the application?
             * Hand over a message to reliably transfer (possibly
               multiple times) before connection establishment
             * Suggest timeout to the peer
             * Notification of Excessive Retransmissions (early
               warning below abortion threshold)
             * Notification of ICMP error message arrival

             Yes: TCP is preferred.
             No: SCTP and TCP are equally preferable.

     No: all protocols can be used.
       - Is any of the following useful to the application?
         * Specify checksum coverage used by the sender
         * Specify minimum checksum coverage required by receiver

         Yes: UDP-Lite is preferred.
         No: UDP is preferred.

   Note that this decision tree is not optimal for all cases. For
   example, if an application wants to use "Specify checksum coverage
   used by the sender", which is only offered by UDP-Lite, and
   "Configure priority or weight for a scheduler", which is only offered
   by SCTP, the above decision tree will always choose UDP-Lite, making
   it impossible to use SCTP's schedulers with priorities between
   grouped connections. We caution implementers to be aware of the full
   set of trade-offs, for which we recommend consulting the list in
   Appendix A.2.1 when deciding how to initialize a connection.

   To summarize, the following parameters serve as input for the
   transport system to help it choose and configure a suitable protocol:

   o  Reliability: a boolean that should be set to true when any of the
      following will be useful to the application: reliably transfer
      data; notify the peer of closing/aborting; preserve data ordering.
   o  Checksum coverage: a boolean to specify whether it will be useful
      to the application to specify checksum coverage when sending or
      receiving.
   o  Configure message priority: a boolean that should be set to true
      when any of the following per-message configuration or
      prioritization mechanisms will be useful to the application:
      choosing a scheduler to operate between grouped connections, with
      the possibility to configure a priority or weight per connection;
      configurable message reliability; unordered message delivery;
      requesting not to delay the acknowledgement (SACK) of a message.
   o  Early message timeout notifications: a boolean that should be set
      to true when any of the following will be useful to the
      application: hand over a message to reliably transfer (possibly
      multiple times) before connection establishment; suggest timeout
      to the peer; notification of excessive retransmissions (early
      warning below abortion threshold); notification of ICMP error
      message arrival.

   Once a connection is created, it can be queried for the maximum
   amount of data that an application can possibly expect to have
   reliably transmitted before or during transport connection
   establishment (with zero being a possible answer) (see
   Section 3.2.1). An application can also give the connection a
   message for reliable transmission before or during connection
   establishment (!UDP); the transport system will then try to transmit
   it as early as possible. An application can facilitate sending a
   message particularly early by marking it as "idempotent" (see
   Section 3.3.1); in this case, the receiving application must be
   prepared to potentially receive multiple copies of the message
   (because idempotent messages are reliably transferred, asking for
   idempotence is not necessary for systems that support UDP).

   After creation, a transport system can actively establish
   communication with a peer, or it can passively listen for incoming
   connection requests.
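
   The decision tree given earlier in this section, driven by the four
   boolean parameters summarized above, can be sketched in code as
   follows. This is only an illustration under stated assumptions: the
   names ConnectionHints, ProtocolChoice and choose_protocol are
   hypothetical and not defined by this document, and "all protocols"
   is taken to mean SCTP, TCP, UDP-Lite and UDP. As the text cautions,
   the tree is not optimal for all cases.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ConnectionHints:
    """Hypothetical container for the four boolean inputs of Section 3.1."""
    reliability: bool = False
    checksum_coverage: bool = False
    configure_message_priority: bool = False
    early_message_timeout_notifications: bool = False


@dataclass(frozen=True)
class ProtocolChoice:
    usable: Tuple[str, ...]    # protocols that can be used at all
    preferred: Optional[str]   # None means "equally preferable"


def choose_protocol(hints: ConnectionHints) -> ProtocolChoice:
    """Walk the decision tree of Section 3.1 (illustrative sketch only)."""
    if hints.reliability:
        # Reliable transfer, close/abort notification or ordering needed.
        usable = ("SCTP", "TCP")
        if hints.configure_message_priority:
            return ProtocolChoice(usable, "SCTP")
        if hints.early_message_timeout_notifications:
            return ProtocolChoice(usable, "TCP")
        return ProtocolChoice(usable, None)
    # No strict requirement: every protocol can be used.
    usable = ("SCTP", "TCP", "UDP-Lite", "UDP")
    if hints.checksum_coverage:
        return ProtocolChoice(usable, "UDP-Lite")
    return ProtocolChoice(usable, "UDP")


if __name__ == "__main__":
    # The non-optimal case from the text: checksum coverage plus scheduler
    # priorities, without reliability, still ends up preferring UDP-Lite.
    hints = ConnectionHints(checksum_coverage=True,
                            configure_message_priority=True)
    print(choose_protocol(hints).preferred)
```

   Returning both the usable set and a preference, rather than a single
   protocol, leaves room for a transport system to fall back to another
   usable protocol when the preferred one is not available.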
   Note that active establishment may or may not trigger a notification
   on the listening side. It is possible that the first notification on
   the listening side is the arrival of the first data that the active
   side sends (a receiver-side transport system could handle this by
   continuing to block a "Listen" call, immediately followed by issuing
   "Receive", for example; callback-based implementations could simply
   skip the equivalent of "Listen"). This also means that the active
   opening side is assumed to be the first side sending data.

   A transport system can actively close a connection, i.e. terminate it
   after reliably delivering all remaining data to the peer (if reliable
   data delivery was requested earlier (!UDP)), in which case the peer
   is notified that the connection is closed. Alternatively, a
   connection can be aborted without delivering outstanding data to the
   peer. In case reliable or partially reliable data delivery was
   requested earlier (!UDP), the peer is notified that the connection is
   aborted. A timeout can be configured to abort a connection when data
   could not be delivered for too long (!UDP); however, timeout-based
   abortion does not notify the peer application that the connection has
   been aborted. Because half-closed connections are not supported,
   when a host implementing a transport system receives a notification
   that the peer is closing or aborting the connection (!UDP), its peer
   may not be able to read outstanding data. This means that
   unacknowledged data residing in a transport system's send buffer may
   have to be dropped from that buffer upon arrival of a "close" or
   "abort" notification from the peer.

3.2.  MAINTENANCE

   A transport system must offer means to group connections, but it
   cannot guarantee truly grouping them using the transport protocols
   that it uses (e.g., it cannot be guaranteed that connections become
   multiplexed as streams on a single SCTP association when SCTP may not
   be available). The transport system must therefore ensure that
   group- versus non-group-configurations are handled correctly in some
   way (e.g., by applying the configuration to all grouped connections
   even when they are not multiplexed, or informing the application
   about grouping success or failure).

   As a general rule, any configuration described below should be
   carried out as early as possible to aid the transport system's
   decision making.

3.2.1.  Connection groups

   The following transport features and notifications (some directly
   from Appendix A.2, some new or changed, based on the discussion in
   Appendix A.3) automatically apply to all grouped connections:

   (!UDP) Configure a timeout: this can be done with the following
   parameters:

   o  A timeout value for aborting connections, in seconds
   o  A timeout value to be suggested to the peer (if possible), in
      seconds
   o  The number of retransmissions after which the application should
      be notified of "Excessive Retransmissions"

   Configure urgency: this can be done with the following parameters:

   o  A number to identify the type of scheduler that should be used to
      operate between connections in the group (no guarantees given).
      Schedulers are defined in [RFC8260].
   o  A "capacity profile" number to identify how an application wants
      to use its available capacity. Choices can be "lowest possible
      latency at the expense of overhead" (which would disable any
      Nagle-like algorithm), "scavenger", or values that help determine
      the DSCP value for a connection (e.g. similar to table 1 in
      [I-D.ietf-tsvwg-rtcweb-qos]).
   o  A buffer limit (in bytes); when the sender has less than the
      provided limit of bytes in the buffer, the application may be
      notified. Notifications are not guaranteed, and it is optional
      for a transport system to support buffer limit values greater than
      0. Note that this limit and its notification should operate
      across the buffers of the whole transport system, i.e. also any
      potential buffers that the transport system itself may use on top
      of the transport's send buffer.

   Following Appendix A.3.7, these properties can be queried:

   o  The maximum message size that may be sent without fragmentation
      via the configured interface. This is optional for a transport
      system to offer, and may return an error ("not available"). It
      can aid applications implementing Path MTU Discovery.
   o  The maximum transport message size that can be sent, in bytes.
      Irrespective of fragmentation, there is a size limit for the
      messages that can be handed over to SCTP or UDP(-Lite); because
      the service provided by a transport system is independent of the
      transport protocol, it must allow an application to query this
      value -- the maximum size of a message in an
      Application-Framed-Bytestream (see Appendix A.3.1). This may also
      return an error when data is not delimited ("not available").
   o  The maximum transport message size that can be received from the
      configured interface, in bytes (or "not available").
   o  The maximum amount of data that can possibly be sent before or
      during connection establishment, in bytes.

   In addition to the already mentioned closing / aborting notifications
   and possible send errors, the following notifications can occur:

   o  Excessive Retransmissions: the configured (or a default) number of
      retransmissions has been reached, yielding this early warning
      below an abortion threshold.
   o  ICMP Arrival (parameter: ICMP message): an ICMP packet carrying
      the conveyed ICMP message has arrived.
   o  ECN Arrival (parameter: ECN value): a packet carrying the conveyed
      ECN value has arrived. This can be useful for applications
      implementing congestion control.
   o  Timeout (parameter: s seconds): data could not be delivered for s
      seconds.
   o  Drain: the send buffer has either drained below the configured
      buffer limit or it has become completely empty. This is a generic
      notification that tries to enable uniform access to
      "TCP_NOTSENT_LOWAT" as well as the "SENDER DRY" notification (as
      discussed in Appendix A.3.4 -- SCTP's "SENDER DRY" is a special
      case where the threshold (for unsent data) is 0 and there is also
      no more unacknowledged data in the send buffer).

3.2.2.  Individual connections

   Configure priority or weight for a scheduler, as described in
   [RFC8260].

   Configure checksum usage: this can be done with the following
   parameters, but there is no guarantee that any checksum limitations
   will indeed be enforced (the default behavior is "full coverage,
   checksum enabled"):

   o  A boolean to enable / disable usage of a checksum when sending
   o  The desired coverage (in bytes) of the checksum used when sending
   o  A boolean to enable / disable requiring a checksum when receiving
   o  The required minimum coverage (in bytes) of the checksum when
      receiving

3.3.  DATA Transfer

3.3.1.  Sending Data

   When sending a message, no guarantees are given about the
   preservation of message boundaries to the peer; if message boundaries
   are needed, the receiving application at the peer must know about
   them beforehand (or the transport system cannot use TCP). Note that
   an application should already be able to hand over data before the
   transport system establishes a connection with a chosen transport
   protocol.
   Regarding the message that is being handed over, the following
   parameters can be used:

   o  Reliability: this parameter is used to convey a choice of: fully
      reliable with congestion control (!UDP), unreliable without
      congestion control, unreliable with congestion control (!UDP),
      partially reliable with congestion control (see [RFC3758] and
      [RFC7496] for details on how to specify partial reliability)
      (!UDP). The latter two choices are optional for a transport
      system to offer and may result in full reliability. Note that
      applications sending unreliable data without congestion control
      should themselves perform congestion control in accordance with
      [RFC2914].
   o  (!UDP) Ordered: this boolean parameter lets an application choose
      between ordered message delivery (true) and possibly unordered,
      potentially faster message delivery (false).
   o  Bundle: a boolean that expresses a preference for allowing
      messages to be bundled (true) or not (false). No guarantees are
      given.
   o  DelAck: a boolean that, if false, lets an application request that
      the peer not delay the acknowledgement for this message.
   o  Fragment: a boolean that expresses a preference for allowing
      messages to be fragmented (true) or not (false), at the IP level.
      No guarantees are given.
   o  (!UDP) Idempotent: a boolean that expresses whether a message is
      idempotent (true) or not (false). Idempotent messages may arrive
      multiple times at the receiver (but they will arrive at least
      once). When data is idempotent it can be used by the receiver
      immediately on a connection establishment attempt. Thus, if data
      is handed over before the transport system establishes a
      connection with a chosen transport protocol, stating that a
      message is idempotent facilitates transmitting it to the peer
      application particularly early.

   An application can be notified of a failure to send a specific
   message.
   There is no guarantee of such notifications, i.e. send failures can
   also silently occur.

3.3.2.  Receiving Data

   A receiving application obtains an "Application-Framed Bytestream"
   (AFra-Bytestream; this concept is further described in
   Appendix A.3.1). In line with TCP's receiver semantics, an
   AFra-Bytestream is just a stream of bytes to the receiver. If
   message boundaries were specified by the sender, a receiver-side
   transport system implementing only the minimum set of transport
   services defined here will still not inform the receiving application
   about them (this limitation is only needed for transport systems that
   are implemented to directly use TCP).

   Different from TCP's semantics, if the sending application has
   allowed that messages are not fully reliably transferred, or
   delivered out of order, then such re-ordering or unreliability may be
   reflected per message in the arriving data. Messages will always
   stay intact - i.e. if an incomplete message is contained at the end
   of the arriving data block, this message is guaranteed to continue in
   the next arriving data block.

4.  Acknowledgements

   The authors would like to thank all the participants of the TAPS
   Working Group and the NEAT and MAMI research projects for valuable
   input to this document. We especially thank Michael Tuexen for help
   with connection establishment/teardown, Gorry Fairhurst for his
   suggestions regarding fragmentation and packet sizes, and Spencer
   Dawkins for his extremely detailed and constructive review. This
   work has received funding from the European Union's Horizon 2020
   research and innovation programme under grant agreement No. 644334
   (NEAT).

5.  IANA Considerations

   This memo includes no request to IANA.

6.  Security Considerations

   Authentication, confidentiality protection, and integrity protection
   are identified as transport features by [RFC8095].
   As currently deployed in the Internet, these features are generally
   provided by a protocol or layer on top of the transport protocol; no
   current full-featured standards-track transport protocol provides all
   of these transport features on its own. Therefore, these transport
   features are not considered in this document, with the exception of
   native authentication capabilities of TCP and SCTP for which the
   security considerations in [RFC5925] and [RFC4895] apply. The
   minimum requirements for a secure transport system are discussed in a
   separate document (Section 5 on Security Features and Transport
   Dependencies of [I-D.ietf-taps-transport-security]).

7.  References

7.1.  Normative References

   [I-D.ietf-taps-transport-security]
              Pauly, T., Perkins, C., Rose, K., and C. Wood, "A Survey
              of Transport Security Protocols",
              draft-ietf-taps-transport-security-02 (work in progress),
              June 2018.

   [RFC8095]  Fairhurst, G., Ed., Trammell, B., Ed., and M. Kuehlewind,
              Ed., "Services Provided by IETF Transport Protocols and
              Congestion Control Mechanisms", RFC 8095,
              DOI 10.17487/RFC8095, March 2017.

   [RFC8303]  Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of
              Transport Features Provided by IETF Transport Protocols",
              RFC 8303, DOI 10.17487/RFC8303, February 2018.

7.2.  Informative References

   [COBS]     Cheshire, S. and M. Baker, "Consistent Overhead Byte
              Stuffing", IEEE/ACM Transactions on Networking Vol. 7, No.
              2, April 1999.

   [I-D.ietf-tsvwg-rtcweb-qos]
              Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP
              Packet Markings for WebRTC QoS",
              draft-ietf-tsvwg-rtcweb-qos-18 (work in progress), August
              2016.

   [LBE-draft]
              Bless, R., "A Lower Effort Per-Hop Behavior (LE PHB)",
              Internet-draft draft-tsvwg-le-phb-03, February 2018.
597 [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, 598 RFC 2914, DOI 10.17487/RFC2914, September 2000, 599 . 601 [RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P. 602 Conrad, "Stream Control Transmission Protocol (SCTP) 603 Partial Reliability Extension", RFC 3758, 604 DOI 10.17487/RFC3758, May 2004, 605 . 607 [RFC4895] Tuexen, M., Stewart, R., Lei, P., and E. Rescorla, 608 "Authenticated Chunks for the Stream Control Transmission 609 Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895, August 610 2007, . 612 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 613 Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, 614 . 616 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 617 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 618 June 2010, . 620 [RFC7305] Lear, E., Ed., "Report from the IAB Workshop on Internet 621 Technology Adoption and Transition (ITAT)", RFC 7305, 622 DOI 10.17487/RFC7305, July 2014, 623 . 625 [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP 626 Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, 627 . 629 [RFC7496] Tuexen, M., Seggelmann, R., Stewart, R., and S. Loreto, 630 "Additional Policies for the Partially Reliable Stream 631 Control Transmission Protocol Extension", RFC 7496, 632 DOI 10.17487/RFC7496, April 2015, 633 . 635 [RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann, 636 "Stream Schedulers and User Message Interleaving for the 637 Stream Control Transmission Protocol", RFC 8260, 638 DOI 10.17487/RFC8260, November 2017, 639 . 641 [RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the 642 User Datagram Protocol (UDP) and Lightweight UDP (UDP- 643 Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018, 644 . 646 [WWDC2015] 647 Lakhera, P. and S. Cheshire, "Your App and Next Generation 648 Networks", Apple Worldwide Developers Conference 2015, San 649 Francisco, USA, June 2015, 650 . 652 Appendix A. 
Deriving the minimal set 654 We approach the construction of a minimal set of transport features 655 in the following way: 657 1. Categorization (Appendix A.1): the superset of transport features 658 from [RFC8303] is presented, and transport features are 659 categorized for later reduction. 660 2. Reduction (Appendix A.2): a shorter list of transport features is 661 derived from the categorization in the first step. This removes 662 all transport features that do not require application-specific 663 knowledge or would result in semantically incorrect behavior if 664 they were implemented over TCP or UDP. 665 3. Discussion (Appendix A.3): the resulting list shows a number of 666 peculiarities that are discussed, to provide a basis for 667 constructing the minimal set. 669 4. Construction (Section 3): Based on the reduced set and the 670 discussion of the transport features therein, a minimal set is 671 constructed. 673 A.1. Step 1: Categorization -- The Superset of Transport Features 675 Following [RFC8303], we divide the transport features into two main 676 groups as follows: 678 1. CONNECTION related transport features 679 - ESTABLISHMENT 680 - AVAILABILITY 681 - MAINTENANCE 682 - TERMINATION 684 2. DATA Transfer related transport features 685 - Sending Data 686 - Receiving Data 687 - Errors 689 We assume that applications have no specific requirements that need 690 knowledge about the network, e.g. regarding the choice of network 691 interface or the end-to-end path. Even with these assumptions, there 692 are certain requirements that are strictly kept by transport 693 protocols today, and these must also be kept by a transport system. 694 Some of these requirements relate to transport features that we call 695 "Functional". 697 Functional transport features provide functionality that cannot be 698 used without the application knowing about them, or else they violate 699 assumptions that might cause the application to fail. 
For example, 700 ordered message delivery is a functional transport feature: it cannot 701 be configured without the application knowing about it because the 702 application's assumption could be that messages always arrive in 703 order. Failure includes any change of the application behavior that 704 is not performance oriented, e.g. security. 706 "Change DSCP" and "Disable Nagle algorithm" are examples of transport 707 features that we call "Optimizing": if a transport system 708 autonomously decides to enable or disable them, an application will 709 not fail, but a transport system may be able to communicate more 710 efficiently if the application is in control of this optimizing 711 transport feature. These transport features require application- 712 specific knowledge (e.g., about delay/bandwidth requirements or the 713 length of future data blocks that are to be transmitted). 715 The transport features of IETF transport protocols that do not 716 require application-specific knowledge and could therefore be 717 utilized by a transport system on its own without involving the 718 application are called "Automatable". 720 Finally, in three cases, transport features are aggregated and/or 721 slightly changed from [RFC8303] in the description below. These 722 transport features are marked as "ADDED". These do not add any new 723 functionality but just represent a simple refactoring step that helps 724 to streamline the derivation process (e.g., by removing a choice of a 725 parameter for the sake of applications that may not care about this 726 choice). The corresponding transport features are automatable, and 727 they are listed immediately below the "ADDED" transport feature. 729 In this description, transport services are presented following the 730 nomenclature "CATEGORY.[SUBCATEGORY].SERVICENAME.PROTOCOL", 731 equivalent to "pass 2" in [RFC8303]. We also sketch how functional 732 or optimizing transport features can be implemented by a transport 733 system. 
The "minimal set" derived in this document is meant to be 734 implementable "one-sided" over TCP, and, with limitations, UDP. 735 Hence, for all transport features that are categorized as 736 "functional" or "optimizing", and for which no matching TCP and/or 737 UDP primitive exists in "pass 2" of [RFC8303], a brief discussion on 738 how to implement them over TCP and/or UDP is included. 740 We designate some transport features as "automatable" on the basis of 741 a broader decision that affects multiple transport features: 743 o Most transport features that are related to multi-streaming were 744 designated as "automatable". This was done because the decision 745 on whether to use multi-streaming or not does not depend on 746 application-specific knowledge. This means that a connection that 747 is exhibited to an application could be implemented by using a 748 single stream of an SCTP association instead of mapping it to a 749 complete SCTP association or TCP connection. This could be 750 achieved by using more than one stream when an SCTP association is 751 first established (CONNECT.SCTP parameter "outbound stream 752 count"), maintaining an internal stream number, and using this 753 stream number when sending data (SEND.SCTP parameter "stream 754 number"). Closing or aborting a connection could then simply free 755 the stream number for future use. This is discussed further in 756 Appendix A.3.2. 757 o All transport features that are related to using multiple paths or 758 the choice of the network interface were designated as 759 "automatable". Choosing a path or an interface does not depend on 760 application-specific knowledge. For example, "Listen" could 761 always listen on all available interfaces and "Connect" could use 762 the default interface for the destination IP address. 764 A.1.1. 
CONNECTION Related Transport Features 766 ESTABLISHMENT: 768 o Connect 769 Protocols: TCP, SCTP, UDP(-Lite) 770 Functional because the notion of a connection is often reflected 771 in applications as an expectation to be able to communicate after 772 a "Connect" succeeded, with a communication sequence relating to 773 this transport feature that is defined by the application 774 protocol. 775 Implementation: via CONNECT.TCP, CONNECT.SCTP or CONNECT.UDP(- 776 Lite). 778 o Specify which IP Options must always be used 779 Protocols: TCP, UDP(-Lite) 780 Automatable because IP Options relate to knowledge about the 781 network, not the application. 783 o Request multiple streams 784 Protocols: SCTP 785 Automatable because using multi-streaming does not require 786 application-specific knowledge. 787 Implementation: see Appendix A.3.2. 789 o Limit the number of inbound streams 790 Protocols: SCTP 791 Automatable because using multi-streaming does not require 792 application-specific knowledge. 793 Implementation: see Appendix A.3.2. 795 o Specify number of attempts and/or timeout for the first 796 establishment message 797 Protocols: TCP, SCTP 798 Functional because this is closely related to potentially assumed 799 reliable data delivery for data that is sent before or during 800 connection establishment. 801 Implementation: Using a parameter of CONNECT.TCP and CONNECT.SCTP. 802 Implementation over UDP: Do nothing (this is irrelevant in case of 803 UDP because there, reliable data delivery is not assumed). 805 o Obtain multiple sockets 806 Protocols: SCTP 807 Automatable because the usage of multiple paths to communicate to 808 the same end host relates to knowledge about the network, not the 809 application. 811 o Disable MPTCP 812 Protocols: MPTCP 813 Automatable because the usage of multiple paths to communicate to 814 the same end host relates to knowledge about the network, not the 815 application. 816 Implementation: via a boolean parameter in CONNECT.MPTCP. 
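The categorization of the ESTABLISHMENT entries listed so far can be sketched as a small data structure. This is only an illustration: the names `Feature`, `ESTABLISHMENT` and `needs_application` are hypothetical and not part of this document or of [RFC8303]. The filter shows only the part of the reduction step that drops "automatable" transport features; it does not model the removal of features that would be semantically incorrect over TCP or UDP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One transport feature with the category assigned in Appendix A.1."""
    name: str
    protocols: tuple
    category: str  # "functional", "optimizing" or "automatable"


# ESTABLISHMENT entries as categorized in the list above (hypothetical encoding).
ESTABLISHMENT = [
    Feature("Connect", ("TCP", "SCTP", "UDP(-Lite)"), "functional"),
    Feature("Specify which IP Options must always be used",
            ("TCP", "UDP(-Lite)"), "automatable"),
    Feature("Request multiple streams", ("SCTP",), "automatable"),
    Feature("Limit the number of inbound streams", ("SCTP",), "automatable"),
    Feature("Specify number of attempts and/or timeout for the first "
            "establishment message", ("TCP", "SCTP"), "functional"),
    Feature("Obtain multiple sockets", ("SCTP",), "automatable"),
    Feature("Disable MPTCP", ("MPTCP",), "automatable"),
]


def needs_application(features):
    """Keep only features that cannot be handled by the transport system alone."""
    return [f for f in features if f.category != "automatable"]


kept = [f.name for f in needs_application(ESTABLISHMENT)]
print(kept)
```

Under these assumptions, only "Connect" and the attempts/timeout feature remain from the entries above; the automatable ones are left to the transport system.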
818 o Configure authentication 819 Protocols: TCP, SCTP 820 Functional because this has a direct influence on security. 821 Implementation: via parameters in CONNECT.TCP and CONNECT.SCTP. 822 With TCP, this allows to configure Master Key Tuples (MKTs) to 823 authenticate complete segments (including the TCP IPv4 824 pseudoheader, TCP header, and TCP data). With SCTP, this allows 825 to specify which chunk types must always be authenticated. 826 Authenticating only certain chunk types creates a reduced level of 827 security that is not supported by TCP; to be compatible, this 828 should therefore only allow to authenticate all chunk types. Key 829 material must be provided in a way that is compatible with both 830 [RFC4895] and [RFC5925]. 831 Implementation over UDP: Not possible (UDP does not offer this 832 functionality). 834 o Indicate (and/or obtain upon completion) an Adaptation Layer via 835 an adaptation code point 836 Protocols: SCTP 837 Functional because it allows to send extra data for the sake of 838 identifying an adaptation layer, which by itself is application- 839 specific. 840 Implementation: via a parameter in CONNECT.SCTP. 841 Implementation over TCP: not possible (TCP does not offer this 842 functionality). 843 Implementation over UDP: not possible (UDP does not offer this 844 functionality). 846 o Request to negotiate interleaving of user messages 847 Protocols: SCTP 848 Automatable because it requires using multiple streams, but 849 requesting multiple streams in the CONNECTION.ESTABLISHMENT 850 category is automatable. 851 Implementation: via a parameter in CONNECT.SCTP. 853 o Hand over a message to reliably transfer (possibly multiple times) 854 before connection establishment 855 Protocols: TCP 856 Functional because this is closely tied to properties of the data 857 that an application sends or expects to receive. 858 Implementation: via a parameter in CONNECT.TCP. 
859 Implementation over UDP: not possible (UDP does not provide 860 reliability). 862 o Hand over a message to reliably transfer during connection 863 establishment 864 Protocols: SCTP 865 Functional because this can only work if the message is limited in 866 size, making it closely tied to properties of the data that an 867 application sends or expects to receive. 868 Implementation: via a parameter in CONNECT.SCTP. 869 Implementation over TCP: not possible (TCP does not allow 870 identification of message boundaries because it provides a byte 871 stream service) 872 Implementation over UDP: not possible (UDP is unreliable). 874 o Enable UDP encapsulation with a specified remote UDP port number 875 Protocols: SCTP 876 Automatable because UDP encapsulation relates to knowledge about 877 the network, not the application. 879 AVAILABILITY: 881 o Listen 882 Protocols: TCP, SCTP, UDP(-Lite) 883 Functional because the notion of accepting connection requests is 884 often reflected in applications as an expectation to be able to 885 communicate after a "Listen" succeeded, with a communication 886 sequence relating to this transport feature that is defined by the 887 application protocol. 888 ADDED. This differs from the 3 automatable transport features 889 below in that it leaves the choice of interfaces for listening 890 open. 891 Implementation: by listening on all interfaces via LISTEN.TCP (not 892 providing a local IP address) or LISTEN.SCTP (providing SCTP port 893 number / address pairs for all local IP addresses). LISTEN.UDP(- 894 Lite) supports both methods. 896 o Listen, 1 specified local interface 897 Protocols: TCP, SCTP, UDP(-Lite) 898 Automatable because decisions about local interfaces relate to 899 knowledge about the network and the Operating System, not the 900 application. 
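As an illustration of the "Listen" transport feature above (listening on all interfaces by not providing a local IP address), the following minimal sketch uses the Python standard `socket` module over IPv4 TCP. The helper name `listen_all_interfaces` is hypothetical, and restricting the sketch to IPv4 is an assumption made for brevity.

```python
import socket


def listen_all_interfaces(port=0, backlog=8):
    """Open a TCP listener without naming a local IP address (IPv4 only)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # An empty host string means INADDR_ANY, i.e. all local IPv4 interfaces.
    # Port 0 lets the operating system pick a free port.
    sock.bind(("", port))
    sock.listen(backlog)
    return sock


listener = listen_all_interfaces()
addr, port = listener.getsockname()
print(addr, port)
listener.close()
```

The application never chooses an interface here, which matches the reasoning that interface selection does not depend on application-specific knowledge.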
902 o Listen, N specified local interfaces 903 Protocols: SCTP 904 Automatable because decisions about local interfaces relate to 905 knowledge about the network and the Operating System, not the 906 application. 908 o Listen, all local interfaces 909 Protocols: TCP, SCTP, UDP(-Lite) 910 Automatable because decisions about local interfaces relate to 911 knowledge about the network and the Operating System, not the 912 application. 914 o Specify which IP Options must always be used 915 Protocols: TCP, UDP(-Lite) 916 Automatable because IP Options relate to knowledge about the 917 network, not the application. 919 o Disable MPTCP 920 Protocols: MPTCP 921 Automatable because the usage of multiple paths to communicate to 922 the same end host relates to knowledge about the network, not the 923 application. 925 o Configure authentication 926 Protocols: TCP, SCTP 927 Functional because this has a direct influence on security. 928 Implementation: via parameters in LISTEN.TCP and LISTEN.SCTP. 929 Implementation over TCP: With TCP, this allows to configure Master 930 Key Tuples (MKTs) to authenticate complete segments (including the 931 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 932 allows to specify which chunk types must always be authenticated. 933 Authenticating only certain chunk types creates a reduced level of 934 security that is not supported by TCP; to be compatible, this 935 should therefore only allow to authenticate all chunk types. Key 936 material must be provided in a way that is compatible with both 937 [RFC4895] and [RFC5925]. 938 Implementation over UDP: not possible (UDP does not offer 939 authentication). 941 o Obtain requested number of streams 942 Protocols: SCTP 943 Automatable because using multi-streaming does not require 944 application-specific knowledge. 945 Implementation: see Appendix A.3.2. 
947 o Limit the number of inbound streams 948 Protocols: SCTP 949 Automatable because using multi-streaming does not require 950 application-specific knowledge. 951 Implementation: see Appendix A.3.2. 953 o Indicate (and/or obtain upon completion) an Adaptation Layer via 954 an adaptation code point 955 Protocols: SCTP 956 Functional because it allows to send extra data for the sake of 957 identifying an adaptation layer, which by itself is application- 958 specific. 959 Implementation: via a parameter in LISTEN.SCTP. 960 Implementation over TCP: not possible (TCP does not offer this 961 functionality). 962 Implementation over UDP: not possible (UDP does not offer this 963 functionality). 965 o Request to negotiate interleaving of user messages 966 Protocols: SCTP 967 Automatable because it requires using multiple streams, but 968 requesting multiple streams in the CONNECTION.ESTABLISHMENT 969 category is automatable. 970 Implementation: via a parameter in LISTEN.SCTP. 972 MAINTENANCE: 974 o Change timeout for aborting connection (using retransmit limit or 975 time value) 976 Protocols: TCP, SCTP 977 Functional because this is closely related to potentially assumed 978 reliable data delivery. 979 Implementation: via CHANGE_TIMEOUT.TCP or CHANGE_TIMEOUT.SCTP. 980 Implementation over UDP: not possible (UDP is unreliable and there 981 is no connection timeout). 983 o Suggest timeout to the peer 984 Protocols: TCP 985 Functional because this is closely related to potentially assumed 986 reliable data delivery. 987 Implementation: via CHANGE_TIMEOUT.TCP. 988 Implementation over UDP: not possible (UDP is unreliable and there 989 is no connection timeout). 991 o Disable Nagle algorithm 992 Protocols: TCP, SCTP 993 Optimizing because this decision depends on knowledge about the 994 size of future data blocks and the delay between them. 995 Implementation: via DISABLE_NAGLE.TCP and DISABLE_NAGLE.SCTP. 
996 Implementation over UDP: do nothing (UDP does not implement the 997 Nagle algorithm). 999 o Request an immediate heartbeat, returning success/failure 1000 Protocols: SCTP 1001 Automatable because this informs about network-specific knowledge. 1003 o Notification of Excessive Retransmissions (early warning below 1004 abortion threshold) 1005 Protocols: TCP 1006 Optimizing because it is an early warning to the application, 1007 informing it of an impending functional event. 1008 Implementation: via ERROR.TCP. 1009 Implementation over UDP: do nothing (there is no abortion 1010 threshold). 1012 o Add path 1013 Protocols: MPTCP, SCTP 1014 MPTCP Parameters: source-IP; source-Port; destination-IP; 1015 destination-Port 1016 SCTP Parameters: local IP address 1017 Automatable because the usage of multiple paths to communicate to 1018 the same end host relates to knowledge about the network, not the 1019 application. 1021 o Remove path 1022 Protocols: MPTCP, SCTP 1023 MPTCP Parameters: source-IP; source-Port; destination-IP; 1024 destination-Port 1025 SCTP Parameters: local IP address 1026 Automatable because the usage of multiple paths to communicate to 1027 the same end host relates to knowledge about the network, not the 1028 application. 1030 o Set primary path 1031 Protocols: SCTP 1032 Automatable because the usage of multiple paths to communicate to 1033 the same end host relates to knowledge about the network, not the 1034 application. 1036 o Suggest primary path to the peer 1037 Protocols: SCTP 1038 Automatable because the usage of multiple paths to communicate to 1039 the same end host relates to knowledge about the network, not the 1040 application. 1042 o Configure Path Switchover 1043 Protocols: SCTP 1044 Automatable because the usage of multiple paths to communicate to 1045 the same end host relates to knowledge about the network, not the 1046 application. 
1048 o Obtain status (query or notification) 1049 Protocols: SCTP, MPTCP 1050 SCTP parameters: association connection state; destination 1051 transport address list; destination transport address reachability 1052 states; current local and peer receiver window size; current local 1053 congestion window sizes; number of unacknowledged DATA chunks; 1054 number of DATA chunks pending receipt; primary path; most recent 1055 SRTT on primary path; RTO on primary path; SRTT and RTO on other 1056 destination addresses; MTU per path; interleaving supported yes/no 1057 MPTCP parameters: subflow-list (identified by source-IP; source- 1058 Port; destination-IP; destination-Port) 1059 Automatable because these parameters relate to knowledge about the 1060 network, not the application. 1062 o Specify DSCP field 1063 Protocols: TCP, SCTP, UDP(-Lite) 1064 Optimizing because choosing a suitable DSCP value requires 1065 application-specific knowledge. 1066 Implementation: via SET_DSCP.TCP / SET_DSCP.SCTP / SET_DSCP.UDP(- 1067 Lite) 1069 o Notification of ICMP error message arrival 1070 Protocols: TCP, UDP(-Lite) 1071 Optimizing because these messages can inform about success or 1072 failure of functional transport features (e.g., host unreachable 1073 relates to "Connect") 1074 Implementation: via ERROR.TCP or ERROR.UDP(-Lite). 1076 o Obtain information about interleaving support 1077 Protocols: SCTP 1078 Automatable because it requires using multiple streams, but 1079 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1080 category is automatable. 1081 Implementation: via STATUS.SCTP. 1083 o Change authentication parameters 1084 Protocols: TCP, SCTP 1085 Functional because this has a direct influence on security. 1086 Implementation: via SET_AUTH.TCP and SET_AUTH.SCTP. 1087 Implementation over TCP: With SCTP, this allows to adjust key_id, 1088 key, and hmac_id. 
With TCP, this allows changing the preferred 1089 outgoing MKT (current_key) and the preferred incoming MKT 1090 (rnext_key), respectively, for a segment that is sent on the 1091 connection. Key material must be provided in a way that is 1092 compatible with both [RFC4895] and [RFC5925]. 1093 Implementation over UDP: not possible (UDP does not offer 1094 authentication). 1096 o Obtain authentication information 1097 Protocols: SCTP 1098 Functional because authentication decisions may have been made by 1099 the peer, and this has an influence on the necessary application- 1100 level measures to provide a certain level of security. 1101 Implementation: via GET_AUTH.SCTP. 1102 Implementation over TCP: With SCTP, this allows obtaining key_id 1103 and a chunk list. With TCP, this allows obtaining current_key and 1104 rnext_key from a previously received segment. Key material must 1105 be provided in a way that is compatible with both [RFC4895] and 1106 [RFC5925]. 1107 Implementation over UDP: not possible (UDP does not offer 1108 authentication). 1110 o Reset Stream 1111 Protocols: SCTP 1112 Automatable because using multi-streaming does not require 1113 application-specific knowledge. 1114 Implementation: see Appendix A.3.2. 1116 o Notification of Stream Reset 1117 Protocols: SCTP 1118 Automatable because using multi-streaming does not require 1119 application-specific knowledge. 1120 Implementation: see Appendix A.3.2. 1122 o Reset Association 1123 Protocols: SCTP 1124 Automatable because deciding to reset an association does not 1125 require application-specific knowledge. 1126 Implementation: via RESET_ASSOC.SCTP. 1128 o Notification of Association Reset 1129 Protocols: SCTP 1130 Automatable because this notification does not relate to 1131 application-specific knowledge. 1133 o Add Streams 1134 Protocols: SCTP 1135 Automatable because using multi-streaming does not require 1136 application-specific knowledge. 1137 Implementation: see Appendix A.3.2.
1139 o Notification of Added Stream 1140 Protocols: SCTP 1141 Automatable because using multi-streaming does not require 1142 application-specific knowledge. 1143 Implementation: see Appendix A.3.2. 1145 o Choose a scheduler to operate between streams of an association 1146 Protocols: SCTP 1147 Optimizing because the scheduling decision requires application- 1148 specific knowledge. However, if a transport system would not use 1149 this, or wrongly configure it on its own, this would only affect 1150 the performance of data transfers; the outcome would still be 1151 correct within the "best effort" service model. 1152 Implementation: using SET_STREAM_SCHEDULER.SCTP. 1153 Implementation over TCP: do nothing (streams are not available in 1154 TCP, but no guarantee is given that this transport feature has any 1155 effect). 1156 Implementation over UDP: do nothing (streams are not available in 1157 UDP, but no guarantee is given that this transport feature has any 1158 effect). 1160 o Configure priority or weight for a scheduler 1161 Protocols: SCTP 1162 Optimizing because the priority or weight requires application- 1163 specific knowledge. However, if a transport system would not use 1164 this, or wrongly configure it on its own, this would only affect 1165 the performance of data transfers; the outcome would still be 1166 correct within the "best effort" service model. 1167 Implementation: using CONFIGURE_STREAM_SCHEDULER.SCTP. 1168 Implementation over TCP: do nothing (streams are not available in 1169 TCP, but no guarantee is given that this transport feature has any 1170 effect). 1171 Implementation over UDP: do nothing (streams are not available in 1172 UDP, but no guarantee is given that this transport feature has any 1173 effect). 1175 o Configure send buffer size 1176 Protocols: SCTP 1177 Automatable because this decision relates to knowledge about the 1178 network and the Operating System, not the application (see also 1179 the discussion in Appendix A.3.4).
1181 o Configure receive buffer (and rwnd) size 1182 Protocols: SCTP 1183 Automatable because this decision relates to knowledge about the 1184 network and the Operating System, not the application. 1186 o Configure message fragmentation 1187 Protocols: SCTP 1188 Automatable because fragmentation relates to knowledge about the 1189 network and the Operating System, not the application. 1190 Implementation: by always enabling it with 1191 CONFIG_FRAGMENTATION.SCTP and auto-setting the fragmentation size 1192 based on network or Operating System conditions. 1194 o Configure PMTUD 1195 Protocols: SCTP 1196 Automatable because Path MTU Discovery relates to knowledge about 1197 the network, not the application. 1199 o Configure delayed SACK timer 1200 Protocols: SCTP 1201 Automatable because the receiver-side decision to delay sending 1202 SACKs relates to knowledge about the network, not the application 1203 (it can be relevant for a sending application to request not to 1204 delay the SACK of a message, but this is a different transport 1205 feature). 1207 o Set Cookie life value 1208 Protocols: SCTP 1209 Functional because it relates to security (possibly weakened by 1210 keeping a cookie very long) versus the time between connection 1211 establishment attempts. Knowledge about both issues can be 1212 application-specific. 1213 Implementation over TCP: the closest specified TCP functionality 1214 is the cookie in TCP Fast Open; for this, [RFC7413] states that 1215 the server "can expire the cookie at any time to enhance security" 1216 and section 4.1.2 describes an example implementation where 1217 updating the key on the server side causes the cookie to expire. 1218 Alternatively, for implementations that do not support TCP Fast 1219 Open, this transport feature could also affect the validity of SYN 1220 cookies (see Section 3.6 of [RFC4987]). 1221 Implementation over UDP: not possible (UDP does not offer this 1222 functionality). 
1224 o Set maximum burst 1225 Protocols: SCTP 1226 Automatable because it relates to knowledge about the network, not 1227 the application. 1229 o Configure size where messages are broken up for partial delivery 1230 Protocols: SCTP 1231 Functional because this is closely tied to properties of the data 1232 that an application sends or expects to receive. 1233 Implementation over TCP: not possible (TCP does not offer 1234 identification of message boundaries). 1235 Implementation over UDP: not possible (UDP does not fragment 1236 messages). 1238 o Disable checksum when sending 1239 Protocols: UDP 1240 Functional because application-specific knowledge is necessary to 1241 decide whether it can be acceptable to lose data integrity. 1242 Implementation: via SET_CHECKSUM_ENABLED.UDP. 1243 Implementation over TCP: do nothing (TCP does not offer to disable 1244 the checksum, but transmitting data with an intact checksum will 1245 not yield a semantically wrong result). 1247 o Disable checksum requirement when receiving 1248 Protocols: UDP 1249 Functional because application-specific knowledge is necessary to 1250 decide whether it can be acceptable to lose data integrity. 1251 Implementation: via SET_CHECKSUM_REQUIRED.UDP. 1252 Implementation over TCP: do nothing (TCP does not offer to disable 1253 the checksum, but transmitting data with an intact checksum will 1254 not yield a semantically wrong result). 1256 o Specify checksum coverage used by the sender 1257 Protocols: UDP-Lite 1258 Functional because application-specific knowledge is necessary to 1259 decide for which parts of the data it can be acceptable to lose 1260 data integrity. 1261 Implementation: via SET_CHECKSUM_COVERAGE.UDP-Lite. 1262 Implementation over TCP: do nothing (TCP does not offer to limit 1263 the checksum length, but transmitting data with an intact checksum 1264 will not yield a semantically wrong result). 
1265 Implementation over UDP: if checksum coverage is set to cover 1266 payload data, do nothing. Else, either do nothing (transmitting 1267 data with an intact checksum will not yield a semantically wrong 1268 result), or use the transport feature "Disable checksum when 1269 sending". 1271 o Specify minimum checksum coverage required by receiver 1272 Protocols: UDP-Lite 1273 Functional because application-specific knowledge is necessary to 1274 decide for which parts of the data it can be acceptable to lose 1275 data integrity. 1276 Implementation: via SET_MIN_CHECKSUM_COVERAGE.UDP-Lite. 1277 Implementation over TCP: do nothing (TCP does not offer to limit 1278 the checksum length, but transmitting data with an intact checksum 1279 will not yield a semantically wrong result). 1280 Implementation over UDP: if checksum coverage is set to cover 1281 payload data, do nothing. Else, either do nothing (transmitting 1282 data with an intact checksum will not yield a semantically wrong 1283 result), or use the transport feature "Disable checksum 1284 requirement when receiving". 1286 o Specify DF field 1287 Protocols: UDP(-Lite) 1288 Optimizing because the DF field can be used to carry out Path MTU 1289 Discovery, which can lead an application to choose message sizes 1290 that can be transmitted more efficiently. 1291 Implementation: via MAINTENANCE.SET_DF.UDP(-Lite) and 1292 SEND_FAILURE.UDP(-Lite). 1293 Implementation over TCP: do nothing (with TCP, the sending 1294 application is not in control of transport message sizes, making 1295 this functionality irrelevant). 1297 o Get max. transport-message size that may be sent using a non- 1298 fragmented IP packet from the configured interface 1299 Protocols: UDP(-Lite) 1300 Optimizing because this can lead an application to choose message 1301 sizes that can be transmitted more efficiently. 1302 Implementation over TCP: do nothing (this information is not 1303 available with TCP). 1305 o Get max. 
transport-message size that may be received from the 1306 configured interface 1307 Protocols: UDP(-Lite) 1308 Optimizing because this can, for example, influence an 1309 application's memory management. 1310 Implementation over TCP: do nothing (this information is not 1311 available with TCP). 1313 o Specify TTL/Hop count field 1314 Protocols: UDP(-Lite) 1315 Automatable because a transport system can use a large enough 1316 system default to avoid communication failures. Allowing an 1317 application to configure it differently can produce notifications 1318 of ICMP error message arrivals that yield information which only 1319 relates to knowledge about the network, not the application. 1321 o Obtain TTL/Hop count field 1322 Protocols: UDP(-Lite) 1323 Automatable because the TTL/Hop count field relates to knowledge 1324 about the network, not the application. 1326 o Specify ECN field 1327 Protocols: UDP(-Lite) 1328 Automatable because the ECN field relates to knowledge about the 1329 network, not the application. 1331 o Obtain ECN field 1332 Protocols: UDP(-Lite) 1333 Optimizing because this information can be used by an application 1334 to better carry out congestion control (this is relevant when 1335 choosing a data transmission transport service that does not 1336 already do congestion control). 1337 Implementation over TCP: do nothing (this information is not 1338 available with TCP). 1340 o Specify IP Options 1341 Protocols: UDP(-Lite) 1342 Automatable because IP Options relate to knowledge about the 1343 network, not the application. 1345 o Obtain IP Options 1346 Protocols: UDP(-Lite) 1347 Automatable because IP Options relate to knowledge about the 1348 network, not the application. 1350 o Enable and configure a "Low Extra Delay Background Transfer" 1351 Protocols: A protocol implementing the LEDBAT congestion control 1352 mechanism 1353 Optimizing because whether this service is appropriate or not 1354 depends on application-specific knowledge. 
However, wrongly using 1355 this will only affect the speed of data transfers (albeit 1356 including other transfers that may compete with the transport 1357 system's transfer in the network), so it is still correct within 1358 the "best effort" service model. 1359 Implementation: via CONFIGURE.LEDBAT and/or SET_DSCP.TCP / 1360 SET_DSCP.SCTP / SET_DSCP.UDP(-Lite) [LBE-draft]. 1361 Implementation over TCP: do nothing (TCP does not support LEDBAT 1362 congestion control, but not implementing this functionality will 1363 not yield a semantically wrong behavior). 1364 Implementation over UDP: do nothing (UDP does not offer congestion 1365 control). 1367 TERMINATION: 1369 o Close after reliably delivering all remaining data, causing an 1370 event informing the application on the other side 1371 Protocols: TCP, SCTP 1372 Functional because the notion of a connection is often reflected 1373 in applications as an expectation to have all outstanding data 1374 delivered and no longer be able to communicate after a "Close" 1375 succeeded, with a communication sequence relating to this 1376 transport feature that is defined by the application protocol. 1377 Implementation: via CLOSE.TCP and CLOSE.SCTP. 1378 Implementation over UDP: not possible (UDP is unreliable and hence 1379 does not know when all remaining data is delivered; it does also 1380 not offer to cause an event related to closing at the peer). 1382 o Abort without delivering remaining data, causing an event 1383 informing the application on the other side 1384 Protocols: TCP, SCTP 1385 Functional because the notion of a connection is often reflected 1386 in applications as an expectation to potentially not have all 1387 outstanding data delivered and no longer be able to communicate 1388 after an "Abort" succeeded. On both sides of a connection, an 1389 application protocol may define a communication sequence relating 1390 to this transport feature. 1391 Implementation: via ABORT.TCP and ABORT.SCTP. 
1392 Implementation over UDP: not possible (UDP does not offer to cause 1393 an event related to aborting at the peer). 1395 o Abort without delivering remaining data, not causing an event 1396 informing the application on the other side 1397 Protocols: UDP(-Lite) 1398 Functional because the notion of a connection is often reflected 1399 in applications as an expectation to potentially not have all 1400 outstanding data delivered and no longer be able to communicate 1401 after an "Abort" succeeded. On both sides of a connection, an 1402 application protocol may define a communication sequence relating 1403 to this transport feature. 1404 Implementation: via ABORT.UDP(-Lite). 1405 Implementation over TCP: stop using the connection, wait for a 1406 timeout. 1408 o Timeout event when data could not be delivered for too long 1409 Protocols: TCP, SCTP 1410 Functional because this notifies that potentially assumed reliable 1411 data delivery is no longer provided. 1412 Implementation: via TIMEOUT.TCP and TIMEOUT.SCTP. 1413 Implementation over UDP: do nothing (this event will not occur 1414 with UDP). 1416 A.1.2. DATA Transfer Related Transport Features 1418 A.1.2.1. Sending Data 1420 o Reliably transfer data, with congestion control 1421 Protocols: TCP, SCTP 1422 Functional because this is closely tied to properties of the data 1423 that an application sends or expects to receive. 1424 Implementation: via SEND.TCP and SEND.SCTP. 1425 Implementation over UDP: not possible (UDP is unreliable). 1427 o Reliably transfer a message, with congestion control 1428 Protocols: SCTP 1429 Functional because this is closely tied to properties of the data 1430 that an application sends or expects to receive. 1431 Implementation: via SEND.SCTP. 1432 Implementation over TCP: via SEND.TCP. With SEND.TCP, message 1433 boundaries will not be identifiable by the receiver, because TCP 1434 provides a byte stream service. 1435 Implementation over UDP: not possible (UDP is unreliable). 
1437 o Unreliably transfer a message 1438 Protocols: SCTP, UDP(-Lite) 1439 Optimizing because only applications know about the time 1440 criticality of their communication, and reliably transferring a 1441 message is never incorrect for the receiver of a potentially 1442 unreliable data transfer; it is just slower. 1443 ADDED. This differs from the 2 automatable transport features 1444 below in that it leaves the choice of congestion control open. 1445 Implementation: via SEND.SCTP or SEND.UDP(-Lite). 1446 Implementation over TCP: use SEND.TCP. With SEND.TCP, messages 1447 will be sent reliably, and message boundaries will not be 1448 identifiable by the receiver. 1450 o Unreliably transfer a message, with congestion control 1451 Protocols: SCTP 1452 Automatable because congestion control relates to knowledge about 1453 the network, not the application. 1455 o Unreliably transfer a message, without congestion control 1456 Protocols: UDP(-Lite) 1457 Automatable because congestion control relates to knowledge about 1458 the network, not the application. 1460 o Configurable Message Reliability 1461 Protocols: SCTP 1462 Optimizing because only applications know about the time 1463 criticality of their communication, and reliably transferring a 1464 message is never incorrect for the receiver of a potentially 1465 unreliable data transfer; it is just slower. 1466 Implementation: via SEND.SCTP. 1467 Implementation over TCP: By using SEND.TCP and ignoring this 1468 configuration: based on the assumption of the best-effort service 1469 model, unnecessarily delivering data does not violate application 1470 expectations. Moreover, it is not possible to associate the 1471 requested reliability to a "message" in TCP anyway. 1472 Implementation over UDP: not possible (UDP is unreliable).
1474 o Choice of stream 1475 Protocols: SCTP 1476 Automatable because it requires using multiple streams, but 1477 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1478 category is automatable. Implementation: see Appendix A.3.2. 1480 o Choice of path (destination address) 1481 Protocols: SCTP 1482 Automatable because it requires using multiple sockets, but 1483 obtaining multiple sockets in the CONNECTION.ESTABLISHMENT 1484 category is automatable. 1486 o Ordered message delivery (potentially slower than unordered) 1487 Protocols: SCTP 1488 Functional because this is closely tied to properties of the data 1489 that an application sends or expects to receive. 1490 Implementation: via SEND.SCTP. 1491 Implementation over TCP: By using SEND.TCP. With SEND.TCP, 1492 messages will not be identifiable by the receiver. 1493 Implementation over UDP: not possible (UDP does not offer any 1494 guarantees regarding ordering). 1496 o Unordered message delivery (potentially faster than ordered) 1497 Protocols: SCTP, UDP(-Lite) 1498 Functional because this is closely tied to properties of the data 1499 that an application sends or expects to receive. 1500 Implementation: via SEND.SCTP. 1501 Implementation over TCP: By using SEND.TCP and always sending data 1502 ordered: based on the assumption of the best-effort service model, 1503 ordered delivery may just be slower and does not violate 1504 application expectations. Moreover, it is not possible to 1505 associate the requested delivery order to a "message" in TCP 1506 anyway. 1508 o Request not to bundle messages 1509 Protocols: SCTP 1510 Optimizing because this decision depends on knowledge about the 1511 size of future data blocks and the delay between them. 1512 Implementation: via SEND.SCTP. 1513 Implementation over TCP: By using SEND.TCP and DISABLE_NAGLE.TCP 1514 to disable the Nagle algorithm when the request is made and enable 1515 it again when the request is no longer made. 
Note that this is 1516 not fully equivalent because it relates to the time of issuing the 1517 request rather than a specific message. 1518 Implementation over UDP: do nothing (UDP never bundles messages). 1520 o Specifying a "payload protocol-id" (handed over as such by the 1521 receiver) 1522 Protocols: SCTP 1523 Functional because it allows to send extra application data with 1524 every message, for the sake of identification of data, which by 1525 itself is application-specific. 1526 Implementation: SEND.SCTP. 1527 Implementation over TCP: not possible (this functionality is not 1528 available in TCP). 1530 Implementation over UDP: not possible (this functionality is not 1531 available in UDP). 1533 o Specifying a key id to be used to authenticate a message 1534 Protocols: SCTP 1535 Functional because this has a direct influence on security. 1536 Implementation: via a parameter in SEND.SCTP. 1537 Implementation over TCP: This could be emulated by using 1538 SET_AUTH.TCP before and after the message is sent. Note that this 1539 is not fully equivalent because it relates to the time of issuing 1540 the request rather than a specific message. 1541 Implementation over UDP: not possible (UDP does not offer 1542 authentication). 1544 o Request not to delay the acknowledgement (SACK) of a message 1545 Protocols: SCTP 1546 Optimizing because only an application knows for which message it 1547 wants to quickly be informed about success / failure of its 1548 delivery. 1549 Implementation over TCP: do nothing (TCP does not offer this 1550 functionality, but ignoring this request from the application will 1551 not yield a semantically wrong behavior). 1552 Implementation over UDP: do nothing (UDP does not offer this 1553 functionality, but ignoring this request from the application will 1554 not yield a semantically wrong behavior). 1556 A.1.2.2. 
Receiving Data 1558 o Receive data (with no message delimiting) 1559 Protocols: TCP 1560 Functional because a transport system must be able to send and 1561 receive data. 1562 Implementation: via RECEIVE.TCP. 1563 Implementation over UDP: do nothing (UDP only works on messages; 1564 these can be handed over, the application can still ignore the 1565 message boundaries). 1567 o Receive a message 1568 Protocols: SCTP, UDP(-Lite) 1569 Functional because this is closely tied to properties of the data 1570 that an application sends or expects to receive. 1571 Implementation: via RECEIVE.SCTP and RECEIVE.UDP(-Lite). 1572 Implementation over TCP: not possible (TCP does not support 1573 identification of message boundaries). 1575 o Choice of stream to receive from 1576 Protocols: SCTP 1577 Automatable because it requires using multiple streams, but 1578 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1579 category is automatable. 1580 Implementation: see Appendix A.3.2. 1582 o Information about partial message arrival 1583 Protocols: SCTP 1584 Functional because this is closely tied to properties of the data 1585 that an application sends or expects to receive. 1586 Implementation: via RECEIVE.SCTP. 1587 Implementation over TCP: do nothing (this information is not 1588 available with TCP). 1589 Implementation over UDP: do nothing (this information is not 1590 available with UDP). 1592 A.1.2.3. Errors 1594 This section describes sending failures that are associated with a 1595 specific call in the "Sending Data" category (Appendix A.1.2.1). 1597 o Notification of send failures 1598 Protocols: SCTP, UDP(-Lite) 1599 Functional because this notifies that potentially assumed reliable 1600 data delivery is no longer provided. 1601 ADDED. This differs from the 2 automatable transport features 1602 below in that it does not distinguish between unsent and 1603 unacknowledged messages. 1604 Implementation: via SENDFAILURE-EVENT.SCTP and SEND_FAILURE.UDP(- 1605 Lite).
1606 Implementation over TCP: do nothing (this notification is not 1607 available and will therefore not occur with TCP). 1609 o Notification of an unsent (part of a) message 1610 Protocols: SCTP, UDP(-Lite) 1611 Automatable because the distinction between unsent and 1612 unacknowledged is network-specific. 1614 o Notification of an unacknowledged (part of a) message 1615 Protocols: SCTP 1616 Automatable because the distinction between unsent and 1617 unacknowledged is network-specific. 1619 o Notification that the stack has no more user data to send 1620 Protocols: SCTP 1621 Optimizing because reacting to this notification requires the 1622 application to be involved, and ensuring that the stack does not 1623 run dry of data (for too long) can improve performance. 1624 Implementation over TCP: do nothing (see the discussion in 1625 Appendix A.3.4). 1626 Implementation over UDP: do nothing (this notification is not 1627 available and will therefore not occur with UDP). 1629 o Notification to a receiver that a partial message delivery has 1630 been aborted 1631 Protocols: SCTP 1632 Functional because this is closely tied to properties of the data 1633 that an application sends or expects to receive. 1634 Implementation over TCP: do nothing (this notification is not 1635 available and will therefore not occur with TCP). 1636 Implementation over UDP: do nothing (this notification is not 1637 available and will therefore not occur with UDP). 1639 A.2. Step 2: Reduction -- The Reduced Set of Transport Features 1641 By hiding automatable transport features from the application, a 1642 transport system can gain opportunities to automate the usage of 1643 network-related functionality. This can facilitate using the 1644 transport system for the application programmer and it allows for 1645 optimizations that may not be possible for an application. 
For 1646 instance, system-wide configurations regarding the usage of multiple 1647 interfaces can better be exploited if the choice of the interface is 1648 not entirely up to the application. Therefore, since they are not 1649 strictly necessary to expose in a transport system, we do not include 1650 automatable transport features in the reduced set of transport 1651 features. This leaves us with only the transport features that are 1652 either optimizing or functional. 1654 A transport system should be able to communicate via TCP or UDP if 1655 alternative transport protocols are found not to work. For many 1656 transport features, this is possible -- often by simply not doing 1657 anything when a specific request is made. For some transport 1658 features, however, it was identified that direct usage of neither TCP 1659 nor UDP is possible: in these cases, even not doing anything would 1660 incur semantically incorrect behavior. Whenever an application would 1661 make use of one of these transport features, this would eliminate the 1662 possibility to use TCP or UDP. Thus, we only keep the functional and 1663 optimizing transport features for which an implementation over either 1664 TCP or UDP is possible in our reduced set. 1666 The "minimal set" derived in this document is meant to be 1667 implementable "one-sided" over TCP, and, with limitations, UDP. In 1668 the following list, we therefore precede a transport feature with 1669 "T:" if an implementation over TCP is possible, "U:" if an 1670 implementation over UDP is possible, and "T,U:" if an implementation 1671 over either TCP or UDP is possible. 1673 A.2.1.
CONNECTION Related Transport Features 1675 ESTABLISHMENT: 1677 o T,U: Connect 1678 o T,U: Specify number of attempts and/or timeout for the first 1679 establishment message 1680 o T: Configure authentication 1681 o T: Hand over a message to reliably transfer (possibly multiple 1682 times) before connection establishment 1683 o T: Hand over a message to reliably transfer during connection 1684 establishment 1686 AVAILABILITY: 1688 o T,U: Listen 1689 o T: Configure authentication 1691 MAINTENANCE: 1693 o T: Change timeout for aborting connection (using retransmit limit 1694 or time value) 1695 o T: Suggest timeout to the peer 1696 o T,U: Disable Nagle algorithm 1697 o T,U: Notification of Excessive Retransmissions (early warning 1698 below abortion threshold) 1699 o T,U: Specify DSCP field 1700 o T,U: Notification of ICMP error message arrival 1701 o T: Change authentication parameters 1702 o T: Obtain authentication information 1703 o T,U: Set Cookie life value 1704 o T,U: Choose a scheduler to operate between streams of an 1705 association 1706 o T,U: Configure priority or weight for a scheduler 1707 o T,U: Disable checksum when sending 1708 o T,U: Disable checksum requirement when receiving 1709 o T,U: Specify checksum coverage used by the sender 1710 o T,U: Specify minimum checksum coverage required by receiver 1711 o T,U: Specify DF field 1712 o T,U: Get max. transport-message size that may be sent using a non- 1713 fragmented IP packet from the configured interface 1714 o T,U: Get max. 
transport-message size that may be received from the 1715 configured interface 1716 o T,U: Obtain ECN field 1717 o T,U: Enable and configure a "Low Extra Delay Background Transfer" 1719 TERMINATION: 1721 o T: Close after reliably delivering all remaining data, causing an 1722 event informing the application on the other side 1723 o T: Abort without delivering remaining data, causing an event 1724 informing the application on the other side 1725 o T,U: Abort without delivering remaining data, not causing an event 1726 informing the application on the other side 1727 o T,U: Timeout event when data could not be delivered for too long 1729 A.2.2. DATA Transfer Related Transport Features 1731 A.2.2.1. Sending Data 1733 o T: Reliably transfer data, with congestion control 1734 o T: Reliably transfer a message, with congestion control 1735 o T,U: Unreliably transfer a message 1736 o T: Configurable Message Reliability 1737 o T: Ordered message delivery (potentially slower than unordered) 1738 o T,U: Unordered message delivery (potentially faster than ordered) 1739 o T,U: Request not to bundle messages 1740 o T: Specifying a key id to be used to authenticate a message 1741 o T,U: Request not to delay the acknowledgement (SACK) of a message 1743 A.2.2.2. Receiving Data 1745 o T,U: Receive data (with no message delimiting) 1746 o U: Receive a message 1747 o T,U: Information about partial message arrival 1749 A.2.2.3. Errors 1751 This section describes sending failures that are associated with a 1752 specific call in the "Sending Data" category (Appendix A.1.2.1). 1754 o T,U: Notification of send failures 1755 o T,U: Notification that the stack has no more user data to send 1756 o T,U: Notification to a receiver that a partial message delivery 1757 has been aborted 1759 A.3. Step 3: Discussion 1761 The reduced set in the previous section exhibits a number of 1762 peculiarities, which we will discuss in the following.
This section 1763 focuses on TCP because, with the exception of one particular 1764 transport feature ("Receive a message" -- we will discuss this in 1765 Appendix A.3.1), the list shows that UDP is strictly a subset of TCP. 1766 We can first try to understand how to build a transport system that 1767 can run over TCP, and then narrow down the result further to allow 1768 that the system can always run over either TCP or UDP (which 1769 effectively means removing everything related to reliability, 1770 ordering, authentication and closing/aborting with a notification to 1771 the peer). 1773 Note that, because the functional transport features of UDP are -- 1774 with the exception of "Receive a message" -- a subset of TCP, TCP can 1775 be used as a replacement for UDP whenever an application does not 1776 need message delimiting (e.g., because the application-layer protocol 1777 already does it). This has been recognized by many applications that 1778 already do this in practice, by trying to communicate with UDP at 1779 first, and falling back to TCP in case of a connection failure. 1781 A.3.1. Sending Messages, Receiving Bytes 1783 For implementing a transport system over TCP, there are several 1784 transport features related to sending, but only a single transport 1785 feature related to receiving: "Receive data (with no message 1786 delimiting)" (and, strangely, "information about partial message 1787 arrival"). Notably, the transport feature "Receive a message" is 1788 also the only non-automatable transport feature of UDP(-Lite) for 1789 which no implementation over TCP is possible. 1791 To support these TCP receiver semantics, we define an "Application- 1792 Framed Bytestream" (AFra-Bytestream). AFra-Bytestreams allow senders 1793 to operate on messages while minimizing changes to the TCP socket 1794 API. In particular, nothing changes on the receiver side - data can 1795 be accepted via a normal TCP socket. 
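As a minimal sketch of how an application could operate on messages while the receiver only sees a byte stream, the following assumes a 4-byte big-endian length prefix per message. This framing format is an assumption made here for illustration only and is not prescribed by this document; the names "frame" and "Deframer" are hypothetical. The sender prepends the length, and the receiving application reassembles messages from whatever chunks a normal TCP socket delivers.

```python
import struct
from typing import List

# Hypothetical framing header: 4-byte unsigned length, network byte order.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Sender side: prefix one message with its length."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Receiver side: rebuild messages from arbitrary byte-stream chunks."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Add received bytes; return every message that is now complete."""
        self._buf += data
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # message body not fully received yet
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return messages


# Simulate a byte stream that splits messages at arbitrary positions.
stream = frame(b"hello") + frame(b"") + frame(b"world!")
deframer = Deframer()
received = []
for i in range(0, len(stream), 3):
    received.extend(deframer.feed(stream[i:i + 3]))
```

Because the boundaries are carried inside the byte stream itself, the same receiving code works whether the data arrived in one read or in many small ones, which is what lets the receiver side remain an ordinary TCP socket.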
1797 In an AFra-Bytestream, the sending application can optionally inform 1798 the transport about message boundaries and required properties per 1799 message (configurable order and reliability, or embedding a request 1800 not to delay the acknowledgement of a message). Whenever the sending 1801 application specifies per-message properties that relax the notion of 1802 reliable in-order delivery of bytes, it must assume that the 1803 receiving application is 1) able to determine message boundaries, 1804 provided that messages are always kept intact, and 2) able to accept 1805 these relaxed per-message properties. Any signaling of such 1806 information to the peer is up to an application-layer protocol and 1807 considered out of scope of this document. 1809 For example, if an application requests to transfer fixed-size 1810 messages of 100 bytes with partial reliability, this needs the 1811 receiving application to be prepared to accept data in chunks of 100 1812 bytes. If, then, some of these 100-byte messages are missing (e.g., 1813 if SCTP with Configurable Reliability is used), this is the expected 1814 application behavior. With TCP, no messages would be missing, but 1815 this is also correct for the application, and the possible 1816 retransmission delay is acceptable within the best effort service 1817 model (see [RFC7305], Section 3.5). Still, the receiving application 1818 would separate the byte stream into 100-byte chunks. 1820 Note that this usage of messages does not require all messages to be 1821 equal in size. Many application protocols use some form of Type- 1822 Length-Value (TLV) encoding, e.g. by defining a header including 1823 length fields; another alternative is the use of byte stuffing 1824 methods such as COBS [COBS]. If an application needs message 1825 numbers, e.g. 
to restore the correct sequence of messages, these must 1826 also be encoded by the application itself, as the sequence number 1827 related transport features of SCTP are not provided by the "minimum 1828 set" (in the interest of enabling usage of TCP). 1830 A.3.2. Stream Schedulers Without Streams 1832 We have already stated that multi-streaming does not require 1833 application-specific knowledge. Potential benefits or disadvantages 1834 of, e.g., using two streams of an SCTP association versus using two 1835 separate SCTP associations or TCP connections are related to 1836 knowledge about the network and the particular transport protocol in 1837 use, not the application. However, the transport features "Choose a 1838 scheduler to operate between streams of an association" and 1839 "Configure priority or weight for a scheduler" operate on streams. 1840 Here, streams identify communication channels between which a 1841 scheduler operates, and they can be assigned a priority. Moreover, 1842 the transport features in the MAINTENANCE category all operate on 1843 associations in case of SCTP, i.e. they apply to all streams in that 1844 association. 1846 With only these semantics necessary to represent, the interface to a 1847 transport system becomes easier if we assume that connections may be 1848 a transport protocol's connection or association, but could also be a 1849 stream of an existing SCTP association, for example. We only need to 1850 allow for a way to define a possible grouping of connections. Then, 1851 all MAINTENANCE transport features can be said to operate on 1852 connection groups, not connections, and a scheduler operates on the 1853 connections within a group. 1855 To be compatible with multiple transport protocols and uniformly 1856 allow access to both transport connections and streams of a multi- 1857 streaming protocol, the semantics of opening and closing need to be 1858 the most restrictive subset of all of the underlying options.
For 1859 example, TCP's support of half-closed connections can be seen as a 1860 feature on top of the more restrictive "ABORT"; this feature cannot 1861 be supported because not all protocols used by a transport system 1862 (including streams of an association) support half-closed 1863 connections. 1865 A.3.3. Early Data Transmission 1867 There are two transport features related to transferring a message 1868 early: "Hand over a message to reliably transfer (possibly multiple 1869 times) before connection establishment", which relates to TCP Fast 1870 Open [RFC7413], and "Hand over a message to reliably transfer during 1871 connection establishment", which relates to SCTP's ability to 1872 transfer data together with the COOKIE ECHO chunk. Even without TCP 1873 Fast Open, TCP can transfer data during the handshake, together with 1874 the SYN packet -- however, the receiver of this data may not hand it 1875 over to the application until the handshake has completed. Also, 1876 different from TCP Fast Open, this data is not delimited as a message 1877 by TCP (thus, not visible as a "message"). This functionality is 1878 commonly available in TCP and supported in several implementations, 1879 even though the TCP specification does not explain how to provide it 1880 to applications. 1882 A transport system could differentiate between the cases of 1883 transmitting data "before" (possibly multiple times) or "during" the 1884 handshake. Alternatively, it could also assume that data that are 1885 handed over early will be transmitted as early as possible, and 1886 "before" the handshake would only be used for messages that are 1887 explicitly marked as "idempotent" (i.e., it would be acceptable to 1888 transfer them multiple times). 1890 The amount of data that can successfully be transmitted before or 1891 during the handshake depends on various factors: the transport 1892 protocol, the use of header options, the choice of IPv4 and IPv6 and 1893 the Path MTU.
A transport system should therefore allow a sending 1894 application to query the maximum amount of data it can possibly 1895 transmit before (or, if exposed, during) connection establishment. 1897 A.3.4. Sender Running Dry 1899 The transport feature "Notification that the stack has no more user 1900 data to send" relates to SCTP's "SENDER DRY" notification. Such 1901 notifications can, in principle, be used to avoid having an 1902 unnecessarily large send buffer, yet ensure that the transport sender 1903 always has data available when it has an opportunity to transmit it. 1904 This has been found to be very beneficial for some applications 1905 [WWDC2015]. However, "SENDER DRY" truly means that the entire send 1906 buffer (including both unsent and unacknowledged data) has emptied -- 1907 i.e., when it notifies the sender, it is already too late, the 1908 transport protocol already missed an opportunity to send data. Some 1909 modern TCP implementations now include the unspecified 1910 "TCP_NOTSENT_LOWAT" socket option that was proposed in [WWDC2015], 1911 which limits the amount of unsent data that TCP can keep in the 1912 socket buffer; this allows to specify at which buffer filling level 1913 the socket becomes writable, rather than waiting for the buffer to 1914 run empty. 1916 SCTP allows to configure the sender-side buffer too: the automatable 1917 Transport Feature "Configure send buffer size" provides this 1918 functionality, but only for the complete buffer, which includes both 1919 unsent and unacknowledged data. SCTP does not allow to control these 1920 two sizes separately. It therefore makes sense for a transport 1921 system to allow for uniform access to "TCP_NOTSENT_LOWAT" as well as 1922 the "SENDER DRY" notification. 1924 A.3.5. 
Capacity Profile 1926 The transport features: 1928 o Disable Nagle algorithm 1929 o Enable and configure a "Low Extra Delay Background Transfer" 1930 o Specify DSCP field 1932 all relate to a QoS-like application need such as "low latency" or 1933 "scavenger". In the interest of flexibility of a transport system, 1934 they could therefore be offered in a uniform, more abstract way, 1935 where a transport system could e.g. decide by itself how to use 1936 combinations of LEDBAT-like congestion control and certain DSCP 1937 values, and an application would only specify a general "capacity 1938 profile" (a description of how it wants to use the available 1939 capacity). A need for "lowest possible latency at the expense of 1940 overhead" could then translate into automatically disabling the Nagle 1941 algorithm. 1943 In some cases, the Nagle algorithm is best controlled directly by the 1944 application because it is not only related to a general profile but 1945 also to knowledge about the size of future messages. For fine-grain 1946 control over Nagle-like functionality, the "Request not to bundle 1947 messages" is available. 1949 A.3.6. Security 1951 Both TCP and SCTP offer authentication. TCP authenticates complete 1952 segments. SCTP allows to configure which of SCTP's chunk types must 1953 always be authenticated -- if this is exposed as such, it creates an 1954 undesirable dependency on the transport protocol. For compatibility 1955 with TCP, a transport system should only allow to configure complete 1956 transport layer packets, including headers, IP pseudo-header (if any) 1957 and payload. 1959 Security is discussed in a separate document 1960 [I-D.ietf-taps-transport-security]. 
The minimal set presented in the 1961 present document excludes all security related transport features: 1962 "Configure authentication", "Change authentication parameters", 1963 "Obtain authentication information" and "Set Cookie life value" 1964 as well as "Specifying a key id to be used to authenticate a 1965 message". 1967 A.3.7. Packet Size 1969 UDP(-Lite) has a transport feature called "Specify DF field". This 1970 yields an error message in case of sending a message that exceeds the 1971 Path MTU, which is necessary for a UDP-based application to be able 1972 to implement Path MTU Discovery (a function that UDP-based 1973 applications must do by themselves). The "Get max. transport-message 1974 size that may be sent using a non-fragmented IP packet from the 1975 configured interface" transport feature yields an upper limit for the 1976 Path MTU (minus headers) and can therefore help to implement Path MTU 1977 Discovery more efficiently. 1979 Appendix B. Revision information 1981 XXX RFC-Ed please remove this section prior to publication. 1983 -02: implementation suggestions added, discussion section added, 1984 terminology extended, DELETED category removed, various other fixes; 1985 list of Transport Features adjusted to -01 version of [RFC8303] 1986 except that MPTCP is not included. 1988 -03: updated to be consistent with -02 version of [RFC8303]. 1990 -04: updated to be consistent with -03 version of [RFC8303]. 1991 Reorganized document, rewrote intro and conclusion, and made a first 1992 stab at creating a real "minimal set". 1994 -05: updated to be consistent with -05 version of [RFC8303] (minor 1995 changes). Fixed a mistake regarding Cookie Life value. Exclusion of 1996 security related transport features (to be covered in a separate 1997 document). Reorganized the document (now begins with the minset, 1998 derivation is in the appendix). First stab at an abstract API for 1999 the minset.
2001 draft-ietf-taps-minset-00: updated to be consistent with -08 version 2002 of [RFC8303] ("obtain message delivery number" was removed, as this 2003 has also been removed in [RFC8303] because it was a mistake in 2004 RFC4960. This led to the removal of two more transport features that 2005 were only designated as functional because they affected "obtain 2006 message delivery number"). Fall-back to UDP incorporated (this was 2007 requested at IETF-99); this also affected the transport feature 2008 "Choice between unordered (potentially faster) or ordered delivery of 2009 messages" because this is a boolean which is always true for one 2010 fall-back protocol, and always false for the other one. This was 2011 therefore now divided into two features, one for ordered, one for 2012 unordered delivery. The word "reliably" was added to the transport 2013 features "Hand over a message to reliably transfer (possibly multiple 2014 times) before connection establishment" and "Hand over a message to 2015 reliably transfer during connection establishment" to make it clearer 2016 why this is not supported by UDP. Clarified that the "minset 2017 abstract interface" is not proposing a specific API for all TAPS 2018 systems to implement, but it is just a way to describe the minimum 2019 set. Author order changed. 2021 WG -01: "fall-back to" (TCP or UDP) replaced (mostly with 2022 "implementation over"). References to post-sockets removed (these 2023 were statements that assumed that post-sockets requires two-sided 2024 implementation). Replaced "flow" with "TAPS Connection" and "frame" 2025 with "message" to avoid introducing new terminology. Made sections 3 2026 and 4 in line with the categorization that is already used in the 2027 appendix and [RFC8303], and changed style of section 4 to be even 2028 shorter and less interface-like. Updated reference draft-ietf-tsvwg- 2029 sctp-ndata to RFC8260. 2031 WG -02: rephrased "the TAPS system" and "TAPS connection" etc.
to 2032 more generally talk about transport after the intro (mostly replacing 2033 "TAPS system" with "transport system" and "TAPS connection" with 2034 "connection"). Merged sections 3 and 4 to form a new section 3. 2036 WG -03: updated sentence referencing 2037 [I-D.ietf-taps-transport-security] to say that "the minimum security 2038 requirements for a taps system are discussed in a separate security 2039 document", wrote "example" in the paragraph introducing the decision 2040 tree. Removed reference draft-grinnemo-taps-he-03 and the sentence 2041 that referred to it. 2043 WG -04: addressed comments from Theresa Enghardt and Tommy Pauly. As 2044 part of that, removed "TAPS" as a term everywhere (abstract, intro, 2045 ..). 2047 WG -05: addressed comments from Spencer Dawkins. 2049 WG -06: Fixed nits. 2051 WG -07: Addressed Genart comments from Robert Sparks. 2053 WG -08: Addressed one more Genart comment from Robert Sparks. 2055 Authors' Addresses 2057 Michael Welzl 2058 University of Oslo 2059 PO Box 1080 Blindern 2060 Oslo N-0316 2061 Norway 2063 Phone: +47 22 85 24 20 2064 Email: michawe@ifi.uio.no 2066 Stein Gjessing 2067 University of Oslo 2068 PO Box 1080 Blindern 2069 Oslo N-0316 2070 Norway 2072 Phone: +47 22 85 24 44 2073 Email: steing@ifi.uio.no