TAPS                                                            M. Welzl
Internet-Draft                                               S. Gjessing
Intended status: Informational                        University of Oslo
Expires: February 21, 2019                               August 20, 2018

          A Minimal Set of Transport Services for End Systems
                       draft-ietf-taps-minset-05

Abstract

   This draft recommends a minimal set of Transport Services offered by
   end systems, and gives guidance on choosing among the available
   mechanisms and protocols.
   It is based on the set of transport features in RFC 8303.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 21, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. The Minimal Set of Transport Features
     3.1. ESTABLISHMENT, AVAILABILITY and TERMINATION
     3.2. MAINTENANCE
       3.2.1. Connection groups
       3.2.2. Individual connections
     3.3. DATA Transfer
       3.3.1. Sending Data
       3.3.2. Receiving Data
   4. Conclusion
   5. Acknowledgements
   6. IANA Considerations
   7. Security Considerations
   8. References
     8.1. Normative References
     8.2. Informative References
   Appendix A. Deriving the minimal set
     A.1. Step 1: Categorization -- The Superset of Transport
          Features
       A.1.1. CONNECTION Related Transport Features
       A.1.2. DATA Transfer Related Transport Features
     A.2. Step 2: Reduction -- The Reduced Set of Transport
          Features
       A.2.1. CONNECTION Related Transport Features
       A.2.2. DATA Transfer Related Transport Features
     A.3. Step 3: Discussion
       A.3.1. Sending Messages, Receiving Bytes
       A.3.2. Stream Schedulers Without Streams
       A.3.3. Early Data Transmission
       A.3.4. Sender Running Dry
       A.3.5. Capacity Profile
       A.3.6. Security
       A.3.7. Packet Size
   Appendix B. Revision information
   Authors' Addresses

1. Introduction

   The task of a transport system is to offer transport services to its
   applications, i.e. the applications running on top of the transport
   system. Ideally, it does so without statically binding applications
   to particular transport protocols. Currently, the set of transport
   services that most applications use is based on TCP and UDP (and
   protocols that are layered on top of them); this limits the ability
   for the network stack to make use of features of other transport
   protocols. For example, if a protocol supports out-of-order message
   delivery but applications always assume that the network provides an
   ordered bytestream, then the network stack cannot immediately
   deliver a message that arrives out-of-order: doing so would break a
   fundamental assumption of the application. The net result is
   unnecessary head-of-line blocking delay.

   By exposing the transport services of multiple transport protocols, a
   transport system can make it possible to use these services without
   having to statically bind an application to a specific transport
   protocol. The first step towards the design of such a system was
   taken by [RFC8095], which surveys a large number of transports, and
   [RFC8303] as well as [RFC8304], which identify the specific transport
   features that are exposed to applications by the protocols TCP,
   MPTCP, UDP(-Lite) and SCTP as well as the LEDBAT congestion control
   mechanism. This memo is based on these documents and follows the
   same terminology (also listed below). Because the considered
   transport protocols conjointly cover a wide range of transport
   features, there is reason to hope that the resulting set (and the
   reasoning that led to it) will also apply to many aspects of other
   transport protocols that may be in use today, or may be designed in
   the future.

   The number of transport features of current IETF transports is large,
   and exposing all of them has a number of disadvantages: generally,
   the more functionality is exposed, the less freedom a transport
   system has to automate usage of the various functions of its
   available set of transport protocols. Some functions only exist in
   one particular protocol, and if an application used them, this would
   statically tie the application to this protocol, limiting the
   flexibility of the transport system. Also, if the number of exposed
   features is exceedingly large, a transport system might become very
   difficult to use for an application programmer. Taking [RFC8303] as
   a basis, this document therefore develops a minimal set of transport
   features, removing the ones that could get in the way of transport
   flexibility but keeping the ones that must be retained for
   applications to benefit from useful transport functionality.

   Applications use a wide variety of APIs today. The transport
   features in the minimal set in this document must be reflected in
   *all* network APIs in order for the underlying functionality to
   become usable everywhere. For example, it does not help an
   application that talks to a library which offers its own
   communication interface if the underlying Berkeley Sockets API is
   extended to offer "unordered message delivery", but the library only
   exposes an ordered bytestream. Both the Berkeley Sockets API and the
   library would have to expose the "unordered message delivery"
   transport feature (alternatively, there may be ways for certain types
   of libraries to use this transport feature without exposing it, based
   on knowledge about the applications -- but this is not the general
   case).
   In most situations, in the interest of being as flexible and
   efficient as possible, the best choice will be for a library to
   expose at least all of the transport features that are recommended as
   a "minimal set" here.

   This "minimal set" can be implemented "one-sided" over TCP. This
   means that a sender-side transport system can talk to a standard TCP
   receiver, and a receiver-side transport system can talk to a standard
   TCP sender. If certain limitations are put in place, the "minimal
   set" can also be implemented "one-sided" over UDP.

2. Terminology

   Transport Feature: a specific end-to-end feature that the transport
      layer provides to an application. Examples include
      confidentiality, reliable delivery, ordered delivery,
      message-versus-stream orientation, etc.
   Transport Service: a set of Transport Features, without an
      association to any given framing protocol, which provides a
      complete service to an application.
   Transport Protocol: an implementation that provides one or more
      different transport services using a specific framing and header
      format on the wire.
   Transport Service Instance: an arrangement of transport protocols
      with a selected set of features and configuration parameters that
      implements a single transport service, e.g., a protocol stack (RTP
      over UDP).
   Application: an entity that uses the transport layer for end-to-end
      delivery of data across the network (this may also be an upper
      layer protocol or tunnel encapsulation).
   Application-specific knowledge: knowledge that only applications
      have.
   Endpoint: an entity that communicates with one or more other
      endpoints using a transport protocol.
   Connection: shared state of two or more endpoints that persists
      across messages that are transmitted between these endpoints.
   Connection Group: a set of connections which share the same
      configuration (configuring one of them causes all other
      connections in the same group to be configured in the same way).
      We call connections that belong to a connection group "grouped",
      while "ungrouped" connections are not a part of a connection
      group.
   Socket: the combination of a destination IP address and a
      destination port number.

   Moreover, throughout the document, the protocol name "UDP(-Lite)" is
   used when discussing transport features that are equivalent for UDP
   and UDP-Lite; similarly, the protocol name "TCP" refers to both TCP
   and MPTCP.

3. The Minimal Set of Transport Features

   Based on the categorization, reduction, and discussion in Appendix A,
   this section describes a minimal set of transport features that end
   systems should offer. The described transport system can be
   implemented over TCP. Elements of the system that are not marked
   with "!UDP" can also be implemented over UDP.

   The arguments laid out in Appendix A.3 ("discussion") were used to
   make the final representation of the minimal set as short, simple and
   general as possible. There may be situations where these arguments
   do not apply -- e.g., implementers may have specific reasons to
   expose multi-streaming as a visible functionality to applications, or
   the restrictive open / close semantics may be problematic under some
   circumstances. In such cases, the representation in Appendix A.2
   ("reduction") should be considered.

   As in Appendix A, Appendix A.2 and [RFC8303], we categorize the
   minimal set of transport features as 1) CONNECTION related
   (ESTABLISHMENT, AVAILABILITY, MAINTENANCE, TERMINATION) and 2) DATA
   Transfer related (Sending Data, Receiving Data, Errors).
   Here, the focus is on connections that the transport system offers
   as an abstraction to the application, as opposed to connections of
   transport protocols that the transport system uses.

3.1. ESTABLISHMENT, AVAILABILITY and TERMINATION

   A connection must first be "created" to allow for some initial
   configuration to be carried out before the transport system can
   actively or passively establish communication with a remote endpoint.
   All configuration parameters in Section 3.2 can be used initially,
   although some of them may only take effect when a connection has been
   established with a chosen transport protocol. Configuring a
   connection early helps a transport system make the right decisions.
   For example, grouping information can influence the transport system
   to implement a connection as a stream of a multi-streaming protocol's
   existing association or not.

   For ungrouped connections, early configuration is necessary because
   it allows the transport system to know which protocols it should try
   to use. In particular, a transport system that only makes a one-time
   choice for a particular protocol must know early about strict
   requirements that must be kept, or it can end up in a deadlock
   situation (e.g., having chosen UDP and later be asked to support
   reliable transfer). As an example description of how to correctly
   handle these cases, we provide the following decision tree (this is
   derived from Appendix A.2.1 excluding authentication, as explained in
   Section 7):

   - Will it ever be necessary to offer any of the following?
     * Reliably transfer data
     * Notify the peer of closing/aborting
     * Preserve data ordering

     Yes: SCTP or TCP can be used.
       - Is any of the following useful to the application?
         * Choosing a scheduler to operate between connections
           in a group, with the possibility to configure a priority
           or weight per connection
         * Configurable message reliability
         * Unordered message delivery
         * Request not to delay the acknowledgement (SACK) of a message

         Yes: SCTP is preferred.
         No:
           - Is any of the following useful to the application?
             * Hand over a message to reliably transfer (possibly
               multiple times) before connection establishment
             * Suggest timeout to the peer
             * Notification of Excessive Retransmissions (early
               warning below abortion threshold)
             * Notification of ICMP error message arrival

             Yes: TCP is preferred.
             No: SCTP and TCP are equally preferable.

     No: all protocols can be used.
       - Is any of the following useful to the application?
         * Specify checksum coverage used by the sender
         * Specify minimum checksum coverage required by receiver

         Yes: UDP-Lite is preferred.
         No: UDP is preferred.

   Note that this decision tree is not optimal for all cases. For
   example, if an application wants to use "Specify checksum coverage
   used by the sender", which is only offered by UDP-Lite, and
   "Configure priority or weight for a scheduler", which is only offered
   by SCTP, the above decision tree will always choose UDP-Lite, making
   it impossible to use SCTP's schedulers with priorities between
   grouped connections. We caution implementers to be aware of the full
   set of trade-offs, for which we recommend consulting the list in
   Appendix A.2.1 when deciding how to initialize a connection.

   To summarize, the following parameters serve as input for the
   transport system to help it choose and configure a suitable protocol:

   o Reliability: a boolean that should be set to true when any of the
     following will be useful to the application: reliably transfer
     data; notify the peer of closing/aborting; preserve data ordering.
   o Checksum coverage: a boolean to specify whether it will be useful
     to the application to specify checksum coverage when sending or
     receiving.
   o Configure message priority: a boolean that should be set to true
     when any of the following per-message configuration or
     prioritization mechanisms will be useful to the application:
     choosing a scheduler to operate between grouped connections, with
     the possibility to configure a priority or weight per connection;
     configurable message reliability; unordered message delivery;
     requesting not to delay the acknowledgement (SACK) of a message.
   o Early message timeout notifications: a boolean that should be set
     to true when any of the following will be useful to the
     application: hand over a message to reliably transfer (possibly
     multiple times) before connection establishment; suggest timeout
     to the peer; notification of excessive retransmissions (early
     warning below abortion threshold); notification of ICMP error
     message arrival.

   Once a connection is created, it can be queried for the maximum
   amount of data that an application can possibly expect to have
   reliably transmitted before or during transport connection
   establishment (with zero being a possible answer) (see
   Section 3.2.1). An application can also give the connection a
   message for reliable transmission before or during connection
   establishment (!UDP); the transport system will then try to transmit
   it as early as possible. An application can facilitate sending a
   message particularly early by marking it as "idempotent" (see
   Section 3.3.1); in this case, the receiving application must be
   prepared to potentially receive multiple copies of the message
   (because idempotent messages are reliably transferred, asking for
   idempotence is not necessary for systems that support UDP).
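
   As a rough illustration, the example decision tree and the four
   boolean input parameters summarized in this section could be
   sketched in Python as shown here. The names ConnectionHints and
   preferred_protocols are hypothetical; they are not defined by this
   document or by any existing API. The function only mirrors the
   example tree, including its stated limitation that a request for
   checksum coverage without reliability yields UDP-Lite even when
   scheduler priorities were also requested.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConnectionHints:
    """Hypothetical container for the four boolean input parameters."""
    reliability: bool = False                  # reliable transfer, close/abort notification, ordering
    checksum_coverage: bool = False            # sender/receiver checksum coverage
    message_priority: bool = False             # schedulers, per-message reliability, unordered delivery, SACK hint
    early_timeout_notifications: bool = False  # early data, timeout suggestion, retransmission/ICMP notifications


def preferred_protocols(hints: ConnectionHints) -> List[str]:
    """Return the protocols the example decision tree prefers, best first."""
    if hints.reliability:
        # "Yes" branch of the first question: SCTP or TCP can be used.
        if hints.message_priority:
            return ["SCTP"]
        if hints.early_timeout_notifications:
            return ["TCP"]
        return ["SCTP", "TCP"]  # equally preferable
    # "No" branch: all protocols can be used; the tree prefers UDP(-Lite).
    if hints.checksum_coverage:
        return ["UDP-Lite"]
    return ["UDP"]
```

   A caller would build the hints once, before establishment, and hand
   them to the transport system; for example,
   preferred_protocols(ConnectionHints(reliability=True)) leaves the
   choice between SCTP and TCP open. A real transport system would
   also have to account for protocol availability on the host and the
   path, which this sketch ignores.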

   After creation, a transport system can actively establish
   communication with a peer, or it can passively listen for incoming
   connection requests. Note that active establishment may or may not
   trigger a notification on the listening side. It is possible that
   the first notification on the listening side is the arrival of the
   first data that the active side sends (a receiver-side transport
   system could handle this by continuing to block a "Listen" call,
   immediately followed by issuing "Receive", for example;
   callback-based implementations could simply skip the equivalent of
   "Listen"). This also means that the active opening side is assumed
   to be the first side sending data.

   A transport system can actively close a connection, i.e. terminate it
   after reliably delivering all remaining data to the peer (if reliable
   data delivery was requested earlier (!UDP)), in which case the peer
   is notified that the connection is closed. Alternatively, a
   connection can be aborted without delivering outstanding data to the
   peer. In case reliable or partially reliable data delivery was
   requested earlier (!UDP), the peer is notified that the connection is
   aborted. A timeout can be configured to abort a connection when data
   could not be delivered for too long (!UDP); however, timeout-based
   abortion does not notify the peer application that the connection has
   been aborted. Because half-closed connections are not supported,
   when a host implementing a transport system receives a notification
   that the peer is closing or aborting the connection (!UDP), its peer
   may not be able to read outstanding data. This means that
   unacknowledged data residing in a transport system's send buffer may
   have to be dropped from that buffer upon arrival of a "close" or
   "abort" notification from the peer.

3.2. MAINTENANCE

   A transport system must offer means to group connections, but it
   cannot guarantee truly grouping them using the transport protocols
   that it uses (e.g., it cannot be guaranteed that connections become
   multiplexed as streams on a single SCTP association when SCTP may not
   be available). The transport system must therefore ensure that
   group- versus non-group-configurations are handled correctly in some
   way (e.g., by applying the configuration to all grouped connections
   even when they are not multiplexed, or informing the application
   about grouping success or failure).

   As a general rule, any configuration described below should be
   carried out as early as possible to aid the transport system's
   decision making.

3.2.1. Connection groups

   The following transport features and notifications (some directly
   from Appendix A.2, some new or changed, based on the discussion in
   Appendix A.3) automatically apply to all grouped connections:

   (!UDP) Configure a timeout: this can be done with the following
   parameters:

   o A timeout value for aborting connections, in seconds
   o A timeout value to be suggested to the peer (if possible), in
     seconds
   o The number of retransmissions after which the application should
     be notified of "Excessive Retransmissions"

   Configure urgency: this can be done with the following parameters:

   o A number to identify the type of scheduler that should be used to
     operate between connections in the group (no guarantees given).
     Schedulers are defined in [RFC8260].
   o A "capacity profile" number to identify how an application wants
     to use its available capacity. Choices can be "lowest possible
     latency at the expense of overhead" (which would disable any
     Nagle-like algorithm), "scavenger", or values that help determine
     the DSCP value for a connection (e.g. similar to table 1 in
     [I-D.ietf-tsvwg-rtcweb-qos]).
   o A buffer limit (in bytes); when the sender has less than the
     provided limit of bytes in the buffer, the application may be
     notified. Notifications are not guaranteed, and it is optional
     for a transport system to support buffer limit values greater than
     0. Note that this limit and its notification should operate
     across the buffers of the whole transport system, i.e. also any
     potential buffers that the transport system itself may use on top
     of the transport's send buffer.

   Following Appendix A.3.7, these properties can be queried:

   o The maximum message size that may be sent without fragmentation
     via the configured interface. This is optional for a transport
     system to offer, and may return an error ("not available"). It
     can aid applications implementing Path MTU Discovery.
   o The maximum transport message size that can be sent, in bytes.
     Irrespective of fragmentation, there is a size limit for the
     messages that can be handed over to SCTP or UDP(-Lite); because
     the service provided by a transport system is independent of the
     transport protocol, it must allow an application to query this
     value -- the maximum size of a message in an
     Application-Framed-Bytestream (see Appendix A.3.1). This may also
     return an error when data is not delimited ("not available").
   o The maximum transport message size that can be received from the
     configured interface, in bytes (or "not available").
   o The maximum amount of data that can possibly be sent before or
     during connection establishment, in bytes.

   In addition to the already mentioned closing / aborting notifications
   and possible send errors, the following notifications can occur:

   o Excessive Retransmissions: the configured (or a default) number of
     retransmissions has been reached, yielding this early warning
     below an abortion threshold.
   o ICMP Arrival (parameter: ICMP message): an ICMP packet carrying
     the conveyed ICMP message has arrived.
   o ECN Arrival (parameter: ECN value): a packet carrying the conveyed
     ECN value has arrived. This can be useful for applications
     implementing congestion control.
   o Timeout (parameter: s seconds): data could not be delivered for s
     seconds.
   o Drain: the send buffer has either drained below the configured
     buffer limit or it has become completely empty. This is a generic
     notification that tries to enable uniform access to
     "TCP_NOTSENT_LOWAT" as well as the "SENDER DRY" notification (as
     discussed in Appendix A.3.4 -- SCTP's "SENDER DRY" is a special
     case where the threshold (for unsent data) is 0 and there is also
     no more unacknowledged data in the send buffer).

3.2.2. Individual connections

   Configure priority or weight for a scheduler, as described in
   [RFC8260].

   Configure checksum usage: this can be done with the following
   parameters, but there is no guarantee that any checksum limitations
   will indeed be enforced (the default behavior is "full coverage,
   checksum enabled"):

   o A boolean to enable / disable usage of a checksum when sending
   o The desired coverage (in bytes) of the checksum used when sending
   o A boolean to enable / disable requiring a checksum when receiving
   o The required minimum coverage (in bytes) of the checksum when
     receiving

3.3. DATA Transfer

3.3.1. Sending Data

   When sending a message, no guarantees are given about the
   preservation of message boundaries to the peer; if message boundaries
   are needed, the receiving application at the peer must know about
   them beforehand (or the transport system cannot use TCP). Note that
   an application should already be able to hand over data before the
   transport system establishes a connection with a chosen transport
   protocol.
   Regarding the message that is being handed over, the following
   parameters can be used:

   o Reliability: this parameter is used to convey a choice of: fully
     reliable with congestion control (!UDP), unreliable without
     congestion control, unreliable with congestion control (!UDP),
     partially reliable with congestion control (see [RFC3758] and
     [RFC7496] for details on how to specify partial reliability)
     (!UDP). The latter two choices are optional for a transport
     system to offer and may result in full reliability. Note that
     applications sending unreliable data without congestion control
     should themselves perform congestion control in accordance with
     [RFC2914].
   o (!UDP) Ordered: this boolean parameter lets an application choose
     between ordered message delivery (true) and possibly unordered,
     potentially faster message delivery (false).
   o Bundle: a boolean that expresses a preference for allowing
     messages to be bundled (true) or not (false). No guarantees are
     given.
   o DelAck: a boolean that, if false, lets an application request that
     the peer not delay the acknowledgement for this message.
   o Fragment: a boolean that expresses a preference for allowing
     messages to be fragmented (true) or not (false), at the IP level.
     No guarantees are given.
   o (!UDP) Idempotent: a boolean that expresses whether a message is
     idempotent (true) or not (false). Idempotent messages may arrive
     multiple times at the receiver (but they will arrive at least
     once). When data is idempotent it can be used by the receiver
     immediately on a connection establishment attempt. Thus, if data
     is handed over before the transport system establishes a
     connection with a chosen transport protocol, stating that a
     message is idempotent facilitates transmitting it to the peer
     application particularly early.

   An application can be notified of a failure to send a specific
   message.
   There is no guarantee of such notifications, i.e. send failures can
   also silently occur.

3.3.2. Receiving Data

   A receiving application obtains an "Application-Framed Bytestream"
   (AFra-Bytestream; this concept is further described in
   Appendix A.3.1). In line with TCP's receiver semantics, an
   AFra-Bytestream is just a stream of bytes to the receiver. If
   message boundaries were specified by the sender, a receiver-side
   transport system implementing only the minimum set of transport
   services defined here will still not inform the receiving
   application about them (this limitation is only needed for transport
   systems that are implemented to directly use TCP).

   Different from TCP's semantics, if the sending application has
   allowed that messages are not fully reliably transferred, or
   delivered out of order, then such re-ordering or unreliability may be
   reflected per message in the arriving data. Messages will always
   stay intact - i.e. if an incomplete message is contained at the end
   of the arriving data block, this message is guaranteed to continue in
   the next arriving data block.

4. Conclusion

   By decoupling applications from transport protocols, a transport
   system provides a different abstraction level than the Berkeley
   sockets interface. As with high- vs. low-level programming
   languages, a higher abstraction level allows more freedom for
   automation below the interface, yet it takes some control away from
   the application programmer. This is the design trade-off that a
   transport system developer is facing, and this document provides
   guidance on the design of this abstraction level. Some transport
   features are currently rarely offered by APIs, yet they must be
   offered or they can never be used ("functional" transport features).
   Other transport features are offered by the APIs of the protocols
   covered here, but not exposing them in an API would allow for more
   freedom to automate protocol usage in a transport system. The
   minimal set presented in this document is an effort to find a middle
   ground that can be recommended for transport systems to implement, on
   the basis of the transport features discussed in [RFC8303].

5. Acknowledgements

   The authors would like to thank all the participants of the TAPS
   Working Group and the NEAT and MAMI research projects for valuable
   input to this document. We especially thank Michael Tuexen for help
   with connection establishment/teardown, Gorry Fairhurst for his
   suggestions regarding fragmentation and packet sizes, and Spencer
   Dawkins for his extremely detailed and constructive review. This
   work has received funding from the European Union's Horizon 2020
   research and innovation programme under grant agreement No. 644334
   (NEAT).

6. IANA Considerations

   XX RFC ED - PLEASE REMOVE THIS SECTION XXX

   This memo includes no request to IANA.

7. Security Considerations

   Authentication, confidentiality protection, and integrity protection
   are identified as transport features by [RFC8095]. As currently
   deployed in the Internet, these features are generally provided by a
   protocol or layer on top of the transport protocol; no current
   full-featured standards-track transport protocol provides all of
   these transport features on its own. Therefore, these transport
   features are not considered in this document, with the exception of
   native authentication capabilities of TCP and SCTP for which the
   security considerations in [RFC5925] and [RFC4895] apply. The
   minimum requirements for a secure transport system are discussed in a
   separate document (Section 5 of [I-D.ietf-taps-transport-security]).

8. References

8.1. Normative References

   [RFC8303]  Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of
              Transport Features Provided by IETF Transport Protocols",
              RFC 8303, DOI 10.17487/RFC8303, February 2018.

8.2. Informative References

   [COBS]     Cheshire, S. and M. Baker, "Consistent Overhead Byte
              Stuffing", September 1997.

   [I-D.ietf-taps-transport-security]
              Pauly, T., Perkins, C., Rose, K., and C. Wood, "A Survey
              of Transport Security Protocols",
              draft-ietf-taps-transport-security-01 (work in progress),
              May 2018.

   [I-D.ietf-tsvwg-rtcweb-qos]
              Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP
              Packet Markings for WebRTC QoS",
              draft-ietf-tsvwg-rtcweb-qos-18 (work in progress),
              August 2016.

   [LBE-draft]
              Bless, R., "A Lower Effort Per-Hop Behavior (LE PHB)",
              Internet-draft draft-tsvwg-le-phb-03, February 2018.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, DOI 10.17487/RFC2914, September 2000.

   [RFC3758]  Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P.
              Conrad, "Stream Control Transmission Protocol (SCTP)
              Partial Reliability Extension", RFC 3758,
              DOI 10.17487/RFC3758, May 2004.

   [RFC4895]  Tuexen, M., Stewart, R., Lei, P., and E. Rescorla,
              "Authenticated Chunks for the Stream Control Transmission
              Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895,
              August 2007.

   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
              Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
              June 2010.

   [RFC7305]  Lear, E., Ed., "Report from the IAB Workshop on Internet
              Technology Adoption and Transition (ITAT)", RFC 7305,
              DOI 10.17487/RFC7305, July 2014.

   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A.
Jain, "TCP 642 Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, 643 . 645 [RFC7496] Tuexen, M., Seggelmann, R., Stewart, R., and S. Loreto, 646 "Additional Policies for the Partially Reliable Stream 647 Control Transmission Protocol Extension", RFC 7496, 648 DOI 10.17487/RFC7496, April 2015, 649 . 651 [RFC8095] Fairhurst, G., Ed., Trammell, B., Ed., and M. Kuehlewind, 652 Ed., "Services Provided by IETF Transport Protocols and 653 Congestion Control Mechanisms", RFC 8095, 654 DOI 10.17487/RFC8095, March 2017, 655 . 657 [RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann, 658 "Stream Schedulers and User Message Interleaving for the 659 Stream Control Transmission Protocol", RFC 8260, 660 DOI 10.17487/RFC8260, November 2017, 661 . 663 [RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the 664 User Datagram Protocol (UDP) and Lightweight UDP (UDP- 665 Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018, 666 . 668 [WWDC2015] 669 Lakhera, P. and S. Cheshire, "Your App and Next Generation 670 Networks", Apple Worldwide Developers Conference 2015, San 671 Francisco, USA, June 2015, 672 . 674 Appendix A. Deriving the minimal set 676 We approach the construction of a minimal set of transport features 677 in the following way: 679 1. Categorization (Appendix A.1): the superset of transport features 680 from [RFC8303] is presented, and transport features are 681 categorized for later reduction. 682 2. Reduction (Appendix A.2): a shorter list of transport features is 683 derived from the categorization in the first step. This removes 684 all transport features that do not require application-specific 685 knowledge or would result in semantically incorrect behavior if 686 they were implemented over TCP or UDP. 687 3. Discussion (Appendix A.3): the resulting list shows a number of 688 peculiarities that are discussed, to provide a basis for 689 constructing the minimal set. 690 4. 
Construction (Section 3): Based on the reduced set and the 691 discussion of the transport features therein, a minimal set is 692 constructed. 694 A.1. Step 1: Categorization -- The Superset of Transport Features 696 Following [RFC8303], we divide the transport features into two main 697 groups as follows: 699 1. CONNECTION related transport features 700 - ESTABLISHMENT 701 - AVAILABILITY 702 - MAINTENANCE 703 - TERMINATION 705 2. DATA Transfer related transport features 706 - Sending Data 707 - Receiving Data 708 - Errors 710 We assume that applications have no specific requirements that need 711 knowledge about the network, e.g. regarding the choice of network 712 interface or the end-to-end path. Even with these assumptions, there 713 are certain requirements that are strictly kept by transport 714 protocols today, and these must also be kept by a transport system. 715 Some of these requirements relate to transport features that we call 716 "Functional". 718 Functional transport features provide functionality that cannot be 719 used without the application knowing about them, or else they violate 720 assumptions that might cause the application to fail. For example, 721 ordered message delivery is a functional transport feature: it cannot 722 be configured without the application knowing about it because the 723 application's assumption could be that messages always arrive in 724 order. Failure includes any change of the application behavior that 725 is not performance oriented, e.g. security. 727 "Change DSCP" and "Disable Nagle algorithm" are examples of transport 728 features that we call "Optimizing": if a transport system 729 autonomously decides to enable or disable them, an application will 730 not fail, but a transport system may be able to communicate more 731 efficiently if the application is in control of this optimizing 732 transport feature. 
These transport features require application- 733 specific knowledge (e.g., about delay/bandwidth requirements or the 734 length of future data blocks that are to be transmitted). 736 The transport features of IETF transport protocols that do not 737 require application-specific knowledge and could therefore be 738 utilized by a transport system on its own without involving the 739 application are called "Automatable". 741 Finally, some transport features are aggregated and/or slightly 742 changed from [RFC8303] in the description below. These transport 743 features are marked as "ADDED". The corresponding transport features 744 are automatable, and they are listed immediately below the "ADDED" 745 transport feature. 747 In this description, transport services are presented following the 748 nomenclature "CATEGORY.[SUBCATEGORY].SERVICENAME.PROTOCOL", 749 equivalent to "pass 2" in [RFC8303]. We also sketch how functional 750 or optimizing transport features can be implemented by a transport 751 system. The "minimal set" derived in this document is meant to be 752 implementable "one-sided" over TCP, and, with limitations, UDP. 753 Hence, for all transport features that are categorized as 754 "functional" or "optimizing", and for which no matching TCP and/or 755 UDP primitive exists in "pass 2" of [RFC8303], a brief discussion on 756 how to implement them over TCP and/or UDP is included. 758 We designate some transport features as "automatable" on the basis of 759 a broader decision that affects multiple transport features: 761 o Most transport features that are related to multi-streaming were 762 designated as "automatable". This was done because the decision 763 on whether to use multi-streaming or not does not depend on 764 application-specific knowledge. 
This means that a connection that 765 is exhibited to an application could be implemented by using a 766 single stream of an SCTP association instead of mapping it to a 767 complete SCTP association or TCP connection. This could be 768 achieved by using more than one stream when an SCTP association is 769 first established (CONNECT.SCTP parameter "outbound stream 770 count"), maintaining an internal stream number, and using this 771 stream number when sending data (SEND.SCTP parameter "stream 772 number"). Closing or aborting a connection could then simply free 773 the stream number for future use. This is discussed further in 774 Appendix A.3.2. 775 o All transport features that are related to using multiple paths or 776 the choice of the network interface were designated as 777 "automatable". Choosing a path or an interface does not depend on 778 application-specific knowledge. For example, "Listen" could 779 always listen on all available interfaces and "Connect" could use 780 the default interface for the destination IP address. 782 A.1.1. CONNECTION Related Transport Features 784 ESTABLISHMENT: 786 o Connect 787 Protocols: TCP, SCTP, UDP(-Lite) 788 Functional because the notion of a connection is often reflected 789 in applications as an expectation to be able to communicate after 790 a "Connect" succeeded, with a communication sequence relating to 791 this transport feature that is defined by the application 792 protocol. 793 Implementation: via CONNECT.TCP, CONNECT.SCTP or CONNECT.UDP(- 794 Lite). 796 o Specify which IP Options must always be used 797 Protocols: TCP, UDP(-Lite) 798 Automatable because IP Options relate to knowledge about the 799 network, not the application. 801 o Request multiple streams 802 Protocols: SCTP 803 Automatable because using multi-streaming does not require 804 application-specific knowledge. 805 Implementation: see Appendix A.3.2. 
807 o Limit the number of inbound streams 808 Protocols: SCTP 809 Automatable because using multi-streaming does not require 810 application-specific knowledge. 811 Implementation: see Appendix A.3.2. 813 o Specify number of attempts and/or timeout for the first 814 establishment message 815 Protocols: TCP, SCTP 816 Functional because this is closely related to potentially assumed 817 reliable data delivery for data that is sent before or during 818 connection establishment. 819 Implementation: Using a parameter of CONNECT.TCP and CONNECT.SCTP. 820 Implementation over UDP: Do nothing (this is irrelevant in case of 821 UDP because there, reliable data delivery is not assumed). 823 o Obtain multiple sockets 824 Protocols: SCTP 825 Automatable because the usage of multiple paths to communicate to 826 the same end host relates to knowledge about the network, not the 827 application. 829 o Disable MPTCP 830 Protocols: MPTCP 831 Automatable because the usage of multiple paths to communicate to 832 the same end host relates to knowledge about the network, not the 833 application. 834 Implementation: via a boolean parameter in CONNECT.MPTCP. 836 o Configure authentication 837 Protocols: TCP, SCTP 838 Functional because this has a direct influence on security. 839 Implementation: via parameters in CONNECT.TCP and CONNECT.SCTP. 840 With TCP, this allows to configure Master Key Tuples (MKTs) to 841 authenticate complete segments (including the TCP IPv4 842 pseudoheader, TCP header, and TCP data). With SCTP, this allows 843 to specify which chunk types must always be authenticated. 844 Authenticating only certain chunk types creates a reduced level of 845 security that is not supported by TCP; to be compatible, this 846 should therefore only allow to authenticate all chunk types. Key 847 material must be provided in a way that is compatible with both 848 [RFC4895] and [RFC5925]. 849 Implementation over UDP: Not possible (UDP does not offer this 850 functionality). 
852 o Indicate (and/or obtain upon completion) an Adaptation Layer via 853 an adaptation code point 854 Protocols: SCTP 855 Functional because it allows to send extra data for the sake of 856 identifying an adaptation layer, which by itself is application- 857 specific. 858 Implementation: via a parameter in CONNECT.SCTP. 859 Implementation over TCP: not possible (TCP does not offer this 860 functionality). 861 Implementation over UDP: not possible (UDP does not offer this 862 functionality). 864 o Request to negotiate interleaving of user messages 865 Protocols: SCTP 866 Automatable because it requires using multiple streams, but 867 requesting multiple streams in the CONNECTION.ESTABLISHMENT 868 category is automatable. 869 Implementation: via a parameter in CONNECT.SCTP. 871 o Hand over a message to reliably transfer (possibly multiple times) 872 before connection establishment 873 Protocols: TCP 874 Functional because this is closely tied to properties of the data 875 that an application sends or expects to receive. 876 Implementation: via a parameter in CONNECT.TCP. 877 Implementation over UDP: not possible (UDP does not provide 878 reliability). 880 o Hand over a message to reliably transfer during connection 881 establishment 882 Protocols: SCTP 883 Functional because this can only work if the message is limited in 884 size, making it closely tied to properties of the data that an 885 application sends or expects to receive. 886 Implementation: via a parameter in CONNECT.SCTP. 887 Implementation over TCP: not possible (TCP does not allow 888 identification of message boundaries because it provides a byte 889 stream service) 890 Implementation over UDP: not possible (UDP is unreliable). 892 o Enable UDP encapsulation with a specified remote UDP port number 893 Protocols: SCTP 894 Automatable because UDP encapsulation relates to knowledge about 895 the network, not the application. 
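The categorization of the ESTABLISHMENT transport features above can be illustrated with a small table. The following is only a minimal sketch, not part of [RFC8303] or of this document's recommendations: the names "Category", "Feature", "ESTABLISHMENT" and "exposed_to_application" are hypothetical, the table covers only a subset of the features listed above, and the filter shows only the "automatable" part of the reduction step (it does not model features that are dropped because they cannot be implemented over TCP or UDP).

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    FUNCTIONAL = "functional"
    OPTIMIZING = "optimizing"
    AUTOMATABLE = "automatable"


@dataclass(frozen=True)
class Feature:
    name: str
    protocols: tuple
    category: Category


# A subset of the ESTABLISHMENT features listed above, with the
# category that this document assigns to each of them.
ESTABLISHMENT = [
    Feature("Connect", ("TCP", "SCTP", "UDP(-Lite)"), Category.FUNCTIONAL),
    Feature("Specify which IP Options must always be used",
            ("TCP", "UDP(-Lite)"), Category.AUTOMATABLE),
    Feature("Request multiple streams", ("SCTP",), Category.AUTOMATABLE),
    Feature("Disable MPTCP", ("MPTCP",), Category.AUTOMATABLE),
    Feature("Configure authentication", ("TCP", "SCTP"), Category.FUNCTIONAL),
    Feature("Enable UDP encapsulation with a specified remote UDP port number",
            ("SCTP",), Category.AUTOMATABLE),
]


def exposed_to_application(features):
    """Return the names of features that are not automatable.

    Automatable features can be handled by the transport system on its
    own, so only the remaining ones are candidates for the API.
    """
    return [f.name for f in features if f.category is not Category.AUTOMATABLE]


if __name__ == "__main__":
    for name in exposed_to_application(ESTABLISHMENT):
        print(name)
```

Keeping the category next to each feature makes the later reduction a simple filter, which mirrors how this appendix derives the shorter list from the superset.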
897 AVAILABILITY: 899 o Listen 900 Protocols: TCP, SCTP, UDP(-Lite) 901 Functional because the notion of accepting connection requests is 902 often reflected in applications as an expectation to be able to 903 communicate after a "Listen" succeeded, with a communication 904 sequence relating to this transport feature that is defined by the 905 application protocol. 906 ADDED. This differs from the 3 automatable transport features 907 below in that it leaves the choice of interfaces for listening 908 open. 909 Implementation: by listening on all interfaces via LISTEN.TCP (not 910 providing a local IP address) or LISTEN.SCTP (providing SCTP port 911 number / address pairs for all local IP addresses). LISTEN.UDP(- 912 Lite) supports both methods. 914 o Listen, 1 specified local interface 915 Protocols: TCP, SCTP, UDP(-Lite) 916 Automatable because decisions about local interfaces relate to 917 knowledge about the network and the Operating System, not the 918 application. 920 o Listen, N specified local interfaces 921 Protocols: SCTP 922 Automatable because decisions about local interfaces relate to 923 knowledge about the network and the Operating System, not the 924 application. 926 o Listen, all local interfaces 927 Protocols: TCP, SCTP, UDP(-Lite) 928 Automatable because decisions about local interfaces relate to 929 knowledge about the network and the Operating System, not the 930 application. 932 o Specify which IP Options must always be used 933 Protocols: TCP, UDP(-Lite) 934 Automatable because IP Options relate to knowledge about the 935 network, not the application. 937 o Disable MPTCP 938 Protocols: MPTCP 939 Automatable because the usage of multiple paths to communicate to 940 the same end host relates to knowledge about the network, not the 941 application. 943 o Configure authentication 944 Protocols: TCP, SCTP 945 Functional because this has a direct influence on security. 946 Implementation: via parameters in LISTEN.TCP and LISTEN.SCTP. 
947 Implementation over TCP: With TCP, this allows to configure Master 948 Key Tuples (MKTs) to authenticate complete segments (including the 949 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 950 allows to specify which chunk types must always be authenticated. 951 Authenticating only certain chunk types creates a reduced level of 952 security that is not supported by TCP; to be compatible, this 953 should therefore only allow to authenticate all chunk types. Key 954 material must be provided in a way that is compatible with both 955 [RFC4895] and [RFC5925]. 956 Implementation over UDP: not possible (UDP does not offer 957 authentication). 959 o Obtain requested number of streams 960 Protocols: SCTP 961 Automatable because using multi-streaming does not require 962 application-specific knowledge. 963 Implementation: see Appendix A.3.2. 965 o Limit the number of inbound streams 966 Protocols: SCTP 967 Automatable because using multi-streaming does not require 968 application-specific knowledge. 969 Implementation: see Appendix A.3.2. 971 o Indicate (and/or obtain upon completion) an Adaptation Layer via 972 an adaptation code point 973 Protocols: SCTP 974 Functional because it allows to send extra data for the sake of 975 identifying an adaptation layer, which by itself is application- 976 specific. 977 Implementation: via a parameter in LISTEN.SCTP. 978 Implementation over TCP: not possible (TCP does not offer this 979 functionality). 980 Implementation over UDP: not possible (UDP does not offer this 981 functionality). 983 o Request to negotiate interleaving of user messages 984 Protocols: SCTP 985 Automatable because it requires using multiple streams, but 986 requesting multiple streams in the CONNECTION.ESTABLISHMENT 987 category is automatable. 988 Implementation: via a parameter in LISTEN.SCTP. 
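The "Listen" implementation described above (listening on all interfaces by not providing a local IP address) can be sketched with the Berkeley sockets interface. This is an illustration under stated assumptions, not a definitive implementation: the function name "listen_all_interfaces" and its parameters are hypothetical, only TCP over IPv4 is shown, and SCTP (which would need port number / address pairs for all local IP addresses) is not covered.

```python
import socket


def listen_all_interfaces(port=0, backlog=16):
    """Open a TCP listener without naming a local IP address.

    Hypothetical helper: port 0 lets the operating system pick a free
    port, and the backlog value is an arbitrary choice for this sketch.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # An empty host string stands for INADDR_ANY, i.e. no specific
    # local IP address is provided and all interfaces are used.
    sock.bind(("", port))
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    listener = listen_all_interfaces()
    print("listening on", listener.getsockname())
    listener.close()
```

Because the application never names an interface, the choice of interfaces stays with the transport system, which is what makes the three interface-specific "Listen" variants automatable.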
990 MAINTENANCE: 992 o Change timeout for aborting connection (using retransmit limit or 993 time value) 994 Protocols: TCP, SCTP 995 Functional because this is closely related to potentially assumed 996 reliable data delivery. 997 Implementation: via CHANGE_TIMEOUT.TCP or CHANGE_TIMEOUT.SCTP. 998 Implementation over UDP: not possible (UDP is unreliable and there 999 is no connection timeout). 1001 o Suggest timeout to the peer 1002 Protocols: TCP 1003 Functional because this is closely related to potentially assumed 1004 reliable data delivery. 1005 Implementation: via CHANGE_TIMEOUT.TCP. 1006 Implementation over UDP: not possible (UDP is unreliable and there 1007 is no connection timeout). 1009 o Disable Nagle algorithm 1010 Protocols: TCP, SCTP 1011 Optimizing because this decision depends on knowledge about the 1012 size of future data blocks and the delay between them. 1013 Implementation: via DISABLE_NAGLE.TCP and DISABLE_NAGLE.SCTP. 1014 Implementation over UDP: do nothing (UDP does not implement the 1015 Nagle algorithm). 1017 o Request an immediate heartbeat, returning success/failure 1018 Protocols: SCTP 1019 Automatable because this informs about network-specific knowledge. 1021 o Notification of Excessive Retransmissions (early warning below 1022 abortion threshold) 1023 Protocols: TCP 1024 Optimizing because it is an early warning to the application, 1025 informing it of an impending functional event. 1026 Implementation: via ERROR.TCP. 1027 Implementation over UDP: do nothing (there is no abortion 1028 threshold). 1030 o Add path 1031 Protocols: MPTCP, SCTP 1032 MPTCP Parameters: source-IP; source-Port; destination-IP; 1033 destination-Port 1034 SCTP Parameters: local IP address 1035 Automatable because the usage of multiple paths to communicate to 1036 the same end host relates to knowledge about the network, not the 1037 application. 
1039 o Remove path 1040 Protocols: MPTCP, SCTP 1041 MPTCP Parameters: source-IP; source-Port; destination-IP; 1042 destination-Port 1043 SCTP Parameters: local IP address 1044 Automatable because the usage of multiple paths to communicate to 1045 the same end host relates to knowledge about the network, not the 1046 application. 1048 o Set primary path 1049 Protocols: SCTP 1050 Automatable because the usage of multiple paths to communicate to 1051 the same end host relates to knowledge about the network, not the 1052 application. 1054 o Suggest primary path to the peer 1055 Protocols: SCTP 1056 Automatable because the usage of multiple paths to communicate to 1057 the same end host relates to knowledge about the network, not the 1058 application. 1060 o Configure Path Switchover 1061 Protocols: SCTP 1062 Automatable because the usage of multiple paths to communicate to 1063 the same end host relates to knowledge about the network, not the 1064 application. 1066 o Obtain status (query or notification) 1067 Protocols: SCTP, MPTCP 1068 SCTP parameters: association connection state; destination 1069 transport address list; destination transport address reachability 1070 states; current local and peer receiver window size; current local 1071 congestion window sizes; number of unacknowledged DATA chunks; 1072 number of DATA chunks pending receipt; primary path; most recent 1073 SRTT on primary path; RTO on primary path; SRTT and RTO on other 1074 destination addresses; MTU per path; interleaving supported yes/no 1075 MPTCP parameters: subflow-list (identified by source-IP; source- 1076 Port; destination-IP; destination-Port) 1077 Automatable because these parameters relate to knowledge about the 1078 network, not the application. 1080 o Specify DSCP field 1081 Protocols: TCP, SCTP, UDP(-Lite) 1082 Optimizing because choosing a suitable DSCP value requires 1083 application-specific knowledge. 
1084 Implementation: via SET_DSCP.TCP / SET_DSCP.SCTP / SET_DSCP.UDP(- 1085 Lite) 1087 o Notification of ICMP error message arrival 1088 Protocols: TCP, UDP(-Lite) 1089 Optimizing because these messages can inform about success or 1090 failure of functional transport features (e.g., host unreachable 1091 relates to "Connect") 1092 Implementation: via ERROR.TCP or ERROR.UDP(-Lite). 1094 o Obtain information about interleaving support 1095 Protocols: SCTP 1096 Automatable because it requires using multiple streams, but 1097 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1098 category is automatable. 1099 Implementation: via STATUS.SCTP. 1101 o Change authentication parameters 1102 Protocols: TCP, SCTP 1103 Functional because this has a direct influence on security. 1104 Implementation: via SET_AUTH.TCP and SET_AUTH.SCTP. 1105 Implementation over TCP: With SCTP, this allows to adjust key_id, 1106 key, and hmac_id. With TCP, this allows to change the preferred 1107 outgoing MKT (current_key) and the preferred incoming MKT 1108 (rnext_key), respectively, for a segment that is sent on the 1109 connection. Key material must be provided in a way that is 1110 compatible with both [RFC4895] and [RFC5925]. 1111 Implementation over UDP: not possible (UDP does not offer 1112 authentication). 1114 o Obtain authentication information 1115 Protocols: SCTP 1116 Functional because authentication decisions may have been made by 1117 the peer, and this has an influence on the necessary application- 1118 level measures to provide a certain level of security. 1119 Implementation: via GET_AUTH.SCTP. 1121 Implementation over TCP: With SCTP, this allows to obtain key_id 1122 and a chunk list. With TCP, this allows to obtain current_key and 1123 rnext_key from a previously received segment. Key material must 1124 be provided in a way that is compatible with both [RFC4895] and 1125 [RFC5925]. 1126 Implementation over UDP: not possible (UDP does not offer 1127 authentication). 
1129 o Reset Stream 1130 Protocols: SCTP 1131 Automatable because using multi-streaming does not require 1132 application-specific knowledge. 1133 Implementation: see Appendix A.3.2. 1135 o Notification of Stream Reset 1136 Protocols: SCTP 1137 Automatable because using multi-streaming does not require 1138 application-specific knowledge. 1139 Implementation: see Appendix A.3.2. 1141 o Reset Association 1142 Protocols: SCTP 1143 Automatable because deciding to reset an association does not 1144 require application-specific knowledge. 1145 Implementation: via RESET_ASSOC.SCTP. 1147 o Notification of Association Reset 1148 Protocols: SCTP 1149 Automatable because this notification does not relate to 1150 application-specific knowledge. 1152 o Add Streams 1153 Protocols: SCTP 1154 Automatable because using multi-streaming does not require 1155 application-specific knowledge. 1156 Implementation: see Appendix A.3.2. 1158 o Notification of Added Stream 1159 Protocols: SCTP 1160 Automatable because using multi-streaming does not require 1161 application-specific knowledge. 1162 Implementation: see Appendix A.3.2. 1164 o Choose a scheduler to operate between streams of an association 1165 Protocols: SCTP 1166 Optimizing because the scheduling decision requires application- 1167 specific knowledge. However, if a transport system would not use 1168 this, or wrongly configure it on its own, this would only affect 1169 the performance of data transfers; the outcome would still be 1170 correct within the "best effort" service model. 1171 Implementation: using SET_STREAM_SCHEDULER.SCTP. 1172 Implementation over TCP: do nothing (streams are not available in 1173 TCP, but no guarantee is given that this transport feature has any 1174 effect). 1175 Implementation over UDP: do nothing (streams are not available in 1176 UDP, but no guarantee is given that this transport feature has any 1177 effect).
1179 o Configure priority or weight for a scheduler 1180 Protocols: SCTP 1181 Optimizing because the priority or weight requires application- 1182 specific knowledge. However, if a transport system would not use 1183 this, or wrongly configure it on its own, this would only affect 1184 the performance of data transfers; the outcome would still be 1185 correct within the "best effort" service model. 1186 Implementation: using CONFIGURE_STREAM_SCHEDULER.SCTP. 1187 Implementation over TCP: do nothing (streams are not available in 1188 TCP, but no guarantee is given that this transport feature has any 1189 effect). 1190 Implementation over UDP: do nothing (streams are not available in 1191 UDP, but no guarantee is given that this transport feature has any 1192 effect). 1194 o Configure send buffer size 1195 Protocols: SCTP 1196 Automatable because this decision relates to knowledge about the 1197 network and the Operating System, not the application (see also 1198 the discussion in Appendix A.3.4). 1200 o Configure receive buffer (and rwnd) size 1201 Protocols: SCTP 1202 Automatable because this decision relates to knowledge about the 1203 network and the Operating System, not the application. 1205 o Configure message fragmentation 1206 Protocols: SCTP 1207 Automatable because fragmentation relates to knowledge about the 1208 network and the Operating System, not the application. 1209 Implementation: by always enabling it with 1210 CONFIG_FRAGMENTATION.SCTP and auto-setting the fragmentation size 1211 based on network or Operating System conditions. 1213 o Configure PMTUD 1214 Protocols: SCTP 1215 Automatable because Path MTU Discovery relates to knowledge about 1216 the network, not the application. 
1218 o Configure delayed SACK timer 1219 Protocols: SCTP 1220 Automatable because the receiver-side decision to delay sending 1221 SACKs relates to knowledge about the network, not the application 1222 (it can be relevant for a sending application to request not to 1223 delay the SACK of a message, but this is a different transport 1224 feature). 1226 o Set Cookie life value 1227 Protocols: SCTP 1228 Functional because it relates to security (possibly weakened by 1229 keeping a cookie very long) versus the time between connection 1230 establishment attempts. Knowledge about both issues can be 1231 application-specific. 1232 Implementation over TCP: the closest specified TCP functionality 1233 is the cookie in TCP Fast Open; for this, [RFC7413] states that 1234 the server "can expire the cookie at any time to enhance security" 1235 and section 4.1.2 describes an example implementation where 1236 updating the key on the server side causes the cookie to expire. 1237 Alternatively, for implementations that do not support TCP Fast 1238 Open, this transport feature could also affect the validity of SYN 1239 cookies (see Section 3.6 of [RFC4987]). 1241 Implementation over UDP: not possible (UDP does not offer this 1242 functionality). 1244 o Set maximum burst 1245 Protocols: SCTP 1246 Automatable because it relates to knowledge about the network, not 1247 the application. 1249 o Configure size where messages are broken up for partial delivery 1250 Protocols: SCTP 1251 Functional because this is closely tied to properties of the data 1252 that an application sends or expects to receive. 1253 Implementation over TCP: not possible (TCP does not offer 1254 identification of message boundaries). 1255 Implementation over UDP: not possible (UDP does not fragment 1256 messages). 1258 o Disable checksum when sending 1259 Protocols: UDP 1260 Functional because application-specific knowledge is necessary to 1261 decide whether it can be acceptable to lose data integrity. 
1262 Implementation: via SET_CHECKSUM_ENABLED.UDP. 1263 Implementation over TCP: do nothing (TCP does not offer to disable 1264 the checksum, but transmitting data with an intact checksum will 1265 not yield a semantically wrong result). 1267 o Disable checksum requirement when receiving 1268 Protocols: UDP 1269 Functional because application-specific knowledge is necessary to 1270 decide whether it can be acceptable to lose data integrity. 1271 Implementation: via SET_CHECKSUM_REQUIRED.UDP. 1272 Implementation over TCP: do nothing (TCP does not offer to disable 1273 the checksum, but transmitting data with an intact checksum will 1274 not yield a semantically wrong result). 1276 o Specify checksum coverage used by the sender 1277 Protocols: UDP-Lite 1278 Functional because application-specific knowledge is necessary to 1279 decide for which parts of the data it can be acceptable to lose 1280 data integrity. 1282 Implementation: via SET_CHECKSUM_COVERAGE.UDP-Lite. 1283 Implementation over TCP: do nothing (TCP does not offer to limit 1284 the checksum length, but transmitting data with an intact checksum 1285 will not yield a semantically wrong result). Implementation over 1286 UDP: if checksum coverage is set to cover payload data, do 1287 nothing. Else, either do nothing (transmitting data with an 1288 intact checksum will not yield a semantically wrong result), or 1289 use the transport feature "Disable checksum when sending". 1291 o Specify minimum checksum coverage required by receiver 1292 Protocols: UDP-Lite 1293 Functional because application-specific knowledge is necessary to 1294 decide for which parts of the data it can be acceptable to lose 1295 data integrity. 1296 Implementation: via SET_MIN_CHECKSUM_COVERAGE.UDP-Lite. 1297 Implementation over TCP: do nothing (TCP does not offer to limit 1298 the checksum length, but transmitting data with an intact checksum 1299 will not yield a semantically wrong result). 
Implementation over 1300 UDP: if checksum coverage is set to cover payload data, do 1301 nothing. Else, either do nothing (transmitting data with an 1302 intact checksum will not yield a semantically wrong result), or 1303 use the transport feature "Disable checksum requirement when 1304 receiving". 1306 o Specify DF field 1307 Protocols: UDP(-Lite) 1308 Optimizing because the DF field can be used to carry out Path MTU 1309 Discovery, which can lead an application to choose message sizes 1310 that can be transmitted more efficiently. 1311 Implementation: via MAINTENANCE.SET_DF.UDP(-Lite) and 1312 SEND_FAILURE.UDP(-Lite). 1313 Implementation over TCP: do nothing (with TCP, the sending 1314 application is not in control of transport message sizes, making 1315 this functionality irrelevant). 1317 o Get max. transport-message size that may be sent using a non- 1318 fragmented IP packet from the configured interface 1319 Protocols: UDP(-Lite) 1320 Optimizing because this can lead an application to choose message 1321 sizes that can be transmitted more efficiently. 1322 Implementation over TCP: do nothing (this information is not 1323 available with TCP). 1325 o Get max. transport-message size that may be received from the 1326 configured interface 1327 Protocols: UDP(-Lite) 1328 Optimizing because this can, for example, influence an 1329 application's memory management. 1330 Implementation over TCP: do nothing (this information is not 1331 available with TCP). 1333 o Specify TTL/Hop count field 1334 Protocols: UDP(-Lite) 1335 Automatable because a transport system can use a large enough 1336 system default to avoid communication failures. Allowing an 1337 application to configure it differently can produce notifications 1338 of ICMP error message arrivals that yield information which only 1339 relates to knowledge about the network, not the application. 
1341 o Obtain TTL/Hop count field 1342 Protocols: UDP(-Lite) 1343 Automatable because the TTL/Hop count field relates to knowledge 1344 about the network, not the application. 1346 o Specify ECN field 1347 Protocols: UDP(-Lite) 1348 Automatable because the ECN field relates to knowledge about the 1349 network, not the application. 1351 o Obtain ECN field 1352 Protocols: UDP(-Lite) 1353 Optimizing because this information can be used by an application 1354 to better carry out congestion control (this is relevant when 1355 choosing a data transmission transport service that does not 1356 already do congestion control). 1357 Implementation over TCP: do nothing (this information is not 1358 available with TCP). 1360 o Specify IP Options 1361 Protocols: UDP(-Lite) 1362 Automatable because IP Options relate to knowledge about the 1363 network, not the application. 1365 o Obtain IP Options 1366 Protocols: UDP(-Lite) 1367 Automatable because IP Options relate to knowledge about the 1368 network, not the application. 1370 o Enable and configure a "Low Extra Delay Background Transfer" 1371 Protocols: A protocol implementing the LEDBAT congestion control 1372 mechanism 1373 Optimizing because whether this service is appropriate or not 1374 depends on application-specific knowledge. However, wrongly using 1375 this will only affect the speed of data transfers (albeit 1376 including other transfers that may compete with the transport 1377 system's transfer in the network), so it is still correct within 1378 the "best effort" service model. 1379 Implementation: via CONFIGURE.LEDBAT and/or SET_DSCP.TCP / 1380 SET_DSCP.SCTP / SET_DSCP.UDP(-Lite) [LBE-draft]. 1381 Implementation over TCP: do nothing (TCP does not support LEDBAT 1382 congestion control, but not implementing this functionality will 1383 not yield a semantically wrong behavior). 1384 Implementation over UDP: do nothing (UDP does not offer congestion 1385 control). 
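The "Implementation over TCP" and "Implementation over UDP" entries in the MAINTENANCE list above follow a common pattern: a request is either mapped to a native primitive or silently ignored. As a minimal sketch of that pattern (not part of the derivation itself), the following Python fragment records a few of the entries above in a lookup table. The names FALLBACKS, NATIVE, DO_NOTHING and resolve are hypothetical and chosen only for illustration; the table content is transcribed from the list.

```python
from typing import Dict

# Hypothetical labels for the two outcomes that occur in the MAINTENANCE list.
NATIVE = "native"          # a primitive of the protocol provides the feature
DO_NOTHING = "do nothing"  # the request is ignored without breaking semantics

# feature -> protocol -> realization, transcribed from a few list entries above.
FALLBACKS: Dict[str, Dict[str, str]] = {
    "Disable checksum requirement when receiving": {"UDP": NATIVE, "TCP": DO_NOTHING},
    "Specify DF field": {"UDP": NATIVE, "TCP": DO_NOTHING},
    "Obtain ECN field": {"UDP": NATIVE, "TCP": DO_NOTHING},
    'Enable and configure a "Low Extra Delay Background Transfer"': {
        "TCP": DO_NOTHING,
        "UDP": DO_NOTHING,
    },
}


def resolve(feature: str, protocol: str) -> str:
    """Return how a feature is realized over the given protocol.

    Raises LookupError when the table has no entry, rather than guessing.
    """
    try:
        return FALLBACKS[feature][protocol]
    except KeyError:
        raise LookupError(f"no entry for {feature!r} over {protocol}") from None


if __name__ == "__main__":
    for name, per_protocol in FALLBACKS.items():
        for proto in sorted(per_protocol):
            print(f"{name} over {proto}: {resolve(name, proto)}")
```

A transport system built this way could consult such a table before issuing a protocol-specific call, so that an application request for, e.g., "Specify DF field" succeeds trivially when the connection runs over TCP.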
1387 TERMINATION: 1389 o Close after reliably delivering all remaining data, causing an 1390 event informing the application on the other side 1391 Protocols: TCP, SCTP 1392 Functional because the notion of a connection is often reflected 1393 in applications as an expectation to have all outstanding data 1394 delivered and no longer be able to communicate after a "Close" 1395 succeeded, with a communication sequence relating to this 1396 transport feature that is defined by the application protocol. 1397 Implementation: via CLOSE.TCP and CLOSE.SCTP. 1398 Implementation over UDP: not possible (UDP is unreliable and hence 1399 does not know when all remaining data is delivered; it does also 1400 not offer to cause an event related to closing at the peer). 1402 o Abort without delivering remaining data, causing an event 1403 informing the application on the other side 1404 Protocols: TCP, SCTP 1405 Functional because the notion of a connection is often reflected 1406 in applications as an expectation to potentially not have all 1407 outstanding data delivered and no longer be able to communicate 1408 after an "Abort" succeeded. On both sides of a connection, an 1409 application protocol may define a communication sequence relating 1410 to this transport feature. 1411 Implementation: via ABORT.TCP and ABORT.SCTP. 1412 Implementation over UDP: not possible (UDP does not offer to cause 1413 an event related to aborting at the peer). 1415 o Abort without delivering remaining data, not causing an event 1416 informing the application on the other side 1417 Protocols: UDP(-Lite) 1418 Functional because the notion of a connection is often reflected 1419 in applications as an expectation to potentially not have all 1420 outstanding data delivered and no longer be able to communicate 1421 after an "Abort" succeeded. On both sides of a connection, an 1422 application protocol may define a communication sequence relating 1423 to this transport feature. 
1424 Implementation: via ABORT.UDP(-Lite). 1425 Implementation over TCP: stop using the connection, wait for a 1426 timeout. 1428 o Timeout event when data could not be delivered for too long 1429 Protocols: TCP, SCTP 1430 Functional because this notifies that potentially assumed reliable 1431 data delivery is no longer provided. 1432 Implementation: via TIMEOUT.TCP and TIMEOUT.SCTP. 1433 Implementation over UDP: do nothing (this event will not occur 1434 with UDP). 1436 A.1.2. DATA Transfer Related Transport Features 1438 A.1.2.1. Sending Data 1440 o Reliably transfer data, with congestion control 1441 Protocols: TCP, SCTP 1442 Functional because this is closely tied to properties of the data 1443 that an application sends or expects to receive. 1444 Implementation: via SEND.TCP and SEND.SCTP. 1445 Implementation over UDP: not possible (UDP is unreliable). 1447 o Reliably transfer a message, with congestion control 1448 Protocols: SCTP 1449 Functional because this is closely tied to properties of the data 1450 that an application sends or expects to receive. 1451 Implementation: via SEND.SCTP. 1452 Implementation over TCP: via SEND.TCP. With SEND.TCP, message 1453 boundaries will not be identifiable by the receiver, because TCP 1454 provides a byte stream service. 1455 Implementation over UDP: not possible (UDP is unreliable). 1457 o Unreliably transfer a message 1458 Protocols: SCTP, UDP(-Lite) 1459 Optimizing because only applications know about the time 1460 criticality of their communication, and reliably transferring a 1461 message is never incorrect for the receiver of a potentially 1462 unreliable data transfer; it is just slower. 1463 ADDED. This differs from the 2 automatable transport features 1464 below in that it leaves the choice of congestion control open. 1465 Implementation: via SEND.SCTP or SEND.UDP(-Lite). 1466 Implementation over TCP: use SEND.TCP.
With SEND.TCP, messages 1467 will be sent reliably, and message boundaries will not be 1468 identifiable by the receiver. 1470 o Unreliably transfer a message, with congestion control 1471 Protocols: SCTP 1472 Automatable because congestion control relates to knowledge about 1473 the network, not the application. 1475 o Unreliably transfer a message, without congestion control 1476 Protocols: UDP(-Lite) 1477 Automatable because congestion control relates to knowledge about 1478 the network, not the application. 1480 o Configurable Message Reliability 1481 Protocols: SCTP 1482 Optimizing because only applications know about the time 1483 criticality of their communication, and reliably transferring a 1484 message is never incorrect for the receiver of a potentially 1485 unreliable data transfer; it is just slower. 1486 Implementation: via SEND.SCTP. 1487 Implementation over TCP: By using SEND.TCP and ignoring this 1488 configuration: based on the assumption of the best-effort service 1489 model, unnecessarily delivering data does not violate application 1490 expectations. Moreover, it is not possible to associate the 1491 requested reliability to a "message" in TCP anyway. 1492 Implementation over UDP: not possible (UDP is unreliable). 1494 o Choice of stream 1495 Protocols: SCTP 1496 Automatable because it requires using multiple streams, but 1497 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1498 category is automatable. Implementation: see Appendix A.3.2. 1500 o Choice of path (destination address) 1501 Protocols: SCTP 1502 Automatable because it requires using multiple sockets, but 1503 obtaining multiple sockets in the CONNECTION.ESTABLISHMENT 1504 category is automatable. 1506 o Ordered message delivery (potentially slower than unordered) 1507 Protocols: SCTP 1508 Functional because this is closely tied to properties of the data 1509 that an application sends or expects to receive. 1510 Implementation: via SEND.SCTP.
1511 Implementation over TCP: By using SEND.TCP. With SEND.TCP, 1512 messages will not be identifiable by the receiver. 1513 Implementation over UDP: not possible (UDP does not offer any 1514 guarantees regarding ordering). 1516 o Unordered message delivery (potentially faster than ordered) 1517 Protocols: SCTP, UDP(-Lite) 1518 Functional because this is closely tied to properties of the data 1519 that an application sends or expects to receive. 1520 Implementation: via SEND.SCTP. 1522 Implementation over TCP: By using SEND.TCP and always sending data 1523 ordered: based on the assumption of the best-effort service model, 1524 ordered delivery may just be slower and does not violate 1525 application expectations. Moreover, it is not possible to 1526 associate the requested delivery order to a "message" in TCP 1527 anyway. 1529 o Request not to bundle messages 1530 Protocols: SCTP 1531 Optimizing because this decision depends on knowledge about the 1532 size of future data blocks and the delay between them. 1533 Implementation: via SEND.SCTP. 1534 Implementation over TCP: By using SEND.TCP and DISABLE_NAGLE.TCP 1535 to disable the Nagle algorithm when the request is made and enable 1536 it again when the request is no longer made. Note that this is 1537 not fully equivalent because it relates to the time of issuing the 1538 request rather than a specific message. 1539 Implementation over UDP: do nothing (UDP never bundles messages). 1541 o Specifying a "payload protocol-id" (handed over as such by the 1542 receiver) 1543 Protocols: SCTP 1544 Functional because it allows to send extra application data with 1545 every message, for the sake of identification of data, which by 1546 itself is application-specific. 1547 Implementation: SEND.SCTP. 1548 Implementation over TCP: not possible (this functionality is not 1549 available in TCP). 1550 Implementation over UDP: not possible (this functionality is not 1551 available in UDP). 
1553 o Specifying a key id to be used to authenticate a message 1554 Protocols: SCTP 1555 Functional because this has a direct influence on security. 1556 Implementation: via a parameter in SEND.SCTP. 1557 Implementation over TCP: This could be emulated by using 1558 SET_AUTH.TCP before and after the message is sent. Note that this 1559 is not fully equivalent because it relates to the time of issuing 1560 the request rather than a specific message. 1561 Implementation over UDP: not possible (UDP does not offer 1562 authentication). 1564 o Request not to delay the acknowledgement (SACK) of a message 1565 Protocols: SCTP 1566 Optimizing because only an application knows for which message it 1567 wants to quickly be informed about success / failure of its 1568 delivery. 1569 Implementation over TCP: do nothing (TCP does not offer this 1570 functionality, but ignoring this request from the application will 1571 not yield a semantically wrong behavior). 1572 Implementation over UDP: do nothing (UDP does not offer this 1573 functionality, but ignoring this request from the application will 1574 not yield a semantically wrong behavior). 1576 A.1.2.2. Receiving Data 1578 o Receive data (with no message delimiting) 1579 Protocols: TCP 1580 Functional because a transport system must be able to send and 1581 receive data. 1582 Implementation: via RECEIVE.TCP. 1583 Implementation over UDP: do nothing (UDP only works on messages; 1584 these can be handed over, the application can still ignore the 1585 message boundaries). 1587 o Receive a message 1588 Protocols: SCTP, UDP(-Lite) 1589 Functional because this is closely tied to properties of the data 1590 that an application sends or expects to receive. 1591 Implementation: via RECEIVE.SCTP and RECEIVE.UDP(-Lite). 1592 Implementation over TCP: not possible (TCP does not support 1593 identification of message boundaries). 
1595 o Choice of stream to receive from 1596 Protocols: SCTP 1597 Automatable because it requires using multiple streams, but 1598 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1599 category is automatable. 1600 Implementation: see Appendix A.3.2. 1602 o Information about partial message arrival 1603 Protocols: SCTP 1604 Functional because this is closely tied to properties of the data 1605 that an application sends or expects to receive. 1606 Implementation: via RECEIVE.SCTP. 1607 Implementation over TCP: do nothing (this information is not 1608 available with TCP). 1609 Implementation over UDP: do nothing (this information is not 1610 available with UDP). 1612 A.1.2.3. Errors 1614 This section describes sending failures that are associated with a 1615 specific call in the "Sending Data" category (Appendix A.1.2.1). 1617 o Notification of send failures 1618 Protocols: SCTP, UDP(-Lite) 1619 Functional because this notifies that potentially assumed reliable 1620 data delivery is no longer provided. 1621 ADDED. This differs from the 2 automatable transport features 1622 below in that it does not distinguish between unsent and 1623 unacknowledged messages. 1624 Implementation: via SENDFAILURE-EVENT.SCTP and SEND_FAILURE.UDP(- 1625 Lite). 1626 Implementation over TCP: do nothing (this notification is not 1627 available and will therefore not occur with TCP). 1629 o Notification of an unsent (part of a) message 1630 Protocols: SCTP, UDP(-Lite) 1631 Automatable because the distinction between unsent and 1632 unacknowledged is network-specific. 1634 o Notification of an unacknowledged (part of a) message 1635 Protocols: SCTP 1636 Automatable because the distinction between unsent and 1637 unacknowledged is network-specific.
1639 o Notification that the stack has no more user data to send 1640 Protocols: SCTP 1641 Optimizing because reacting to this notification requires the 1642 application to be involved, and ensuring that the stack does not 1643 run dry of data (for too long) can improve performance. 1644 Implementation over TCP: do nothing (see the discussion in 1645 Appendix A.3.4). 1646 Implementation over UDP: do nothing (this notification is not 1647 available and will therefore not occur with UDP). 1649 o Notification to a receiver that a partial message delivery has 1650 been aborted 1651 Protocols: SCTP 1652 Functional because this is closely tied to properties of the data 1653 that an application sends or expects to receive. 1654 Implementation over TCP: do nothing (this notification is not 1655 available and will therefore not occur with TCP). 1656 Implementation over UDP: do nothing (this notification is not 1657 available and will therefore not occur with UDP). 1659 A.2. Step 2: Reduction -- The Reduced Set of Transport Features 1661 By hiding automatable transport features from the application, a 1662 transport system can gain opportunities to automate the usage of 1663 network-related functionality. This can facilitate using the 1664 transport system for the application programmer and it allows for 1665 optimizations that may not be possible for an application. For 1666 instance, system-wide configurations regarding the usage of multiple 1667 interfaces can better be exploited if the choice of the interface is 1668 not entirely up to the application. Therefore, since they are not 1669 strictly necessary to expose in a transport system, we do not include 1670 automatable transport features in the reduced set of transport 1671 features. This leaves us with only the transport features that are 1672 either optimizing or functional. 1674 A transport system should be able to communicate via TCP or UDP if 1675 alternative transport protocols are found not to work. 
For many 1676 transport features, this is possible -- often by simply not doing 1677 anything when a specific request is made. For some transport 1678 features, however, it was identified that direct usage of neither TCP 1679 nor UDP is possible: in these cases, even not doing anything would 1680 incur semantically incorrect behavior. Whenever an application would 1681 make use of one of these transport features, this would eliminate the 1682 possibility to use TCP or UDP. Thus, we only keep the functional and 1683 optimizing transport features for which an implementation over either 1684 TCP or UDP is possible in our reduced set. 1686 The "minimal set" derived in this document is meant to be 1687 implementable "one-sided" over TCP, and, with limitations, UDP. In 1688 the following list, we therefore precede a transport feature with 1689 "T:" if an implementation over TCP is possible, "U:" if an 1690 implementation over UDP is possible, and "T,U:" if an implementation 1691 over either TCP or UDP is possible. 1693 A.2.1.
CONNECTION Related Transport Features 1695 ESTABLISHMENT: 1697 o T,U: Connect 1698 o T,U: Specify number of attempts and/or timeout for the first 1699 establishment message 1700 o T: Configure authentication 1701 o T: Hand over a message to reliably transfer (possibly multiple 1702 times) before connection establishment 1703 o T: Hand over a message to reliably transfer during connection 1704 establishment 1706 AVAILABILITY: 1708 o T,U: Listen 1709 o T: Configure authentication 1711 MAINTENANCE: 1713 o T: Change timeout for aborting connection (using retransmit limit 1714 or time value) 1715 o T: Suggest timeout to the peer 1716 o T,U: Disable Nagle algorithm 1717 o T,U: Notification of Excessive Retransmissions (early warning 1718 below abortion threshold) 1719 o T,U: Specify DSCP field 1720 o T,U: Notification of ICMP error message arrival 1721 o T: Change authentication parameters 1722 o T: Obtain authentication information 1723 o T,U: Set Cookie life value 1724 o T,U: Choose a scheduler to operate between streams of an 1725 association 1726 o T,U: Configure priority or weight for a scheduler 1727 o T,U: Disable checksum when sending 1728 o T,U: Disable checksum requirement when receiving 1729 o T,U: Specify checksum coverage used by the sender 1730 o T,U: Specify minimum checksum coverage required by receiver 1731 o T,U: Specify DF field 1732 o T,U: Get max. transport-message size that may be sent using a non- 1733 fragmented IP packet from the configured interface 1734 o T,U: Get max. 
transport-message size that may be received from the 1735 configured interface 1736 o T,U: Obtain ECN field 1737 o T,U: Enable and configure a "Low Extra Delay Background Transfer" 1739 TERMINATION: 1741 o T: Close after reliably delivering all remaining data, causing an 1742 event informing the application on the other side 1743 o T: Abort without delivering remaining data, causing an event 1744 informing the application on the other side 1745 o T,U: Abort without delivering remaining data, not causing an event 1746 informing the application on the other side 1747 o T,U: Timeout event when data could not be delivered for too long 1749 A.2.2. DATA Transfer Related Transport Features 1751 A.2.2.1. Sending Data 1753 o T: Reliably transfer data, with congestion control 1754 o T: Reliably transfer a message, with congestion control 1755 o T,U: Unreliably transfer a message 1756 o T: Configurable Message Reliability 1757 o T: Ordered message delivery (potentially slower than unordered) 1758 o T,U: Unordered message delivery (potentially faster than ordered) 1759 o T,U: Request not to bundle messages 1760 o T: Specifying a key id to be used to authenticate a message 1761 o T,U: Request not to delay the acknowledgement (SACK) of a message 1763 A.2.2.2. Receiving Data 1765 o T,U: Receive data (with no message delimiting) 1766 o U: Receive a message 1767 o T,U: Information about partial message arrival 1769 A.2.2.3. Errors 1771 This section describes sending failures that are associated with a 1772 specific call in the "Sending Data" category (Appendix A.1.2.1). 1774 o T,U: Notification of send failures 1775 o T,U: Notification that the stack has no more user data to send 1776 o T,U: Notification to a receiver that a partial message delivery 1777 has been aborted 1779 A.3. Step 3: Discussion 1781 The reduced set in the previous section exhibits a number of 1782 peculiarities, which we will discuss in the following.
This section 1783 focuses on TCP because, with the exception of one particular 1784 transport feature ("Receive a message" -- we will discuss this in 1785 Appendix A.3.1), the list shows that UDP is strictly a subset of TCP. 1786 We can first try to understand how to build a transport system that 1787 can run over TCP, and then narrow down the result further to allow 1788 that the system can always run over either TCP or UDP (which 1789 effectively means removing everything related to reliability, 1790 ordering, authentication and closing/aborting with a notification to 1791 the peer). 1793 Note that, because the functional transport features of UDP are -- 1794 with the exception of "Receive a message" -- a subset of TCP, TCP can 1795 be used as a replacement for UDP whenever an application does not 1796 need message delimiting (e.g., because the application-layer protocol 1797 already does it). This has been recognized by many applications that 1798 already do this in practice, by trying to communicate with UDP at 1799 first, and falling back to TCP in case of a connection failure. 1801 A.3.1. Sending Messages, Receiving Bytes 1803 For implementing a transport system over TCP, there are several 1804 transport features related to sending, but only a single transport 1805 feature related to receiving: "Receive data (with no message 1806 delimiting)" (and, strangely, "information about partial message 1807 arrival"). Notably, the transport feature "Receive a message" is 1808 also the only non-automatable transport feature of UDP(-Lite) for 1809 which no implementation over TCP is possible. 1811 To support these TCP receiver semantics, we define an "Application- 1812 Framed Bytestream" (AFra-Bytestream). AFra-Bytestreams allow senders 1813 to operate on messages while minimizing changes to the TCP socket 1814 API. In particular, nothing changes on the receiver side - data can 1815 be accepted via a normal TCP socket. 
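To illustrate the idea, the following Python fragment is a minimal sketch of application-level framing on top of a plain byte stream. It is an assumption made for illustration only: the 4-byte length prefix, the names HEADER, frame and Deframer, and the in-memory "stream" are all hypothetical and not defined by this document. The transport only ever sees bytes; the sending application frames each message, and the receiving application recovers the boundaries from whatever byte chunks a normal TCP socket would hand over.

```python
import struct
from typing import List

# Hypothetical framing: each message is preceded by its length as a
# 4-byte unsigned integer in network byte order.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Sender side: prepend the length so the peer can find the boundary."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Receiver side: accepts arbitrary byte chunks, yields whole messages."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Add received bytes; return every message that is now complete."""
        self._buf += data
        messages: List[bytes] = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # message body not fully received yet
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return messages


if __name__ == "__main__":
    stream = frame(b"first") + frame(b"second message")
    deframer = Deframer()
    # Deliver the stream in 3-byte pieces to mimic arbitrary TCP segmentation.
    for i in range(0, len(stream), 3):
        for msg in deframer.feed(stream[i:i + 3]):
            print(msg)
```

Because whole framed messages are kept intact, the same receiver logic keeps working if an underlying protocol drops or reorders entire messages, which is the property an AFra-Bytestream relies on.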
1817 In an AFra-Bytestream, the sending application can optionally inform 1818 the transport about message boundaries and required properties per 1819 message (configurable order and reliability, or embedding a request 1820 not to delay the acknowledgement of a message). Whenever the sending 1821 application specifies per-message properties that relax the notion of 1822 reliable in-order delivery of bytes, it must assume that the 1823 receiving application is 1) able to determine message boundaries, 1824 provided that messages are always kept intact, and 2) able to accept 1825 these relaxed per-message properties. Any signaling of such 1826 information to the peer is up to an application-layer protocol and 1827 considered out of scope of this document. 1829 For example, if an application requests to transfer fixed-size 1830 messages of 100 bytes with partial reliability, this needs the 1831 receiving application to be prepared to accept data in chunks of 100 1832 bytes. If, then, some of these 100-byte messages are missing (e.g., 1833 if SCTP with Configurable Reliability is used), this is the expected 1834 application behavior. With TCP, no messages would be missing, but 1835 this is also correct for the application, and the possible 1836 retransmission delay is acceptable within the best effort service 1837 model (see [RFC7305], Section 3.5). Still, the receiving application 1838 would separate the byte stream into 100-byte chunks. 1840 Note that this usage of messages does not require all messages to be 1841 equal in size. Many application protocols use some form of Type- 1842 Length-Value (TLV) encoding, e.g. by defining a header including 1843 length fields; another alternative is the use of byte stuffing 1844 methods such as COBS [COBS]. If an application needs message 1845 numbers, e.g. 
to restore the correct sequence of messages, these must 1846 also be encoded by the application itself, as the sequence number 1847 related transport features of SCTP are not provided by the "minimum 1848 set" (in the interest of enabling usage of TCP). 1850 A.3.2. Stream Schedulers Without Streams 1852 We have already stated that multi-streaming does not require 1853 application-specific knowledge. Potential benefits or disadvantages 1854 of, e.g., using two streams of an SCTP association versus using two 1855 separate SCTP associations or TCP connections are related to 1856 knowledge about the network and the particular transport protocol in 1857 use, not the application. However, the transport features "Choose a 1858 scheduler to operate between streams of an association" and 1859 "Configure priority or weight for a scheduler" operate on streams. 1860 Here, streams identify communication channels between which a 1861 scheduler operates, and they can be assigned a priority. Moreover, 1862 the transport features in the MAINTENANCE category all operate on 1863 associations in case of SCTP, i.e. they apply to all streams in that 1864 association. 1866 With only these semantics necessary to represent, the interface to a 1867 transport system becomes easier if we assume that connections may be 1868 a transport protocol's connection or association, but could also be a 1869 stream of an existing SCTP association, for example. We only need to 1870 allow for a way to define a possible grouping of connections. Then, 1871 all MAINTENANCE transport features can be said to operate on 1872 connection groups, not connections, and a scheduler operates on the 1873 connections within a group. 1875 To be compatible with multiple transport protocols and uniformly 1876 allow access to both transport connections and streams of a multi- 1877 streaming protocol, the semantics of opening and closing need to be 1878 the most restrictive subset of all of the underlying options.
For 1879 example, TCP's support of half-closed connections can be seen as a 1880 feature on top of the more restrictive "ABORT"; this feature cannot 1881 be supported because not all protocols used by a transport system 1882 (including streams of an association) support half-closed 1883 connections. 1885 A.3.3. Early Data Transmission 1887 There are two transport features related to transferring a message 1888 early: "Hand over a message to reliably transfer (possibly multiple 1889 times) before connection establishment", which relates to TCP Fast 1890 Open [RFC7413], and "Hand over a message to reliably transfer during 1891 connection establishment", which relates to SCTP's ability to 1892 transfer data together with the COOKIE-ECHO chunk. Even without TCP 1893 Fast Open, TCP can transfer data during the handshake, together with 1894 the SYN packet -- however, the receiver of this data may not hand it 1895 over to the application until the handshake has completed. Also, 1896 different from TCP Fast Open, this data is not delimited as a message 1897 by TCP (thus, not visible as a "message"). This functionality is 1898 commonly available in TCP and supported in several implementations, 1899 even though the TCP specification does not explain how to provide it 1900 to applications. 1902 A transport system could differentiate between the cases of 1903 transmitting data "before" (possibly multiple times) or "during" the 1904 handshake. Alternatively, it could also assume that data that are 1905 handed over early will be transmitted as early as possible, and 1906 "before" the handshake would only be used for messages that are 1907 explicitly marked as "idempotent" (i.e., it would be acceptable to 1908 transfer them multiple times). 1910 The amount of data that can successfully be transmitted before or 1911 during the handshake depends on various factors: the transport 1912 protocol, the use of header options, the choice of IPv4 or IPv6 and 1913 the Path MTU.
A transport system should therefore allow a sending 1914 application to query the maximum amount of data it can possibly 1915 transmit before (or, if exposed, during) connection establishment. 1917 A.3.4. Sender Running Dry 1919 The transport feature "Notification that the stack has no more user 1920 data to send" relates to SCTP's "SENDER DRY" notification. Such 1921 notifications can, in principle, be used to avoid having an 1922 unnecessarily large send buffer, yet ensure that the transport sender 1923 always has data available when it has an opportunity to transmit it. 1924 This has been found to be very beneficial for some applications 1925 [WWDC2015]. However, "SENDER DRY" truly means that the entire send 1926 buffer (including both unsent and unacknowledged data) has emptied -- 1927 i.e., when it notifies the sender, it is already too late, the 1928 transport protocol already missed an opportunity to send data. Some 1929 modern TCP implementations now include the unspecified 1930 "TCP_NOTSENT_LOWAT" socket option that was proposed in [WWDC2015], 1931 which limits the amount of unsent data that TCP can keep in the 1932 socket buffer; this allows to specify at which buffer filling level 1933 the socket becomes writable, rather than waiting for the buffer to 1934 run empty. 1936 SCTP allows to configure the sender-side buffer too: the automatable 1937 Transport Feature "Configure send buffer size" provides this 1938 functionality, but only for the complete buffer, which includes both 1939 unsent and unacknowledged data. SCTP does not allow to control these 1940 two sizes separately. It therefore makes sense for a transport 1941 system to allow for uniform access to "TCP_NOTSENT_LOWAT" as well as 1942 the "SENDER DRY" notification. 1944 A.3.5. 
Capacity Profile 1946 The transport features: 1948 o Disable Nagle algorithm 1949 o Enable and configure a "Low Extra Delay Background Transfer" 1950 o Specify DSCP field 1952 all relate to a QoS-like application need such as "low latency" or 1953 "scavenger". In the interest of flexibility of a transport system, 1954 they could therefore be offered in a uniform, more abstract way, 1955 where a transport system could e.g. decide by itself how to use 1956 combinations of LEDBAT-like congestion control and certain DSCP 1957 values, and an application would only specify a general "capacity 1958 profile" (a description of how it wants to use the available 1959 capacity). A need for "lowest possible latency at the expense of 1960 overhead" could then translate into automatically disabling the Nagle 1961 algorithm. 1963 In some cases, the Nagle algorithm is best controlled directly by the 1964 application because it is not only related to a general profile but 1965 also to knowledge about the size of future messages. For fine-grain 1966 control over Nagle-like functionality, the "Request not to bundle 1967 messages" is available. 1969 A.3.6. Security 1971 Both TCP and SCTP offer authentication. TCP authenticates complete 1972 segments. SCTP allows to configure which of SCTP's chunk types must 1973 always be authenticated -- if this is exposed as such, it creates an 1974 undesirable dependency on the transport protocol. For compatibility 1975 with TCP, a transport system should only allow to configure complete 1976 transport layer packets, including headers, IP pseudo-header (if any) 1977 and payload. 1979 Security is discussed in a separate document 1980 [I-D.ietf-taps-transport-security]. 
The minimal set presented in the 1981 present document excludes all security related transport features: 1982 "Configure authentication", "Change authentication parameters", 1983 "Obtain authentication information" and "Set Cookie life value" 1984 as well as "Specifying a key id to be used to authenticate a 1985 message". 1987 A.3.7. Packet Size 1989 UDP(-Lite) has a transport feature called "Specify DF field". This 1990 yields an error message in case of sending a message that exceeds the 1991 Path MTU, which is necessary for a UDP-based application to be able 1992 to implement Path MTU Discovery (a function that UDP-based 1993 applications must do by themselves). The "Get max. transport-message 1994 size that may be sent using a non-fragmented IP packet from the 1995 configured interface" transport feature yields an upper limit for the 1996 Path MTU (minus headers) and can therefore help to implement Path MTU 1997 Discovery more efficiently. 1999 Appendix B. Revision information 2001 XXX RFC-Ed please remove this section prior to publication. 2003 -02: implementation suggestions added, discussion section added, 2004 terminology extended, DELETED category removed, various other fixes; 2005 list of Transport Features adjusted to -01 version of [RFC8303] 2006 except that MPTCP is not included. 2008 -03: updated to be consistent with -02 version of [RFC8303]. 2010 -04: updated to be consistent with -03 version of [RFC8303]. 2011 Reorganized document, rewrote intro and conclusion, and made a first 2012 stab at creating a real "minimal set". 2014 -05: updated to be consistent with -05 version of [RFC8303] (minor 2015 changes). Fixed a mistake regarding Cookie Life value. Exclusion of 2016 security related transport features (to be covered in a separate 2017 document). Reorganized the document (now begins with the minset, 2018 derivation is in the appendix). First stab at an abstract API for 2019 the minset.

   draft-ietf-taps-minset-00: updated to be consistent with -08 version
   of [RFC8303] ("obtain message delivery number" was removed, as this
   has also been removed in [RFC8303] because it was a mistake in
   RFC4960.  This led to the removal of two more transport features that
   were only designated as functional because they affected "obtain
   message delivery number").  Fall-back to UDP incorporated (this was
   requested at IETF-99); this also affected the transport feature
   "Choice between unordered (potentially faster) or ordered delivery of
   messages" because this is a boolean which is always true for one
   fall-back protocol, and always false for the other one.  It was
   therefore divided into two features, one for ordered, one for
   unordered delivery.  The word "reliably" was added to the transport
   features "Hand over a message to reliably transfer (possibly multiple
   times) before connection establishment" and "Hand over a message to
   reliably transfer during connection establishment" to make it clearer
   why this is not supported by UDP.  Clarified that the "minset
   abstract interface" is not proposing a specific API for all TAPS
   systems to implement, but is just a way to describe the minimum
   set.  Author order changed.

   WG -01: "fall-back to" (TCP or UDP) replaced (mostly with
   "implementation over").  References to post-sockets removed (these
   were statements that assumed that post-sockets requires two-sided
   implementation).  Replaced "flow" with "TAPS Connection" and "frame"
   with "message" to avoid introducing new terminology.  Made sections 3
   and 4 in line with the categorization that is already used in the
   appendix and [RFC8303], and changed style of section 4 to be even
   shorter and less interface-like.  Updated reference
   draft-ietf-tsvwg-sctp-ndata to RFC8260.

   WG -02: rephrased "the TAPS system" and "TAPS connection" etc.
   to more generally talk about transport after the intro (mostly
   replacing "TAPS system" with "transport system" and "TAPS connection"
   with "connection").  Merged sections 3 and 4 to form a new section 3.

   WG -03: updated sentence referencing
   [I-D.ietf-taps-transport-security] to say that "the minimum security
   requirements for a taps system are discussed in a separate security
   document", wrote "example" in the paragraph introducing the decision
   tree.  Removed reference draft-grinnemo-taps-he-03 and the sentence
   that referred to it.

   WG -04: addressed comments from Theresa Enghardt and Tommy Pauly.  As
   part of that, removed "TAPS" as a term everywhere (abstract, intro,
   ..).

   WG -05: addressed comments from Spencer Dawkins.

Authors' Addresses

   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo  N-0316
   Norway

   Phone: +47 22 85 24 20
   Email: michawe@ifi.uio.no


   Stein Gjessing
   University of Oslo
   PO Box 1080 Blindern
   Oslo  N-0316
   Norway

   Phone: +47 22 85 24 44
   Email: steing@ifi.uio.no