TAPS                                                            M. Welzl
Internet-Draft                                               S. Gjessing
Intended status: Informational                        University of Oslo
Expires: February 23, 2019                               August 22, 2018

          A Minimal Set of Transport Services for End Systems
                       draft-ietf-taps-minset-06

Abstract

   This draft recommends a minimal set of Transport Services offered by
   end systems, and gives guidance on choosing among the available
   mechanisms and protocols. It is based on the set of transport
   features in RFC 8303.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 23, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Minimal Set of Transport Features
     3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION
     3.2.  MAINTENANCE
       3.2.1.  Connection groups
       3.2.2.  Individual connections
     3.3.  DATA Transfer
       3.3.1.  Sending Data
       3.3.2.  Receiving Data
   4.  Conclusion
   5.  Acknowledgements
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Deriving the minimal set
     A.1.  Step 1: Categorization -- The Superset of Transport
           Features
       A.1.1.  CONNECTION Related Transport Features
       A.1.2.  DATA Transfer Related Transport Features
     A.2.  Step 2: Reduction -- The Reduced Set of Transport
           Features
       A.2.1.  CONNECTION Related Transport Features
       A.2.2.  DATA Transfer Related Transport Features
     A.3.  Step 3: Discussion
       A.3.1.  Sending Messages, Receiving Bytes
       A.3.2.  Stream Schedulers Without Streams
       A.3.3.  Early Data Transmission
       A.3.4.  Sender Running Dry
       A.3.5.  Capacity Profile
       A.3.6.  Security
       A.3.7.  Packet Size
   Appendix B.  Revision information
   Authors' Addresses

1.  Introduction

   The task of a transport system is to offer transport services to its
   applications, i.e. the applications running on top of the transport
   system. Ideally, it does so without statically binding applications
   to particular transport protocols. Currently, the set of transport
   services that most applications use is based on TCP and UDP (and
   protocols that are layered on top of them); this limits the ability
   of the network stack to make use of features of other transport
   protocols. For example, if a protocol supports out-of-order message
   delivery but applications always assume that the network provides an
   ordered bytestream, then the network stack cannot immediately
   deliver a message that arrives out-of-order: doing so would break a
   fundamental assumption of the application. The net result is
   unnecessary head-of-line blocking delay.

   By exposing the transport services of multiple transport protocols, a
   transport system can make it possible to use these services without
   having to statically bind an application to a specific transport
   protocol. The first step towards the design of such a system was
   taken by [RFC8095], which surveys a large number of transports, and
   [RFC8303] as well as [RFC8304], which identify the specific transport
   features that are exposed to applications by the protocols TCP,
   MPTCP, UDP(-Lite) and SCTP as well as the LEDBAT congestion control
   mechanism. This memo is based on these documents and follows the
   same terminology (also listed below). Because the considered
   transport protocols conjointly cover a wide range of transport
   features, there is reason to hope that the resulting set (and the
   reasoning that led to it) will also apply to many aspects of other
   transport protocols that may be in use today, or may be designed in
   the future.

   The number of transport features of current IETF transports is large,
   and exposing all of them has a number of disadvantages: generally,
   the more functionality is exposed, the less freedom a transport
   system has to automate usage of the various functions of its
   available set of transport protocols. Some functions only exist in
   one particular protocol, and if an application used them, this would
   statically tie the application to this protocol, limiting the
   flexibility of the transport system. Also, if the number of exposed
   features is exceedingly large, a transport system might become very
   difficult to use for an application programmer. Taking [RFC8303] as
   a basis, this document therefore develops a minimal set of transport
   features, removing the ones that could get in the way of transport
   flexibility but keeping the ones that must be retained for
   applications to benefit from useful transport functionality.

   Applications use a wide variety of APIs today. The transport
   features in the minimal set in this document must be reflected in
   *all* network APIs in order for the underlying functionality to
   become usable everywhere. For example, it does not help an
   application that talks to a library which offers its own
   communication interface if the underlying Berkeley Sockets API is
   extended to offer "unordered message delivery", but the library only
   exposes an ordered bytestream. Both the Berkeley Sockets API and the
   library would have to expose the "unordered message delivery"
   transport feature (alternatively, there may be ways for certain types
   of libraries to use this transport feature without exposing it, based
   on knowledge about the applications -- but this is not the general
   case).
   In most situations, in the interest of being as flexible and
   efficient as possible, the best choice will be for a library to
   expose at least all of the transport features that are recommended as
   a "minimal set" here.

   This "minimal set" can be implemented "one-sided" over TCP. This
   means that a sender-side transport system can talk to a standard TCP
   receiver, and a receiver-side transport system can talk to a standard
   TCP sender. If certain limitations are put in place, the "minimal
   set" can also be implemented "one-sided" over UDP.

2.  Terminology

   Transport Feature:  a specific end-to-end feature that the transport
      layer provides to an application. Examples include
      confidentiality, reliable delivery, ordered delivery, message-
      versus-stream orientation, etc.
   Transport Service:  a set of Transport Features, without an
      association to any given framing protocol, which provides a
      complete service to an application.
   Transport Protocol:  an implementation that provides one or more
      different transport services using a specific framing and header
      format on the wire.
   Transport Service Instance:  an arrangement of transport protocols
      with a selected set of features and configuration parameters that
      implements a single transport service, e.g., a protocol stack (RTP
      over UDP).
   Application:  an entity that uses the transport layer for end-to-end
      delivery of data across the network (this may also be an upper
      layer protocol or tunnel encapsulation).
   Application-specific knowledge:  knowledge that only applications
      have.
   Endpoint:  an entity that communicates with one or more other
      endpoints using a transport protocol.
   Connection:  shared state of two or more endpoints that persists
      across messages that are transmitted between these endpoints.
   Connection Group:  a set of connections which share the same
      configuration (configuring one of them causes all other
      connections in the same group to be configured in the same way).
      We call connections that belong to a connection group "grouped",
      while "ungrouped" connections are not a part of a connection
      group.
   Socket:  the combination of a destination IP address and a
      destination port number.

   Moreover, throughout the document, the protocol name "UDP(-Lite)" is
   used when discussing transport features that are equivalent for UDP
   and UDP-Lite; similarly, the protocol name "TCP" refers to both TCP
   and MPTCP.

3.  The Minimal Set of Transport Features

   Based on the categorization, reduction, and discussion in Appendix A,
   this section describes a minimal set of transport features that end
   systems should offer. The described transport system can be
   implemented over TCP. Elements of the system that are not marked
   with "!UDP" can also be implemented over UDP.

   The arguments laid out in Appendix A.3 ("discussion") were used to
   make the final representation of the minimal set as short, simple and
   general as possible. There may be situations where these arguments
   do not apply -- e.g., implementers may have specific reasons to
   expose multi-streaming as a visible functionality to applications, or
   the restrictive open / close semantics may be problematic under some
   circumstances. In such cases, the representation in Appendix A.2
   ("reduction") should be considered.

   As in Appendix A, Appendix A.2 and [RFC8303], we categorize the
   minimal set of transport features as 1) CONNECTION related
   (ESTABLISHMENT, AVAILABILITY, MAINTENANCE, TERMINATION) and 2) DATA
   Transfer related (Sending Data, Receiving Data, Errors).
   Here, the focus is on connections that the transport system offers
   as an abstraction to the application, as opposed to connections of
   transport protocols that the transport system uses.

3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION

   A connection must first be "created" to allow for some initial
   configuration to be carried out before the transport system can
   actively or passively establish communication with a remote endpoint.
   All configuration parameters in Section 3.2 can be used initially,
   although some of them may only take effect when a connection has been
   established with a chosen transport protocol. Configuring a
   connection early helps a transport system make the right decisions.
   For example, grouping information can influence the transport system
   to implement a connection as a stream of a multi-streaming protocol's
   existing association or not.

   For ungrouped connections, early configuration is necessary because
   it allows the transport system to know which protocols it should try
   to use. In particular, a transport system that only makes a one-time
   choice for a particular protocol must know early about strict
   requirements that must be kept, or it can end up in a deadlock
   situation (e.g., having chosen UDP and later being asked to support
   reliable transfer). As an example description of how to correctly
   handle these cases, we provide the following decision tree (this is
   derived from Appendix A.2.1 excluding authentication, as explained in
   Section 7):

   - Will it ever be necessary to offer any of the following?
     * Reliably transfer data
     * Notify the peer of closing/aborting
     * Preserve data ordering

     Yes: SCTP or TCP can be used.
     - Is any of the following useful to the application?
       * Choosing a scheduler to operate between connections
         in a group, with the possibility to configure a priority
         or weight per connection
       * Configurable message reliability
       * Unordered message delivery
       * Request not to delay the acknowledgement (SACK) of a message

       Yes: SCTP is preferred.
       No:
       - Is any of the following useful to the application?
         * Hand over a message to reliably transfer (possibly
           multiple times) before connection establishment
         * Suggest timeout to the peer
         * Notification of Excessive Retransmissions (early
           warning below abortion threshold)
         * Notification of ICMP error message arrival

         Yes: TCP is preferred.
         No: SCTP and TCP are equally preferable.

     No: all protocols can be used.
     - Is any of the following useful to the application?
       * Specify checksum coverage used by the sender
       * Specify minimum checksum coverage required by receiver

       Yes: UDP-Lite is preferred.
       No: UDP is preferred.

   Note that this decision tree is not optimal for all cases. For
   example, if an application wants to use "Specify checksum coverage
   used by the sender", which is only offered by UDP-Lite, and
   "Configure priority or weight for a scheduler", which is only offered
   by SCTP, the above decision tree will always choose UDP-Lite, making
   it impossible to use SCTP's schedulers with priorities between
   grouped connections. We caution implementers to be aware of the full
   set of trade-offs, for which we recommend consulting the list in
   Appendix A.2.1 when deciding how to initialize a connection.

   To summarize, the following parameters serve as input for the
   transport system to help it choose and configure a suitable protocol:

   o  Reliability: a boolean that should be set to true when any of the
      following will be useful to the application: reliably transfer
      data; notify the peer of closing/aborting; preserve data ordering.
   o  Checksum coverage: a boolean to specify whether it will be useful
      to the application to specify checksum coverage when sending or
      receiving.
   o  Configure message priority: a boolean that should be set to true
      when any of the following per-message configuration or
      prioritization mechanisms will be useful to the application:
      choosing a scheduler to operate between grouped connections, with
      the possibility to configure a priority or weight per connection;
      configurable message reliability; unordered message delivery;
      requesting not to delay the acknowledgement (SACK) of a message.
   o  Early message timeout notifications: a boolean that should be set
      to true when any of the following will be useful to the
      application: hand over a message to reliably transfer (possibly
      multiple times) before connection establishment; suggest timeout
      to the peer; notification of excessive retransmissions (early
      warning below abortion threshold); notification of ICMP error
      message arrival.

   Once a connection is created, it can be queried for the maximum
   amount of data that an application can possibly expect to have
   reliably transmitted before or during transport connection
   establishment (with zero being a possible answer) (see
   Section 3.2.1). An application can also give the connection a
   message for reliable transmission before or during connection
   establishment (!UDP); the transport system will then try to transmit
   it as early as possible. An application can facilitate sending a
   message particularly early by marking it as "idempotent" (see
   Section 3.3.1); in this case, the receiving application must be
   prepared to potentially receive multiple copies of the message
   (because idempotent messages are reliably transferred, asking for
   idempotence is not necessary for systems that support UDP).
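
   The decision tree and the four boolean parameters summarized earlier
   in this section can be sketched in code. The following Python
   fragment is only an illustration under stated assumptions: the names
   ConnectionHints and choose_protocol are hypothetical and do not
   appear in this document or in any API, and the function returns the
   preferred protocol names in the order the tree suggests. Like the
   tree itself, it is not optimal for all cases.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConnectionHints:
    """Hypothetical container for the four booleans listed in Section 3.1."""
    reliability: bool = False
    checksum_coverage: bool = False
    message_priority: bool = False
    early_timeout_notifications: bool = False


def choose_protocol(hints: ConnectionHints) -> List[str]:
    """Walk the decision tree and return the preferred protocol(s)."""
    if hints.reliability:
        # "Yes" branch of the first question: SCTP or TCP can be used.
        if hints.message_priority:
            return ["SCTP"]
        if hints.early_timeout_notifications:
            return ["TCP"]
        # Neither feature group is needed: both are equally preferable.
        return ["SCTP", "TCP"]
    # "No" branch: all protocols can be used; the tree prefers UDP(-Lite).
    if hints.checksum_coverage:
        return ["UDP-Lite"]
    return ["UDP"]
```

   For instance, choose_protocol(ConnectionHints(checksum_coverage=True,
   message_priority=True)) yields ["UDP-Lite"], which reproduces the
   limitation noted above: without the reliability flag, SCTP's
   schedulers are never selected.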

   After creation, a transport system can actively establish
   communication with a peer, or it can passively listen for incoming
   connection requests. Note that active establishment may or may not
   trigger a notification on the listening side. It is possible that
   the first notification on the listening side is the arrival of the
   first data that the active side sends (a receiver-side transport
   system could handle this by continuing to block a "Listen" call,
   immediately followed by issuing "Receive", for example; callback-
   based implementations could simply skip the equivalent of "Listen").
   This also means that the active opening side is assumed to be the
   first side sending data.

   A transport system can actively close a connection, i.e. terminate it
   after reliably delivering all remaining data to the peer (if reliable
   data delivery was requested earlier (!UDP)), in which case the peer
   is notified that the connection is closed. Alternatively, a
   connection can be aborted without delivering outstanding data to the
   peer. In case reliable or partially reliable data delivery was
   requested earlier (!UDP), the peer is notified that the connection is
   aborted. A timeout can be configured to abort a connection when data
   could not be delivered for too long (!UDP); however, timeout-based
   abortion does not notify the peer application that the connection has
   been aborted. Because half-closed connections are not supported,
   when a host implementing a transport system receives a notification
   that the peer is closing or aborting the connection (!UDP), its peer
   may not be able to read outstanding data. This means that
   unacknowledged data residing in a transport system's send buffer may
   have to be dropped from that buffer upon arrival of a "close" or
   "abort" notification from the peer.

3.2.  MAINTENANCE

   A transport system must offer means to group connections, but it
   cannot guarantee truly grouping them using the transport protocols
   that it uses (e.g., it cannot be guaranteed that connections become
   multiplexed as streams on a single SCTP association when SCTP may not
   be available). The transport system must therefore ensure that
   group- versus non-group-configurations are handled correctly in some
   way (e.g., by applying the configuration to all grouped connections
   even when they are not multiplexed, or informing the application
   about grouping success or failure).

   As a general rule, any configuration described below should be
   carried out as early as possible to aid the transport system's
   decision making.

3.2.1.  Connection groups

   The following transport features and notifications (some directly
   from Appendix A.2, some new or changed, based on the discussion in
   Appendix A.3) automatically apply to all grouped connections:

   (!UDP) Configure a timeout: this can be done with the following
   parameters:

   o  A timeout value for aborting connections, in seconds
   o  A timeout value to be suggested to the peer (if possible), in
      seconds
   o  The number of retransmissions after which the application should
      be notified of "Excessive Retransmissions"

   Configure urgency: this can be done with the following parameters:

   o  A number to identify the type of scheduler that should be used to
      operate between connections in the group (no guarantees given).
      Schedulers are defined in [RFC8260].
   o  A "capacity profile" number to identify how an application wants
      to use its available capacity. Choices can be "lowest possible
      latency at the expense of overhead" (which would disable any
      Nagle-like algorithm), "scavenger", or values that help determine
      the DSCP value for a connection (e.g. similar to table 1 in
      [I-D.ietf-tsvwg-rtcweb-qos]).
   o  A buffer limit (in bytes); when the sender has less than the
      provided limit of bytes in the buffer, the application may be
      notified. Notifications are not guaranteed, and it is optional
      for a transport system to support buffer limit values greater than
      0. Note that this limit and its notification should operate
      across the buffers of the whole transport system, i.e. also any
      potential buffers that the transport system itself may use on top
      of the transport's send buffer.

   Following Appendix A.3.7, these properties can be queried:

   o  The maximum message size that may be sent without fragmentation
      via the configured interface. This is optional for a transport
      system to offer, and may return an error ("not available"). It
      can aid applications implementing Path MTU Discovery.
   o  The maximum transport message size that can be sent, in bytes.
      Irrespective of fragmentation, there is a size limit for the
      messages that can be handed over to SCTP or UDP(-Lite); because
      the service provided by a transport system is independent of the
      transport protocol, it must allow an application to query this
      value -- the maximum size of a message in an Application-Framed-
      Bytestream (see Appendix A.3.1). This may also return an error
      when data is not delimited ("not available").
   o  The maximum transport message size that can be received from the
      configured interface, in bytes (or "not available").
   o  The maximum amount of data that can possibly be sent before or
      during connection establishment, in bytes.

   In addition to the already mentioned closing / aborting notifications
   and possible send errors, the following notifications can occur:

   o  Excessive Retransmissions: the configured (or a default) number of
      retransmissions has been reached, yielding this early warning
      below an abortion threshold.
   o  ICMP Arrival (parameter: ICMP message): an ICMP packet carrying
      the conveyed ICMP message has arrived.
   o  ECN Arrival (parameter: ECN value): a packet carrying the conveyed
      ECN value has arrived. This can be useful for applications
      implementing congestion control.
   o  Timeout (parameter: s seconds): data could not be delivered for s
      seconds.
   o  Drain: the send buffer has either drained below the configured
      buffer limit or it has become completely empty. This is a generic
      notification that tries to enable uniform access to
      "TCP_NOTSENT_LOWAT" as well as the "SENDER DRY" notification (as
      discussed in Appendix A.3.4 -- SCTP's "SENDER DRY" is a special
      case where the threshold (for unsent data) is 0 and there is also
      no more unacknowledged data in the send buffer).

3.2.2.  Individual connections

   Configure priority or weight for a scheduler, as described in
   [RFC8260].

   Configure checksum usage: this can be done with the following
   parameters, but there is no guarantee that any checksum limitations
   will indeed be enforced (the default behavior is "full coverage,
   checksum enabled"):

   o  A boolean to enable / disable usage of a checksum when sending
   o  The desired coverage (in bytes) of the checksum used when sending
   o  A boolean to enable / disable requiring a checksum when receiving
   o  The required minimum coverage (in bytes) of the checksum when
      receiving

3.3.  DATA Transfer

3.3.1.  Sending Data

   When sending a message, no guarantees are given about the
   preservation of message boundaries to the peer; if message boundaries
   are needed, the receiving application at the peer must know about
   them beforehand (or the transport system cannot use TCP). Note that
   an application should already be able to hand over data before the
   transport system establishes a connection with a chosen transport
   protocol.
   Regarding the message that is being handed over, the following
   parameters can be used:

   o  Reliability: this parameter is used to convey a choice of: fully
      reliable with congestion control (!UDP), unreliable without
      congestion control, unreliable with congestion control (!UDP),
      partially reliable with congestion control (see [RFC3758] and
      [RFC7496] for details on how to specify partial reliability)
      (!UDP). The latter two choices are optional for a transport
      system to offer and may result in full reliability. Note that
      applications sending unreliable data without congestion control
      should themselves perform congestion control in accordance with
      [RFC2914].
   o  (!UDP) Ordered: this boolean parameter lets an application choose
      between ordered message delivery (true) and possibly unordered,
      potentially faster message delivery (false).
   o  Bundle: a boolean that expresses a preference for allowing
      messages to be bundled (true) or not (false). No guarantees are
      given.
   o  DelAck: a boolean that, if false, lets an application request that
      the peer not delay the acknowledgement for this message.
   o  Fragment: a boolean that expresses a preference for allowing
      messages to be fragmented at the IP level (true) or not (false).
      No guarantees are given.
   o  (!UDP) Idempotent: a boolean that expresses whether a message is
      idempotent (true) or not (false). Idempotent messages may arrive
      multiple times at the receiver (but they will arrive at least
      once). When data is idempotent it can be used by the receiver
      immediately on a connection establishment attempt. Thus, if data
      is handed over before the transport system establishes a
      connection with a chosen transport protocol, stating that a
      message is idempotent facilitates transmitting it to the peer
      application particularly early.

   An application can be notified of a failure to send a specific
   message.
   There is no guarantee of such notifications, i.e. send failures can
   also silently occur.

3.3.2.  Receiving Data

   A receiving application obtains an "Application-Framed Bytestream"
   (AFra-Bytestream; this concept is further described in
   Appendix A.3.1). In line with TCP's receiver semantics, an AFra-
   Bytestream is just a stream of bytes to the receiver. If message
   boundaries were specified by the sender, a receiver-side transport
   system implementing only the minimum set of transport services
   defined here will still not inform the receiving application about
   them (this limitation is only needed for transport systems that are
   implemented to directly use TCP).

   Different from TCP's semantics, if the sending application has
   allowed that messages are not fully reliably transferred, or
   delivered out of order, then such re-ordering or unreliability may be
   reflected per message in the arriving data. Messages will always
   stay intact - i.e. if an incomplete message is contained at the end
   of the arriving data block, this message is guaranteed to continue in
   the next arriving data block.

4.  Conclusion

   By decoupling applications from transport protocols, a transport
   system provides a different abstraction level than the Berkeley
   sockets interface. As with high- vs. low-level programming
   languages, a higher abstraction level allows more freedom for
   automation below the interface, yet it takes some control away from
   the application programmer. This is the design trade-off that a
   transport system developer is facing, and this document provides
   guidance on the design of this abstraction level. Some transport
   features are currently rarely offered by APIs, yet they must be
   offered or they can never be used ("functional" transport features).
545 Other transport features are offered by the APIs of the protocols 546 covered here, but not exposing them in an API would allow for more 547 freedom to automate protocol usage in a transport system. The 548 minimal set presented in this document is an effort to find a middle 549 ground that can be recommended for transport systems to implement, on 550 the basis of the transport features discussed in [RFC8303]. 552 5. Acknowledgements 554 The authors would like to thank all the participants of the TAPS 555 Working Group and the NEAT and MAMI research projects for valuable 556 input to this document. We especially thank Michael Tuexen for help 557 with connection connection establishment/teardown and Gorry Fairhurst 558 for his suggestions regarding fragmentation and packet sizes, and 559 Spencer Dawkins for his extremely detailed and constructive review. 560 This work has received funding from the European Union's Horizon 2020 561 research and innovation programme under grant agreement No. 644334 562 (NEAT). 564 6. IANA Considerations 566 This memo includes no request to IANA. 568 7. Security Considerations 570 Authentication, confidentiality protection, and integrity protection 571 are identified as transport features by [RFC8095]. As currently 572 deployed in the Internet, these features are generally provided by a 573 protocol or layer on top of the transport protocol; no current full- 574 featured standards-track transport protocol provides all of these 575 transport features on its own. Therefore, these transport features 576 are not considered in this document, with the exception of native 577 authentication capabilities of TCP and SCTP for which the security 578 considerations in [RFC5925] and [RFC4895] apply. The minimum 579 requirements for a secure transport system are discussed in a 580 separate document (Section 5 on Security Features and Transport 581 Dependencies of [I-D.ietf-taps-transport-security]). 583 8. References 585 8.1. 
Normative References 587 [RFC8303] Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of 588 Transport Features Provided by IETF Transport Protocols", 589 RFC 8303, DOI 10.17487/RFC8303, February 2018. 592 8.2. Informative References 594 [COBS] Cheshire, S. and M. Baker, "Consistent Overhead Byte 595 Stuffing", September 1997. 598 [I-D.ietf-taps-transport-security] 599 Pauly, T., Perkins, C., Rose, K., and C. Wood, "A Survey 600 of Transport Security Protocols", draft-ietf-taps- 601 transport-security-01 (work in progress), May 2018. 603 [I-D.ietf-tsvwg-rtcweb-qos] 604 Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP 605 Packet Markings for WebRTC QoS", draft-ietf-tsvwg-rtcweb- 606 qos-18 (work in progress), August 2016. 608 [LBE-draft] 609 Bless, R., "A Lower Effort Per-Hop Behavior (LE PHB)", 610 Internet-draft draft-tsvwg-le-phb-03, February 2018. 612 [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, 613 RFC 2914, DOI 10.17487/RFC2914, September 2000. 616 [RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P. 617 Conrad, "Stream Control Transmission Protocol (SCTP) 618 Partial Reliability Extension", RFC 3758, 619 DOI 10.17487/RFC3758, May 2004. 622 [RFC4895] Tuexen, M., Stewart, R., Lei, P., and E. Rescorla, 623 "Authenticated Chunks for the Stream Control Transmission 624 Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895, August 625 2007. 627 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 628 Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007. 631 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 632 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 633 June 2010. 635 [RFC7305] Lear, E., Ed., "Report from the IAB Workshop on Internet 636 Technology Adoption and Transition (ITAT)", RFC 7305, 637 DOI 10.17487/RFC7305, July 2014. 640 [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A.
Jain, "TCP 641 Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014. 644 [RFC7496] Tuexen, M., Seggelmann, R., Stewart, R., and S. Loreto, 645 "Additional Policies for the Partially Reliable Stream 646 Control Transmission Protocol Extension", RFC 7496, 647 DOI 10.17487/RFC7496, April 2015. 650 [RFC8095] Fairhurst, G., Ed., Trammell, B., Ed., and M. Kuehlewind, 651 Ed., "Services Provided by IETF Transport Protocols and 652 Congestion Control Mechanisms", RFC 8095, 653 DOI 10.17487/RFC8095, March 2017. 656 [RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann, 657 "Stream Schedulers and User Message Interleaving for the 658 Stream Control Transmission Protocol", RFC 8260, 659 DOI 10.17487/RFC8260, November 2017. 662 [RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the 663 User Datagram Protocol (UDP) and Lightweight UDP (UDP- 664 Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018. 667 [WWDC2015] 668 Lakhera, P. and S. Cheshire, "Your App and Next Generation 669 Networks", Apple Worldwide Developers Conference 2015, San 670 Francisco, USA, June 2015. 673 Appendix A. Deriving the minimal set 675 We approach the construction of a minimal set of transport features 676 in the following way: 678 1. Categorization (Appendix A.1): the superset of transport features 679 from [RFC8303] is presented, and transport features are 680 categorized for later reduction. 681 2. Reduction (Appendix A.2): a shorter list of transport features is 682 derived from the categorization in the first step. This removes 683 all transport features that do not require application-specific 684 knowledge or would result in semantically incorrect behavior if 685 they were implemented over TCP or UDP. 686 3. Discussion (Appendix A.3): the resulting list shows a number of 687 peculiarities that are discussed, to provide a basis for 688 constructing the minimal set. 689 4.
Construction (Section 3): Based on the reduced set and the 690 discussion of the transport features therein, a minimal set is 691 constructed. 693 A.1. Step 1: Categorization -- The Superset of Transport Features 695 Following [RFC8303], we divide the transport features into two main 696 groups as follows: 698 1. CONNECTION related transport features 699 - ESTABLISHMENT 700 - AVAILABILITY 701 - MAINTENANCE 702 - TERMINATION 704 2. DATA Transfer related transport features 705 - Sending Data 706 - Receiving Data 707 - Errors 709 We assume that applications have no specific requirements that need 710 knowledge about the network, e.g. regarding the choice of network 711 interface or the end-to-end path. Even with these assumptions, there 712 are certain requirements that are strictly kept by transport 713 protocols today, and these must also be kept by a transport system. 715 Some of these requirements relate to transport features that we call 716 "Functional". 718 Functional transport features provide functionality that cannot be 719 used without the application knowing about them, or else they violate 720 assumptions that might cause the application to fail. For example, 721 ordered message delivery is a functional transport feature: it cannot 722 be configured without the application knowing about it because the 723 application's assumption could be that messages always arrive in 724 order. Failure includes any change of the application behavior that 725 is not performance oriented, e.g. security. 727 "Change DSCP" and "Disable Nagle algorithm" are examples of transport 728 features that we call "Optimizing": if a transport system 729 autonomously decides to enable or disable them, an application will 730 not fail, but a transport system may be able to communicate more 731 efficiently if the application is in control of this optimizing 732 transport feature. 
These transport features require application- 733 specific knowledge (e.g., about delay/bandwidth requirements or the 734 length of future data blocks that are to be transmitted). 736 The transport features of IETF transport protocols that do not 737 require application-specific knowledge and could therefore be 738 utilized by a transport system on its own without involving the 739 application are called "Automatable". 741 Finally, some transport features are aggregated and/or slightly 742 changed from [RFC8303] in the description below. These transport 743 features are marked as "ADDED". The corresponding transport features 744 are automatable, and they are listed immediately below the "ADDED" 745 transport feature. 747 In this description, transport services are presented following the 748 nomenclature "CATEGORY.[SUBCATEGORY].SERVICENAME.PROTOCOL", 749 equivalent to "pass 2" in [RFC8303]. We also sketch how functional 750 or optimizing transport features can be implemented by a transport 751 system. The "minimal set" derived in this document is meant to be 752 implementable "one-sided" over TCP, and, with limitations, UDP. 753 Hence, for all transport features that are categorized as 754 "functional" or "optimizing", and for which no matching TCP and/or 755 UDP primitive exists in "pass 2" of [RFC8303], a brief discussion on 756 how to implement them over TCP and/or UDP is included. 758 We designate some transport features as "automatable" on the basis of 759 a broader decision that affects multiple transport features: 761 o Most transport features that are related to multi-streaming were 762 designated as "automatable". This was done because the decision 763 on whether to use multi-streaming or not does not depend on 764 application-specific knowledge. 
This means that a connection that 765 is exhibited to an application could be implemented by using a 766 single stream of an SCTP association instead of mapping it to a 767 complete SCTP association or TCP connection. This could be 768 achieved by using more than one stream when an SCTP association is 769 first established (CONNECT.SCTP parameter "outbound stream 770 count"), maintaining an internal stream number, and using this 771 stream number when sending data (SEND.SCTP parameter "stream 772 number"). Closing or aborting a connection could then simply free 773 the stream number for future use. This is discussed further in 774 Appendix A.3.2. 775 o All transport features that are related to using multiple paths or 776 the choice of the network interface were designated as 777 "automatable". Choosing a path or an interface does not depend on 778 application-specific knowledge. For example, "Listen" could 779 always listen on all available interfaces and "Connect" could use 780 the default interface for the destination IP address. 782 A.1.1. CONNECTION Related Transport Features 784 ESTABLISHMENT: 786 o Connect 787 Protocols: TCP, SCTP, UDP(-Lite) 788 Functional because the notion of a connection is often reflected 789 in applications as an expectation to be able to communicate after 790 a "Connect" succeeded, with a communication sequence relating to 791 this transport feature that is defined by the application 792 protocol. 793 Implementation: via CONNECT.TCP, CONNECT.SCTP or CONNECT.UDP(- 794 Lite). 796 o Specify which IP Options must always be used 797 Protocols: TCP, UDP(-Lite) 798 Automatable because IP Options relate to knowledge about the 799 network, not the application. 801 o Request multiple streams 802 Protocols: SCTP 803 Automatable because using multi-streaming does not require 804 application-specific knowledge. 805 Implementation: see Appendix A.3.2. 
807 o Limit the number of inbound streams 808 Protocols: SCTP 809 Automatable because using multi-streaming does not require 810 application-specific knowledge. 811 Implementation: see Appendix A.3.2. 813 o Specify number of attempts and/or timeout for the first 814 establishment message 815 Protocols: TCP, SCTP 816 Functional because this is closely related to potentially assumed 817 reliable data delivery for data that is sent before or during 818 connection establishment. 819 Implementation: Using a parameter of CONNECT.TCP and CONNECT.SCTP. 820 Implementation over UDP: Do nothing (this is irrelevant in case of 821 UDP because there, reliable data delivery is not assumed). 823 o Obtain multiple sockets 824 Protocols: SCTP 825 Automatable because the usage of multiple paths to communicate to 826 the same end host relates to knowledge about the network, not the 827 application. 829 o Disable MPTCP 830 Protocols: MPTCP 831 Automatable because the usage of multiple paths to communicate to 832 the same end host relates to knowledge about the network, not the 833 application. 834 Implementation: via a boolean parameter in CONNECT.MPTCP. 836 o Configure authentication 837 Protocols: TCP, SCTP 838 Functional because this has a direct influence on security. 839 Implementation: via parameters in CONNECT.TCP and CONNECT.SCTP. 840 With TCP, this allows to configure Master Key Tuples (MKTs) to 841 authenticate complete segments (including the TCP IPv4 842 pseudoheader, TCP header, and TCP data). With SCTP, this allows 843 to specify which chunk types must always be authenticated. 844 Authenticating only certain chunk types creates a reduced level of 845 security that is not supported by TCP; to be compatible, this 846 should therefore only allow to authenticate all chunk types. Key 847 material must be provided in a way that is compatible with both 848 [RFC4895] and [RFC5925]. 850 Implementation over UDP: Not possible (UDP does not offer this 851 functionality). 
853 o Indicate (and/or obtain upon completion) an Adaptation Layer via 854 an adaptation code point 855 Protocols: SCTP 856 Functional because it allows to send extra data for the sake of 857 identifying an adaptation layer, which by itself is application- 858 specific. 859 Implementation: via a parameter in CONNECT.SCTP. 860 Implementation over TCP: not possible (TCP does not offer this 861 functionality). 862 Implementation over UDP: not possible (UDP does not offer this 863 functionality). 865 o Request to negotiate interleaving of user messages 866 Protocols: SCTP 867 Automatable because it requires using multiple streams, but 868 requesting multiple streams in the CONNECTION.ESTABLISHMENT 869 category is automatable. 870 Implementation: via a parameter in CONNECT.SCTP. 872 o Hand over a message to reliably transfer (possibly multiple times) 873 before connection establishment 874 Protocols: TCP 875 Functional because this is closely tied to properties of the data 876 that an application sends or expects to receive. 877 Implementation: via a parameter in CONNECT.TCP. 878 Implementation over UDP: not possible (UDP does not provide 879 reliability). 881 o Hand over a message to reliably transfer during connection 882 establishment 883 Protocols: SCTP 884 Functional because this can only work if the message is limited in 885 size, making it closely tied to properties of the data that an 886 application sends or expects to receive. 887 Implementation: via a parameter in CONNECT.SCTP. 888 Implementation over TCP: not possible (TCP does not allow 889 identification of message boundaries because it provides a byte 890 stream service) 891 Implementation over UDP: not possible (UDP is unreliable). 893 o Enable UDP encapsulation with a specified remote UDP port number 894 Protocols: SCTP 895 Automatable because UDP encapsulation relates to knowledge about 896 the network, not the application. 
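The ESTABLISHMENT categorization above can be written down as data, which makes the first half of the later reduction step (dropping automatable transport features) easy to see. The following is only an illustrative sketch: the names Category, Feature, ESTABLISHMENT and not_automatable are hypothetical and do not appear in this document or in [RFC8303]. The sketch also does not model the second reduction criterion (semantically incorrect behavior over TCP or UDP).

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    FUNCTIONAL = "functional"
    OPTIMIZING = "optimizing"
    AUTOMATABLE = "automatable"


@dataclass(frozen=True)
class Feature:
    name: str
    protocols: tuple
    category: Category


F, A = Category.FUNCTIONAL, Category.AUTOMATABLE

# ESTABLISHMENT transport features with the categories assigned in the list above.
ESTABLISHMENT = [
    Feature("Connect", ("TCP", "SCTP", "UDP(-Lite)"), F),
    Feature("Specify which IP Options must always be used", ("TCP", "UDP(-Lite)"), A),
    Feature("Request multiple streams", ("SCTP",), A),
    Feature("Limit the number of inbound streams", ("SCTP",), A),
    Feature("Specify number of attempts and/or timeout for the first "
            "establishment message", ("TCP", "SCTP"), F),
    Feature("Obtain multiple sockets", ("SCTP",), A),
    Feature("Disable MPTCP", ("MPTCP",), A),
    Feature("Configure authentication", ("TCP", "SCTP"), F),
    Feature("Indicate (and/or obtain upon completion) an Adaptation Layer "
            "via an adaptation code point", ("SCTP",), F),
    Feature("Request to negotiate interleaving of user messages", ("SCTP",), A),
    Feature("Hand over a message to reliably transfer (possibly multiple "
            "times) before connection establishment", ("TCP",), F),
    Feature("Hand over a message to reliably transfer during connection "
            "establishment", ("SCTP",), F),
    Feature("Enable UDP encapsulation with a specified remote UDP port "
            "number", ("SCTP",), A),
]


def not_automatable(features):
    """Return the features that a transport system cannot handle on its own."""
    return [f for f in features if f.category is not Category.AUTOMATABLE]


if __name__ == "__main__":
    for feature in not_automatable(ESTABLISHMENT):
        print(feature.category.value, "-", feature.name)
```

Keeping the category next to each feature name mirrors how the list is written; a transport system would presumably only need to expose the entries that this filter keeps.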
898 AVAILABILITY: 900 o Listen 901 Protocols: TCP, SCTP, UDP(-Lite) 902 Functional because the notion of accepting connection requests is 903 often reflected in applications as an expectation to be able to 904 communicate after a "Listen" succeeded, with a communication 905 sequence relating to this transport feature that is defined by the 906 application protocol. 907 ADDED. This differs from the 3 automatable transport features 908 below in that it leaves the choice of interfaces for listening 909 open. 910 Implementation: by listening on all interfaces via LISTEN.TCP (not 911 providing a local IP address) or LISTEN.SCTP (providing SCTP port 912 number / address pairs for all local IP addresses). LISTEN.UDP(- 913 Lite) supports both methods. 915 o Listen, 1 specified local interface 916 Protocols: TCP, SCTP, UDP(-Lite) 917 Automatable because decisions about local interfaces relate to 918 knowledge about the network and the Operating System, not the 919 application. 921 o Listen, N specified local interfaces 922 Protocols: SCTP 923 Automatable because decisions about local interfaces relate to 924 knowledge about the network and the Operating System, not the 925 application. 927 o Listen, all local interfaces 928 Protocols: TCP, SCTP, UDP(-Lite) 929 Automatable because decisions about local interfaces relate to 930 knowledge about the network and the Operating System, not the 931 application. 933 o Specify which IP Options must always be used 934 Protocols: TCP, UDP(-Lite) 935 Automatable because IP Options relate to knowledge about the 936 network, not the application. 938 o Disable MPTCP 939 Protocols: MPTCP 940 Automatable because the usage of multiple paths to communicate to 941 the same end host relates to knowledge about the network, not the 942 application. 944 o Configure authentication 945 Protocols: TCP, SCTP 946 Functional because this has a direct influence on security. 947 Implementation: via parameters in LISTEN.TCP and LISTEN.SCTP. 
948 Implementation over TCP: With TCP, this allows to configure Master 949 Key Tuples (MKTs) to authenticate complete segments (including the 950 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 951 allows to specify which chunk types must always be authenticated. 952 Authenticating only certain chunk types creates a reduced level of 953 security that is not supported by TCP; to be compatible, this 954 should therefore only allow to authenticate all chunk types. Key 955 material must be provided in a way that is compatible with both 956 [RFC4895] and [RFC5925]. 957 Implementation over UDP: not possible (UDP does not offer 958 authentication). 960 o Obtain requested number of streams 961 Protocols: SCTP 962 Automatable because using multi-streaming does not require 963 application-specific knowledge. 964 Implementation: see Appendix A.3.2. 966 o Limit the number of inbound streams 967 Protocols: SCTP 968 Automatable because using multi-streaming does not require 969 application-specific knowledge. 970 Implementation: see Appendix A.3.2. 972 o Indicate (and/or obtain upon completion) an Adaptation Layer via 973 an adaptation code point 974 Protocols: SCTP 975 Functional because it allows to send extra data for the sake of 976 identifying an adaptation layer, which by itself is application- 977 specific. 978 Implementation: via a parameter in LISTEN.SCTP. 979 Implementation over TCP: not possible (TCP does not offer this 980 functionality). 981 Implementation over UDP: not possible (UDP does not offer this 982 functionality). 984 o Request to negotiate interleaving of user messages 985 Protocols: SCTP 986 Automatable because it requires using multiple streams, but 987 requesting multiple streams in the CONNECTION.ESTABLISHMENT 988 category is automatable. 989 Implementation: via a parameter in LISTEN.SCTP. 
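As an illustration of the "Listen" entry above (listening on all interfaces by not providing a local IP address), the following minimal sketch uses the Berkeley sockets interface as exposed by Python's standard socket module. It assumes IPv4 and TCP only; the function name listen_all_interfaces is hypothetical, and the mapping of LISTEN.TCP onto bind() and listen() is an assumption made for this example, not something this document specifies.

```python
import socket


def listen_all_interfaces(port=0, backlog=5):
    """Return a TCP listening socket that is not tied to one local IP address.

    port=0 lets the operating system pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # An empty host string stands for INADDR_ANY, i.e. no specific local
    # IP address is provided and all local IPv4 interfaces are covered.
    sock.bind(("", port))
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    listener = listen_all_interfaces()
    print("listening on", listener.getsockname())
    listener.close()
```

The application never names an interface here, which matches the reasoning that decisions about local interfaces relate to knowledge about the network and the Operating System.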
991 MAINTENANCE: 993 o Change timeout for aborting connection (using retransmit limit or 994 time value) 995 Protocols: TCP, SCTP 996 Functional because this is closely related to potentially assumed 997 reliable data delivery. 998 Implementation: via CHANGE_TIMEOUT.TCP or CHANGE_TIMEOUT.SCTP. 999 Implementation over UDP: not possible (UDP is unreliable and there 1000 is no connection timeout). 1002 o Suggest timeout to the peer 1003 Protocols: TCP 1004 Functional because this is closely related to potentially assumed 1005 reliable data delivery. 1006 Implementation: via CHANGE_TIMEOUT.TCP. 1007 Implementation over UDP: not possible (UDP is unreliable and there 1008 is no connection timeout). 1010 o Disable Nagle algorithm 1011 Protocols: TCP, SCTP 1012 Optimizing because this decision depends on knowledge about the 1013 size of future data blocks and the delay between them. 1015 Implementation: via DISABLE_NAGLE.TCP and DISABLE_NAGLE.SCTP. 1016 Implementation over UDP: do nothing (UDP does not implement the 1017 Nagle algorithm). 1019 o Request an immediate heartbeat, returning success/failure 1020 Protocols: SCTP 1021 Automatable because this informs about network-specific knowledge. 1023 o Notification of Excessive Retransmissions (early warning below 1024 abortion threshold) 1025 Protocols: TCP 1026 Optimizing because it is an early warning to the application, 1027 informing it of an impending functional event. 1028 Implementation: via ERROR.TCP. 1029 Implementation over UDP: do nothing (there is no abortion 1030 threshold). 1032 o Add path 1033 Protocols: MPTCP, SCTP 1034 MPTCP Parameters: source-IP; source-Port; destination-IP; 1035 destination-Port 1036 SCTP Parameters: local IP address 1037 Automatable because the usage of multiple paths to communicate to 1038 the same end host relates to knowledge about the network, not the 1039 application. 
1041 o Remove path 1042 Protocols: MPTCP, SCTP 1043 MPTCP Parameters: source-IP; source-Port; destination-IP; 1044 destination-Port 1045 SCTP Parameters: local IP address 1046 Automatable because the usage of multiple paths to communicate to 1047 the same end host relates to knowledge about the network, not the 1048 application. 1050 o Set primary path 1051 Protocols: SCTP 1052 Automatable because the usage of multiple paths to communicate to 1053 the same end host relates to knowledge about the network, not the 1054 application. 1056 o Suggest primary path to the peer 1057 Protocols: SCTP 1058 Automatable because the usage of multiple paths to communicate to 1059 the same end host relates to knowledge about the network, not the 1060 application. 1062 o Configure Path Switchover 1063 Protocols: SCTP 1064 Automatable because the usage of multiple paths to communicate to 1065 the same end host relates to knowledge about the network, not the 1066 application. 1068 o Obtain status (query or notification) 1069 Protocols: SCTP, MPTCP 1070 SCTP parameters: association connection state; destination 1071 transport address list; destination transport address reachability 1072 states; current local and peer receiver window size; current local 1073 congestion window sizes; number of unacknowledged DATA chunks; 1074 number of DATA chunks pending receipt; primary path; most recent 1075 SRTT on primary path; RTO on primary path; SRTT and RTO on other 1076 destination addresses; MTU per path; interleaving supported yes/no 1077 MPTCP parameters: subflow-list (identified by source-IP; source- 1078 Port; destination-IP; destination-Port) 1079 Automatable because these parameters relate to knowledge about the 1080 network, not the application. 1082 o Specify DSCP field 1083 Protocols: TCP, SCTP, UDP(-Lite) 1084 Optimizing because choosing a suitable DSCP value requires 1085 application-specific knowledge. 
1086 Implementation: via SET_DSCP.TCP / SET_DSCP.SCTP / SET_DSCP.UDP(- 1087 Lite) 1089 o Notification of ICMP error message arrival 1090 Protocols: TCP, UDP(-Lite) 1091 Optimizing because these messages can inform about success or 1092 failure of functional transport features (e.g., host unreachable 1093 relates to "Connect") 1094 Implementation: via ERROR.TCP or ERROR.UDP(-Lite). 1096 o Obtain information about interleaving support 1097 Protocols: SCTP 1098 Automatable because it requires using multiple streams, but 1099 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1100 category is automatable. 1101 Implementation: via STATUS.SCTP. 1103 o Change authentication parameters 1104 Protocols: TCP, SCTP 1105 Functional because this has a direct influence on security. 1106 Implementation: via SET_AUTH.TCP and SET_AUTH.SCTP. 1107 Implementation over TCP: With SCTP, this allows to adjust key_id, 1108 key, and hmac_id. With TCP, this allows to change the preferred 1109 outgoing MKT (current_key) and the preferred incoming MKT 1110 (rnext_key), respectively, for a segment that is sent on the 1111 connection. Key material must be provided in a way that is 1112 compatible with both [RFC4895] and [RFC5925]. 1113 Implementation over UDP: not possible (UDP does not offer 1114 authentication). 1116 o Obtain authentication information 1117 Protocols: SCTP 1118 Functional because authentication decisions may have been made by 1119 the peer, and this has an influence on the necessary application- 1120 level measures to provide a certain level of security. 1121 Implementation: via GET_AUTH.SCTP. 1122 Implementation over TCP: With SCTP, this allows to obtain key_id 1123 and a chunk list. With TCP, this allows to obtain current_key and 1124 rnext_key from a previously received segment. Key material must 1125 be provided in a way that is compatible with both [RFC4895] and 1126 [RFC5925]. 1127 Implementation over UDP: not possible (UDP does not offer 1128 authentication). 
1130 o Reset Stream 1131 Protocols: SCTP 1132 Automatable because using multi-streaming does not require 1133 application-specific knowledge. 1134 Implementation: see Appendix A.3.2. 1136 o Notification of Stream Reset 1137 Protocols: SCTP 1138 Automatable because using multi-streaming does not require 1139 application-specific knowledge. 1140 Implementation: see Appendix A.3.2. 1142 o Reset Association 1143 Protocols: SCTP 1144 Automatable because deciding to reset an association does not 1145 require application-specific knowledge. 1146 Implementation: via RESET_ASSOC.SCTP. 1148 o Notification of Association Reset 1149 Protocols: SCTP 1150 Automatable because this notification does not relate to 1151 application-specific knowledge. 1153 o Add Streams 1154 Protocols: SCTP 1155 Automatable because using multi-streaming does not require 1156 application-specific knowledge. 1157 Implementation: see Appendix A.3.2. 1159 o Notification of Added Stream 1160 Protocols: SCTP 1161 Automatable because using multi-streaming does not require 1162 application-specific knowledge. 1163 Implementation: see Appendix A.3.2. 1165 o Choose a scheduler to operate between streams of an association 1166 Protocols: SCTP 1167 Optimizing because the scheduling decision requires application- 1168 specific knowledge. However, if a transport system would not use 1169 this, or wrongly configure it on its own, this would only affect 1170 the performance of data transfers; the outcome would still be 1171 correct within the "best effort" service model. 1172 Implementation: using SET_STREAM_SCHEDULER.SCTP. 1173 Implementation over TCP: do nothing (streams are not available in 1174 TCP, but no guarantee is given that this transport feature has any 1175 effect). 1176 Implementation over UDP: do nothing (streams are not available in 1177 UDP, but no guarantee is given that this transport feature has any 1178 effect).
1180 o Configure priority or weight for a scheduler 1181 Protocols: SCTP 1182 Optimizing because the priority or weight requires application- 1183 specific knowledge. However, if a transport system would not use 1184 this, or wrongly configure it on its own, this would only affect 1185 the performance of data transfers; the outcome would still be 1186 correct within the "best effort" service model. 1187 Implementation: using CONFIGURE_STREAM_SCHEDULER.SCTP. 1188 Implementation over TCP: do nothing (streams are not available in 1189 TCP, but no guarantee is given that this transport feature has any 1190 effect). 1191 Implementation over UDP: do nothing (streams are not available in 1192 UDP, but no guarantee is given that this transport feature has any 1193 effect). 1195 o Configure send buffer size 1196 Protocols: SCTP 1197 Automatable because this decision relates to knowledge about the 1198 network and the Operating System, not the application (see also 1199 the discussion in Appendix A.3.4). 1201 o Configure receive buffer (and rwnd) size 1202 Protocols: SCTP 1203 Automatable because this decision relates to knowledge about the 1204 network and the Operating System, not the application. 1206 o Configure message fragmentation 1207 Protocols: SCTP 1208 Automatable because fragmentation relates to knowledge about the 1209 network and the Operating System, not the application. 1211 Implementation: by always enabling it with 1212 CONFIG_FRAGMENTATION.SCTP and auto-setting the fragmentation size 1213 based on network or Operating System conditions. 1215 o Configure PMTUD 1216 Protocols: SCTP 1217 Automatable because Path MTU Discovery relates to knowledge about 1218 the network, not the application. 
1220 o Configure delayed SACK timer 1221 Protocols: SCTP 1222 Automatable because the receiver-side decision to delay sending 1223 SACKs relates to knowledge about the network, not the application 1224 (it can be relevant for a sending application to request not to 1225 delay the SACK of a message, but this is a different transport 1226 feature). 1228 o Set Cookie life value 1229 Protocols: SCTP 1230 Functional because it relates to security (possibly weakened by 1231 keeping a cookie very long) versus the time between connection 1232 establishment attempts. Knowledge about both issues can be 1233 application-specific. 1234 Implementation over TCP: the closest specified TCP functionality 1235 is the cookie in TCP Fast Open; for this, [RFC7413] states that 1236 the server "can expire the cookie at any time to enhance security" 1237 and section 4.1.2 describes an example implementation where 1238 updating the key on the server side causes the cookie to expire. 1239 Alternatively, for implementations that do not support TCP Fast 1240 Open, this transport feature could also affect the validity of SYN 1241 cookies (see Section 3.6 of [RFC4987]). 1242 Implementation over UDP: not possible (UDP does not offer this 1243 functionality). 1245 o Set maximum burst 1246 Protocols: SCTP 1247 Automatable because it relates to knowledge about the network, not 1248 the application. 1250 o Configure size where messages are broken up for partial delivery 1251 Protocols: SCTP 1252 Functional because this is closely tied to properties of the data 1253 that an application sends or expects to receive. 1254 Implementation over TCP: not possible (TCP does not offer 1255 identification of message boundaries). 1256 Implementation over UDP: not possible (UDP does not fragment 1257 messages). 1259 o Disable checksum when sending 1260 Protocols: UDP 1261 Functional because application-specific knowledge is necessary to 1262 decide whether it can be acceptable to lose data integrity. 
1263 Implementation: via SET_CHECKSUM_ENABLED.UDP. 1264 Implementation over TCP: do nothing (TCP does not offer to disable 1265 the checksum, but transmitting data with an intact checksum will 1266 not yield a semantically wrong result). 1268 o Disable checksum requirement when receiving 1269 Protocols: UDP 1270 Functional because application-specific knowledge is necessary to 1271 decide whether it can be acceptable to lose data integrity. 1272 Implementation: via SET_CHECKSUM_REQUIRED.UDP. 1273 Implementation over TCP: do nothing (TCP does not offer to disable 1274 the checksum, but transmitting data with an intact checksum will 1275 not yield a semantically wrong result). 1277 o Specify checksum coverage used by the sender 1278 Protocols: UDP-Lite 1279 Functional because application-specific knowledge is necessary to 1280 decide for which parts of the data it can be acceptable to lose 1281 data integrity. 1282 Implementation: via SET_CHECKSUM_COVERAGE.UDP-Lite. 1283 Implementation over TCP: do nothing (TCP does not offer to limit 1284 the checksum length, but transmitting data with an intact checksum 1285 will not yield a semantically wrong result). 1286 Implementation over UDP: if checksum coverage is set to cover 1287 payload data, do nothing. Else, either do nothing (transmitting 1288 data with an intact checksum will not yield a semantically wrong 1289 result), or use the transport feature "Disable checksum when 1290 sending". 1292 o Specify minimum checksum coverage required by receiver 1293 Protocols: UDP-Lite 1294 Functional because application-specific knowledge is necessary to 1295 decide for which parts of the data it can be acceptable to lose 1296 data integrity. 1297 Implementation: via SET_MIN_CHECKSUM_COVERAGE.UDP-Lite. 1298 Implementation over TCP: do nothing (TCP does not offer to limit 1299 the checksum length, but transmitting data with an intact checksum 1300 will not yield a semantically wrong result). 
1301 Implementation over UDP: if checksum coverage is set to cover 1302 payload data, do nothing. Else, either do nothing (transmitting 1303 data with an intact checksum will not yield a semantically wrong 1304 result), or use the transport feature "Disable checksum 1305 requirement when receiving". 1307 o Specify DF field 1308 Protocols: UDP(-Lite) 1309 Optimizing because the DF field can be used to carry out Path MTU 1310 Discovery, which can lead an application to choose message sizes 1311 that can be transmitted more efficiently. 1312 Implementation: via MAINTENANCE.SET_DF.UDP(-Lite) and 1313 SEND_FAILURE.UDP(-Lite). 1314 Implementation over TCP: do nothing (with TCP, the sending 1315 application is not in control of transport message sizes, making 1316 this functionality irrelevant). 1318 o Get max. transport-message size that may be sent using a non- 1319 fragmented IP packet from the configured interface 1320 Protocols: UDP(-Lite) 1321 Optimizing because this can lead an application to choose message 1322 sizes that can be transmitted more efficiently. 1323 Implementation over TCP: do nothing (this information is not 1324 available with TCP). 1326 o Get max. transport-message size that may be received from the 1327 configured interface 1328 Protocols: UDP(-Lite) 1329 Optimizing because this can, for example, influence an 1330 application's memory management. 1331 Implementation over TCP: do nothing (this information is not 1332 available with TCP). 1334 o Specify TTL/Hop count field 1335 Protocols: UDP(-Lite) 1336 Automatable because a transport system can use a large enough 1337 system default to avoid communication failures. Allowing an 1338 application to configure it differently can produce notifications 1339 of ICMP error message arrivals that yield information which only 1340 relates to knowledge about the network, not the application. 
1342 o Obtain TTL/Hop count field 1343 Protocols: UDP(-Lite) 1344 Automatable because the TTL/Hop count field relates to knowledge 1345 about the network, not the application. 1347 o Specify ECN field 1348 Protocols: UDP(-Lite) 1349 Automatable because the ECN field relates to knowledge about the 1350 network, not the application. 1352 o Obtain ECN field 1353 Protocols: UDP(-Lite) 1354 Optimizing because this information can be used by an application 1355 to better carry out congestion control (this is relevant when 1356 choosing a data transmission transport service that does not 1357 already do congestion control). 1358 Implementation over TCP: do nothing (this information is not 1359 available with TCP). 1361 o Specify IP Options 1362 Protocols: UDP(-Lite) 1363 Automatable because IP Options relate to knowledge about the 1364 network, not the application. 1366 o Obtain IP Options 1367 Protocols: UDP(-Lite) 1368 Automatable because IP Options relate to knowledge about the 1369 network, not the application. 1371 o Enable and configure a "Low Extra Delay Background Transfer" 1372 Protocols: A protocol implementing the LEDBAT congestion control 1373 mechanism 1374 Optimizing because whether this service is appropriate or not 1375 depends on application-specific knowledge. However, wrongly using 1376 this will only affect the speed of data transfers (albeit 1377 including other transfers that may compete with the transport 1378 system's transfer in the network), so it is still correct within 1379 the "best effort" service model. 1380 Implementation: via CONFIGURE.LEDBAT and/or SET_DSCP.TCP / 1381 SET_DSCP.SCTP / SET_DSCP.UDP(-Lite) [LBE-draft]. 1382 Implementation over TCP: do nothing (TCP does not support LEDBAT 1383 congestion control, but not implementing this functionality will 1384 not yield a semantically wrong behavior). 1385 Implementation over UDP: do nothing (UDP does not offer congestion 1386 control). 
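The SET_DSCP-based implementation option named in the last item above can be illustrated for a sockets API. This is a minimal sketch, assuming a platform whose sockets expose the IP_TOS option for IPv4. The helper names dscp_to_tos and mark_background are hypothetical, and the DSCP value 1 (the Lower-Effort PHB codepoint of RFC 8622) is an assumption; this document does not prescribe a value.

```python
import socket

# Assumption: DSCP 000001, the Lower-Effort PHB codepoint from RFC 8622.
LE_DSCP = 0b000001


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP in the upper six bits of the TOS byte.

    The two low-order (ECN) bits are left at zero.
    """
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def mark_background(sock: socket.socket, dscp: int = LE_DSCP) -> None:
    """Hypothetical helper: request a scavenger-style DSCP on an IPv4 socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
```

An application (or a transport system acting on its behalf) would call mark_background(sock) once after creating the socket. Whether the network honors the marking is outside the sender's control, which is consistent with the "best effort" reasoning given above.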
1388 TERMINATION: 1390 o Close after reliably delivering all remaining data, causing an 1391 event informing the application on the other side 1392 Protocols: TCP, SCTP 1393 Functional because the notion of a connection is often reflected 1394 in applications as an expectation to have all outstanding data 1395 delivered and no longer be able to communicate after a "Close" 1396 succeeded, with a communication sequence relating to this 1397 transport feature that is defined by the application protocol. 1398 Implementation: via CLOSE.TCP and CLOSE.SCTP. 1399 Implementation over UDP: not possible (UDP is unreliable and hence 1400 does not know when all remaining data is delivered; it does also 1401 not offer to cause an event related to closing at the peer). 1403 o Abort without delivering remaining data, causing an event 1404 informing the application on the other side 1405 Protocols: TCP, SCTP 1406 Functional because the notion of a connection is often reflected 1407 in applications as an expectation to potentially not have all 1408 outstanding data delivered and no longer be able to communicate 1409 after an "Abort" succeeded. On both sides of a connection, an 1410 application protocol may define a communication sequence relating 1411 to this transport feature. 1412 Implementation: via ABORT.TCP and ABORT.SCTP. 1414 Implementation over UDP: not possible (UDP does not offer to cause 1415 an event related to aborting at the peer). 1417 o Abort without delivering remaining data, not causing an event 1418 informing the application on the other side 1419 Protocols: UDP(-Lite) 1420 Functional because the notion of a connection is often reflected 1421 in applications as an expectation to potentially not have all 1422 outstanding data delivered and no longer be able to communicate 1423 after an "Abort" succeeded. On both sides of a connection, an 1424 application protocol may define a communication sequence relating 1425 to this transport feature. 
1426 Implementation: via ABORT.UDP(-Lite). 1427 Implementation over TCP: stop using the connection, wait for a 1428 timeout. 1430 o Timeout event when data could not be delivered for too long 1431 Protocols: TCP, SCTP 1432 Functional because this notifies that potentially assumed reliable 1433 data delivery is no longer provided. 1434 Implementation: via TIMEOUT.TCP and TIMEOUT.SCTP. 1435 Implementation over UDP: do nothing (this event will not occur 1436 with UDP). 1438 A.1.2. DATA Transfer Related Transport Features 1440 A.1.2.1. Sending Data 1442 o Reliably transfer data, with congestion control 1443 Protocols: TCP, SCTP 1444 Functional because this is closely tied to properties of the data 1445 that an application sends or expects to receive. 1446 Implementation: via SEND.TCP and SEND.SCTP. 1447 Implementation over UDP: not possible (UDP is unreliable). 1449 o Reliably transfer a message, with congestion control 1450 Protocols: SCTP 1451 Functional because this is closely tied to properties of the data 1452 that an application sends or expects to receive. 1454 Implementation: via SEND.SCTP. 1455 Implementation over TCP: via SEND.TCP. With SEND.TCP, message 1456 boundaries will not be identifiable by the receiver, because TCP 1457 provides a byte stream service. 1458 Implementation over UDP: not possible (UDP is unreliable). 1460 o Unreliably transfer a message 1461 Protocols: SCTP, UDP(-Lite) 1462 Optimizing because only applications know about the time 1463 criticality of their communication, and reliably transferring a 1464 message is never incorrect for the receiver of a potentially 1465 unreliable data transfer; it is just slower. 1466 ADDED. This differs from the two automatable transport features 1467 below in that it leaves the choice of congestion control open. 1468 Implementation: via SEND.SCTP or SEND.UDP(-Lite). 1469 Implementation over TCP: use SEND.TCP.
With SEND.TCP, messages 1470 will be sent reliably, and message boundaries will not be 1471 identifiable by the receiver. 1473 o Unreliably transfer a message, with congestion control 1474 Protocols: SCTP 1475 Automatable because congestion control relates to knowledge about 1476 the network, not the application. 1478 o Unreliably transfer a message, without congestion control 1479 Protocols: UDP(-Lite) 1480 Automatable because congestion control relates to knowledge about 1481 the network, not the application. 1483 o Configurable Message Reliability 1484 Protocols: SCTP 1485 Optimizing because only applications know about the time 1486 criticality of their communication, and reliably transferring a 1487 message is never incorrect for the receiver of a potentially 1488 unreliable data transfer; it is just slower. 1489 Implementation: via SEND.SCTP. 1490 Implementation over TCP: By using SEND.TCP and ignoring this 1491 configuration: based on the assumption of the best-effort service 1492 model, unnecessarily delivering data does not violate application 1493 expectations. Moreover, it is not possible to associate the 1494 requested reliability with a "message" in TCP anyway. 1495 Implementation over UDP: not possible (UDP is unreliable). 1497 o Choice of stream 1498 Protocols: SCTP 1499 Automatable because it requires using multiple streams, but 1500 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1501 category is automatable. Implementation: see Appendix A.3.2. 1503 o Choice of path (destination address) 1504 Protocols: SCTP 1505 Automatable because it requires using multiple sockets, but 1506 obtaining multiple sockets in the CONNECTION.ESTABLISHMENT 1507 category is automatable. 1509 o Ordered message delivery (potentially slower than unordered) 1510 Protocols: SCTP 1511 Functional because this is closely tied to properties of the data 1512 that an application sends or expects to receive. 1513 Implementation: via SEND.SCTP.
1514 Implementation over TCP: By using SEND.TCP. With SEND.TCP, 1515 messages will not be identifiable by the receiver. 1516 Implementation over UDP: not possible (UDP does not offer any 1517 guarantees regarding ordering). 1519 o Unordered message delivery (potentially faster than ordered) 1520 Protocols: SCTP, UDP(-Lite) 1521 Functional because this is closely tied to properties of the data 1522 that an application sends or expects to receive. 1523 Implementation: via SEND.SCTP. 1524 Implementation over TCP: By using SEND.TCP and always sending data 1525 ordered: based on the assumption of the best-effort service model, 1526 ordered delivery may just be slower and does not violate 1527 application expectations. Moreover, it is not possible to 1528 associate the requested delivery order with a "message" in TCP 1529 anyway. 1531 o Request not to bundle messages 1532 Protocols: SCTP 1533 Optimizing because this decision depends on knowledge about the 1534 size of future data blocks and the delay between them. 1535 Implementation: via SEND.SCTP. 1536 Implementation over TCP: By using SEND.TCP and DISABLE_NAGLE.TCP 1537 to disable the Nagle algorithm when the request is made and enable 1538 it again when the request is no longer made. Note that this is 1539 not fully equivalent because it relates to the time of issuing the 1540 request rather than a specific message. 1541 Implementation over UDP: do nothing (UDP never bundles messages). 1543 o Specifying a "payload protocol-id" (handed over as such by the 1544 receiver) 1545 Protocols: SCTP 1546 Functional because it allows sending extra application data with 1547 every message, for the sake of identification of data, which by 1548 itself is application-specific. 1549 Implementation: via SEND.SCTP. 1550 Implementation over TCP: not possible (this functionality is not 1551 available in TCP). 1552 Implementation over UDP: not possible (this functionality is not 1553 available in UDP).
1555 o Specifying a key id to be used to authenticate a message 1556 Protocols: SCTP 1557 Functional because this has a direct influence on security. 1558 Implementation: via a parameter in SEND.SCTP. 1559 Implementation over TCP: This could be emulated by using 1560 SET_AUTH.TCP before and after the message is sent. Note that this 1561 is not fully equivalent because it relates to the time of issuing 1562 the request rather than a specific message. 1563 Implementation over UDP: not possible (UDP does not offer 1564 authentication). 1566 o Request not to delay the acknowledgement (SACK) of a message 1567 Protocols: SCTP 1568 Optimizing because only an application knows for which message it 1569 wants to quickly be informed about success / failure of its 1570 delivery. 1571 Implementation over TCP: do nothing (TCP does not offer this 1572 functionality, but ignoring this request from the application will 1573 not yield a semantically wrong behavior). 1575 Implementation over UDP: do nothing (UDP does not offer this 1576 functionality, but ignoring this request from the application will 1577 not yield a semantically wrong behavior). 1579 A.1.2.2. Receiving Data 1581 o Receive data (with no message delimiting) 1582 Protocols: TCP 1583 Functional because a transport system must be able to send and 1584 receive data. 1585 Implementation: via RECEIVE.TCP. 1586 Implementation over UDP: do nothing (UDP only works on messages; 1587 these can be handed over, the application can still ignore the 1588 message boundaries). 1590 o Receive a message 1591 Protocols: SCTP, UDP(-Lite) 1592 Functional because this is closely tied to properties of the data 1593 that an application sends or expects to receive. 1594 Implementation: via RECEIVE.SCTP and RECEIVE.UDP(-Lite). 1595 Implementation over TCP: not possible (TCP does not support 1596 identification of message boundaries). 
1598 o Choice of stream to receive from 1599 Protocols: SCTP 1600 Automatable because it requires using multiple streams, but 1601 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1602 category is automatable. 1603 Implementation: see Appendix A.3.2. 1605 o Information about partial message arrival 1606 Protocols: SCTP 1607 Functional because this is closely tied to properties of the data 1608 that an application sends or expects to receive. 1609 Implementation: via RECEIVE.SCTP. 1610 Implementation over TCP: do nothing (this information is not 1611 available with TCP). 1612 Implementation over UDP: do nothing (this information is not 1613 available with UDP). 1615 A.1.2.3. Errors 1617 This section describes sending failures that are associated with a 1618 specific call in the "Sending Data" category (Appendix A.1.2.1). 1620 o Notification of send failures 1621 Protocols: SCTP, UDP(-Lite) 1622 Functional because this notifies that potentially assumed reliable 1623 data delivery is no longer provided. 1624 ADDED. This differs from the two automatable transport features 1625 below in that it does not distinguish between unsent and 1626 unacknowledged messages. 1627 Implementation: via SENDFAILURE-EVENT.SCTP and 1628 SEND_FAILURE.UDP(-Lite). 1629 Implementation over TCP: do nothing (this notification is not 1630 available and will therefore not occur with TCP). 1632 o Notification of an unsent (part of a) message 1633 Protocols: SCTP, UDP(-Lite) 1634 Automatable because the distinction between unsent and 1635 unacknowledged is network-specific. 1637 o Notification of an unacknowledged (part of a) message 1638 Protocols: SCTP 1639 Automatable because the distinction between unsent and 1640 unacknowledged is network-specific.
1642 o Notification that the stack has no more user data to send 1643 Protocols: SCTP 1644 Optimizing because reacting to this notification requires the 1645 application to be involved, and ensuring that the stack does not 1646 run dry of data (for too long) can improve performance. 1647 Implementation over TCP: do nothing (see the discussion in 1648 Appendix A.3.4). 1649 Implementation over UDP: do nothing (this notification is not 1650 available and will therefore not occur with UDP). 1652 o Notification to a receiver that a partial message delivery has 1653 been aborted 1654 Protocols: SCTP 1655 Functional because this is closely tied to properties of the data 1656 that an application sends or expects to receive. 1657 Implementation over TCP: do nothing (this notification is not 1658 available and will therefore not occur with TCP). 1659 Implementation over UDP: do nothing (this notification is not 1660 available and will therefore not occur with UDP). 1662 A.2. Step 2: Reduction -- The Reduced Set of Transport Features 1664 By hiding automatable transport features from the application, a 1665 transport system can gain opportunities to automate the usage of 1666 network-related functionality. This can facilitate using the 1667 transport system for the application programmer and it allows for 1668 optimizations that may not be possible for an application. For 1669 instance, system-wide configurations regarding the usage of multiple 1670 interfaces can better be exploited if the choice of the interface is 1671 not entirely up to the application. Therefore, since they are not 1672 strictly necessary to expose in a transport system, we do not include 1673 automatable transport features in the reduced set of transport 1674 features. This leaves us with only the transport features that are 1675 either optimizing or functional. 1677 A transport system should be able to communicate via TCP or UDP if 1678 alternative transport protocols are found not to work. 
For many 1679 transport features, this is possible -- often by simply not doing 1680 anything when a specific request is made. For some transport 1681 features, however, it was identified that direct usage of neither TCP 1682 nor UDP is possible: in these cases, even not doing anything would 1683 incur semantically incorrect behavior. Whenever an application would 1684 make use of one of these transport features, this would eliminate the 1685 possibility to use TCP or UDP. Thus, we only keep the functional and 1686 optimizing transport features for which an implementation over either 1687 TCP or UDP is possible in our reduced set. 1689 The "minimal set" derived in this document is meant to be 1690 implementable "one-sided" over TCP, and, with limitations, UDP. In 1691 the following list, we therefore precede a transport feature with 1692 "T:" if an implementation over TCP is possible, "U:" if an 1693 implementation over UDP is possible, and "T,U:" if an implementation 1694 over either TCP or UDP is possible. 1696 A.2.1.
CONNECTION Related Transport Features 1698 ESTABLISHMENT: 1700 o T,U: Connect 1701 o T,U: Specify number of attempts and/or timeout for the first 1702 establishment message 1703 o T: Configure authentication 1704 o T: Hand over a message to reliably transfer (possibly multiple 1705 times) before connection establishment 1706 o T: Hand over a message to reliably transfer during connection 1707 establishment 1709 AVAILABILITY: 1711 o T,U: Listen 1712 o T: Configure authentication 1714 MAINTENANCE: 1716 o T: Change timeout for aborting connection (using retransmit limit 1717 or time value) 1718 o T: Suggest timeout to the peer 1719 o T,U: Disable Nagle algorithm 1720 o T,U: Notification of Excessive Retransmissions (early warning 1721 below abortion threshold) 1722 o T,U: Specify DSCP field 1723 o T,U: Notification of ICMP error message arrival 1724 o T: Change authentication parameters 1725 o T: Obtain authentication information 1726 o T,U: Set Cookie life value 1727 o T,U: Choose a scheduler to operate between streams of an 1728 association 1729 o T,U: Configure priority or weight for a scheduler 1730 o T,U: Disable checksum when sending 1731 o T,U: Disable checksum requirement when receiving 1732 o T,U: Specify checksum coverage used by the sender 1733 o T,U: Specify minimum checksum coverage required by receiver 1734 o T,U: Specify DF field 1735 o T,U: Get max. transport-message size that may be sent using a non- 1736 fragmented IP packet from the configured interface 1737 o T,U: Get max. 
transport-message size that may be received from the 1738 configured interface 1739 o T,U: Obtain ECN field 1740 o T,U: Enable and configure a "Low Extra Delay Background Transfer" 1742 TERMINATION: 1744 o T: Close after reliably delivering all remaining data, causing an 1745 event informing the application on the other side 1746 o T: Abort without delivering remaining data, causing an event 1747 informing the application on the other side 1748 o T,U: Abort without delivering remaining data, not causing an event 1749 informing the application on the other side 1750 o T,U: Timeout event when data could not be delivered for too long 1752 A.2.2. DATA Transfer Related Transport Features 1754 A.2.2.1. Sending Data 1756 o T: Reliably transfer data, with congestion control 1757 o T: Reliably transfer a message, with congestion control 1758 o T,U: Unreliably transfer a message 1759 o T: Configurable Message Reliability 1760 o T: Ordered message delivery (potentially slower than unordered) 1761 o T,U: Unordered message delivery (potentially faster than ordered) 1762 o T,U: Request not to bundle messages 1763 o T: Specifying a key id to be used to authenticate a message 1764 o T,U: Request not to delay the acknowledgement (SACK) of a message 1766 A.2.2.2. Receiving Data 1768 o T,U: Receive data (with no message delimiting) 1769 o U: Receive a message 1770 o T,U: Information about partial message arrival 1772 A.2.2.3. Errors 1774 This section describes sending failures that are associated with a 1775 specific call in the "Sending Data" category (Appendix A.1.2.1). 1777 o T,U: Notification of send failures 1778 o T,U: Notification that the stack has no more user data to send 1779 o T,U: Notification to a receiver that a partial message delivery 1780 has been aborted 1782 A.3. Step 3: Discussion 1784 The reduced set in the previous section exhibits a number of 1785 peculiarities, which we will discuss in the following.
This section 1786 focuses on TCP because, with the exception of one particular 1787 transport feature ("Receive a message" -- we will discuss this in 1788 Appendix A.3.1), the list shows that UDP is strictly a subset of TCP. 1789 We can first try to understand how to build a transport system that 1790 can run over TCP, and then narrow down the result further to allow 1791 that the system can always run over either TCP or UDP (which 1792 effectively means removing everything related to reliability, 1793 ordering, authentication and closing/aborting with a notification to 1794 the peer). 1796 Note that, because the functional transport features of UDP are -- 1797 with the exception of "Receive a message" -- a subset of TCP, TCP can 1798 be used as a replacement for UDP whenever an application does not 1799 need message delimiting (e.g., because the application-layer protocol 1800 already does it). This has been recognized by many applications that 1801 already do this in practice, by trying to communicate with UDP at 1802 first, and falling back to TCP in case of a connection failure. 1804 A.3.1. Sending Messages, Receiving Bytes 1806 For implementing a transport system over TCP, there are several 1807 transport features related to sending, but only a single transport 1808 feature related to receiving: "Receive data (with no message 1809 delimiting)" (and, strangely, "information about partial message 1810 arrival"). Notably, the transport feature "Receive a message" is 1811 also the only non-automatable transport feature of UDP(-Lite) for 1812 which no implementation over TCP is possible. 1814 To support these TCP receiver semantics, we define an "Application- 1815 Framed Bytestream" (AFra-Bytestream). AFra-Bytestreams allow senders 1816 to operate on messages while minimizing changes to the TCP socket 1817 API. In particular, nothing changes on the receiver side - data can 1818 be accepted via a normal TCP socket. 
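As an illustration of these receiver semantics, the following sketch shows how an application-layer protocol might keep message boundaries recoverable when data arrives as a plain byte stream. It is a minimal sketch under stated assumptions: the 4-byte big-endian length prefix and the names frame and Deframer are chosen for this example only, and this document does not define a framing format.

```python
import struct
from typing import List

# Assumption: each message is preceded by a 4-byte big-endian length field.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Sender side: prefix one message with its length."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Receiver side: rebuild messages from arbitrarily sized stream chunks."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Add bytes read from the stream; return all messages now complete."""
        self._buf += data
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # message body not fully received yet
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return messages
```

The receiver keeps reading from an ordinary TCP socket and passes whatever it gets to feed(). Chunk boundaries chosen by TCP do not matter, because the length prefix alone determines where each message ends.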
1820 In an AFra-Bytestream, the sending application can optionally inform 1821 the transport about message boundaries and required properties per 1822 message (configurable order and reliability, or embedding a request 1823 not to delay the acknowledgement of a message). Whenever the sending 1824 application specifies per-message properties that relax the notion of 1825 reliable in-order delivery of bytes, it must assume that the 1826 receiving application is 1) able to determine message boundaries, 1827 provided that messages are always kept intact, and 2) able to accept 1828 these relaxed per-message properties. Any signaling of such 1829 information to the peer is up to an application-layer protocol and 1830 considered out of scope of this document. 1832 For example, if an application requests to transfer fixed-size 1833 messages of 100 bytes with partial reliability, this needs the 1834 receiving application to be prepared to accept data in chunks of 100 1835 bytes. If, then, some of these 100-byte messages are missing (e.g., 1836 if SCTP with Configurable Reliability is used), this is the expected 1837 application behavior. With TCP, no messages would be missing, but 1838 this is also correct for the application, and the possible 1839 retransmission delay is acceptable within the best effort service 1840 model (see [RFC7305], Section 3.5). Still, the receiving application 1841 would separate the byte stream into 100-byte chunks. 1843 Note that this usage of messages does not require all messages to be 1844 equal in size. Many application protocols use some form of Type- 1845 Length-Value (TLV) encoding, e.g. by defining a header including 1846 length fields; another alternative is the use of byte stuffing 1847 methods such as COBS [COBS]. If an application needs message 1848 numbers, e.g. 
to restore the correct sequence of messages, these must 1849 also be encoded by the application itself, as the sequence number 1850 related transport features of SCTP are not provided by the "minimum 1851 set" (in the interest of enabling usage of TCP). 1853 A.3.2. Stream Schedulers Without Streams 1855 We have already stated that multi-streaming does not require 1856 application-specific knowledge. Potential benefits or disadvantages 1857 of, e.g., using two streams of an SCTP association versus using two 1858 separate SCTP associations or TCP connections are related to 1859 knowledge about the network and the particular transport protocol in 1860 use, not the application. However, the transport features "Choose a 1861 scheduler to operate between streams of an association" and 1862 "Configure priority or weight for a scheduler" operate on streams. 1863 Here, streams identify communication channels between which a 1864 scheduler operates, and they can be assigned a priority. Moreover, 1865 the transport features in the MAINTENANCE category all operate on 1866 associations in case of SCTP, i.e. they apply to all streams in that 1867 association. 1869 With only these semantics necessary to represent, the interface to a 1870 transport system becomes easier if we assume that connections may be 1871 a transport protocol's connection or association, but could also be a 1872 stream of an existing SCTP association, for example. We only need to 1873 allow for a way to define a possible grouping of connections. Then, 1874 all MAINTENANCE transport features can be said to operate on 1875 connection groups, not connections, and a scheduler operates on the 1876 connections within a group. 1878 To be compatible with multiple transport protocols and uniformly 1879 allow access to both transport connections and streams of a multi- 1880 streaming protocol, the semantics of opening and closing need to be 1881 the most restrictive subset of all of the underlying options.
For 1882 example, TCP's support of half-closed connections can be seen as a 1883 feature on top of the more restrictive "ABORT"; this feature cannot 1884 be supported because not all protocols used by a transport system 1885 (including streams of an association) support half-closed 1886 connections. 1888 A.3.3. Early Data Transmission 1890 There are two transport features related to transferring a message 1891 early: "Hand over a message to reliably transfer (possibly multiple 1892 times) before connection establishment", which relates to TCP Fast 1893 Open [RFC7413], and "Hand over a message to reliably transfer during 1894 connection establishment", which relates to SCTP's ability to 1895 transfer data together with the COOKIE-Echo chunk. Even without TCP 1896 Fast Open, TCP can transfer data during the handshake, together with 1897 the SYN packet -- however, the receiver of this data may not hand it 1898 over to the application until the handshake has completed. Also, 1899 different from TCP Fast Open, this data is not delimited as a message 1900 by TCP (thus, not visible as a "message"). This functionality is 1901 commonly available in TCP and supported in several implementations, 1902 even though the TCP specification does not explain how to provide it 1903 to applications. 1905 A transport system could differentiate between the cases of 1906 transmitting data "before" (possibly multiple times) or "during" the 1907 handshake. Alternatively, it could also assume that data that are 1908 handed over early will be transmitted as early as possible, and 1909 "before" the handshake would only be used for messages that are 1910 explicitly marked as "idempotent" (i.e., it would be acceptable to 1911 transfer them multiple times). 1913 The amount of data that can successfully be transmitted before or 1914 during the handshake depends on various factors: the transport 1915 protocol, the use of header options, the choice of IPv4 and IPv6 and 1916 the Path MTU.
A transport system should therefore allow a sending 1917 application to query the maximum amount of data it can possibly 1918 transmit before (or, if exposed, during) connection establishment. 1920 A.3.4. Sender Running Dry 1922 The transport feature "Notification that the stack has no more user 1923 data to send" relates to SCTP's "SENDER DRY" notification. Such 1924 notifications can, in principle, be used to avoid having an 1925 unnecessarily large send buffer, yet ensure that the transport sender 1926 always has data available when it has an opportunity to transmit it. 1927 This has been found to be very beneficial for some applications 1928 [WWDC2015]. However, "SENDER DRY" truly means that the entire send 1929 buffer (including both unsent and unacknowledged data) has emptied -- 1930 i.e., when it notifies the sender, it is already too late: the 1931 transport protocol already missed an opportunity to send data. Some 1932 modern TCP implementations now include the unspecified 1933 "TCP_NOTSENT_LOWAT" socket option that was proposed in [WWDC2015], 1934 which limits the amount of unsent data that TCP can keep in the 1935 socket buffer; this allows specifying the buffer filling level at 1936 which the socket becomes writable, rather than waiting for the buffer to 1937 run empty. 1939 SCTP also allows configuring the sender-side buffer: the automatable 1940 Transport Feature "Configure send buffer size" provides this 1941 functionality, but only for the complete buffer, which includes both 1942 unsent and unacknowledged data. SCTP does not allow controlling these 1943 two sizes separately. It therefore makes sense for a transport 1944 system to allow for uniform access to "TCP_NOTSENT_LOWAT" as well as 1945 the "SENDER DRY" notification. 1947 A.3.5.
Capacity Profile 1949 The transport features: 1951 o Disable Nagle algorithm 1952 o Enable and configure a "Low Extra Delay Background Transfer" 1953 o Specify DSCP field 1955 all relate to a QoS-like application need such as "low latency" or 1956 "scavenger". In the interest of flexibility of a transport system, 1957 they could therefore be offered in a uniform, more abstract way, 1958 where a transport system could e.g. decide by itself how to use 1959 combinations of LEDBAT-like congestion control and certain DSCP 1960 values, and an application would only specify a general "capacity 1961 profile" (a description of how it wants to use the available 1962 capacity). A need for "lowest possible latency at the expense of 1963 overhead" could then translate into automatically disabling the Nagle 1964 algorithm. 1966 In some cases, the Nagle algorithm is best controlled directly by the 1967 application because it is not only related to a general profile but 1968 also to knowledge about the size of future messages. For fine-grain 1969 control over Nagle-like functionality, the "Request not to bundle 1970 messages" transport feature is available. 1972 A.3.6. Security 1974 Both TCP and SCTP offer authentication. TCP authenticates complete 1975 segments. SCTP allows configuring which of its chunk types must 1976 always be authenticated -- if this is exposed as such, it creates an 1977 undesirable dependency on the transport protocol. For compatibility 1978 with TCP, a transport system should only allow configuring 1979 authentication of complete transport layer packets, including headers, IP pseudo-header (if any) 1980 and payload. 1982 Security is discussed in a separate document 1983 [I-D.ietf-taps-transport-security].
The minimal set presented in the 1984 present document excludes all security related transport features: 1985 "Configure authentication", "Change authentication parameters", 1986 "Obtain authentication information" and "Set Cookie life value" 1987 as well as "Specifying a key id to be used to authenticate a 1988 message". 1990 A.3.7. Packet Size 1992 UDP(-Lite) has a transport feature called "Specify DF field". This 1993 yields an error message in case of sending a message that exceeds the 1994 Path MTU, which is necessary for a UDP-based application to be able 1995 to implement Path MTU Discovery (a function that UDP-based 1996 applications must do by themselves). The "Get max. transport-message 1997 size that may be sent using a non-fragmented IP packet from the 1998 configured interface" transport feature yields an upper limit for the 1999 Path MTU (minus headers) and can therefore help to implement Path MTU 2000 Discovery more efficiently. 2002 Appendix B. Revision information 2004 XXX RFC-Ed please remove this section prior to publication. 2006 -02: implementation suggestions added, discussion section added, 2007 terminology extended, DELETED category removed, various other fixes; 2008 list of Transport Features adjusted to -01 version of [RFC8303] 2009 except that MPTCP is not included. 2011 -03: updated to be consistent with -02 version of [RFC8303]. 2013 -04: updated to be consistent with -03 version of [RFC8303]. 2014 Reorganized document, rewrote intro and conclusion, and made a first 2015 stab at creating a real "minimal set". 2017 -05: updated to be consistent with -05 version of [RFC8303] (minor 2018 changes). Fixed a mistake regarding Cookie Life value. Exclusion of 2019 security related transport features (to be covered in a separate 2020 document). Reorganized the document (now begins with the minset, 2021 derivation is in the appendix). First stab at an abstract API for 2022 the minset.

   draft-ietf-taps-minset-00: updated to be consistent with -08 version
   of [RFC8303] ("obtain message delivery number" was removed, as this
   has also been removed in [RFC8303] because it was a mistake in
   RFC4960.  This led to the removal of two more transport features
   that were only designated as functional because they affected
   "obtain message delivery number").  Fall-back to UDP incorporated
   (this was requested at IETF-99); this also affected the transport
   feature "Choice between unordered (potentially faster) or ordered
   delivery of messages" because this is a boolean which is always true
   for one fall-back protocol, and always false for the other one.
   This was therefore divided into two features, one for ordered, one
   for unordered delivery.  The word "reliably" was added to the
   transport features "Hand over a message to reliably transfer
   (possibly multiple times) before connection establishment" and "Hand
   over a message to reliably transfer during connection establishment"
   to make it clearer why this is not supported by UDP.  Clarified that
   the "minset abstract interface" is not proposing a specific API for
   all TAPS systems to implement, but it is just a way to describe the
   minimum set.  Author order changed.

   WG -01: "fall-back to" (TCP or UDP) replaced (mostly with
   "implementation over").  References to post-sockets removed (these
   were statements that assumed that post-sockets requires two-sided
   implementation).  Replaced "flow" with "TAPS Connection" and "frame"
   with "message" to avoid introducing new terminology.  Made sections
   3 and 4 in line with the categorization that is already used in the
   appendix and [RFC8303], and changed style of section 4 to be even
   shorter and less interface-like.  Updated reference
   draft-ietf-tsvwg-sctp-ndata to RFC8260.

   WG -02: rephrased "the TAPS system" and "TAPS connection" etc.
   to more generally talk about transport after the intro (mostly
   replacing "TAPS system" with "transport system" and "TAPS
   connection" with "connection").  Merged sections 3 and 4 to form a
   new section 3.

   WG -03: updated sentence referencing
   [I-D.ietf-taps-transport-security] to say that "the minimum security
   requirements for a taps system are discussed in a separate security
   document", wrote "example" in the paragraph introducing the decision
   tree.  Removed reference draft-grinnemo-taps-he-03 and the sentence
   that referred to it.

   WG -04: addressed comments from Theresa Enghardt and Tommy Pauly.
   As part of that, removed "TAPS" as a term everywhere (abstract,
   intro, ...).

   WG -05: addressed comments from Spencer Dawkins.

   WG -06: Fixed nits.

Authors' Addresses

   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo  N-0316
   Norway

   Phone: +47 22 85 24 20
   Email: michawe@ifi.uio.no

   Stein Gjessing
   University of Oslo
   PO Box 1080 Blindern
   Oslo  N-0316
   Norway

   Phone: +47 22 85 24 44
   Email: steing@ifi.uio.no