TAPS                                                            M. Welzl
Internet-Draft                                               S. Gjessing
Intended status: Informational                        University of Oslo
Expires: September 1, 2018                             February 28, 2018

          A Minimal Set of Transport Services for TAPS Systems
                       draft-ietf-taps-minset-02

Abstract

   This draft recommends a minimal set of IETF Transport Services
   offered by end systems supporting TAPS, and gives guidance on
   choosing among the available mechanisms and protocols.
   It is based on the set of transport features in RFC 8303.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 1, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Minimal Set of Transport Features
     3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION
     3.2.  MAINTENANCE
       3.2.1.  Connection groups
       3.2.2.  Individual connections
     3.3.  DATA Transfer
       3.3.1.  Sending Data
       3.3.2.  Receiving Data
   4.  Conclusion
   5.  Acknowledgements
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Deriving the minimal set
     A.1.  Step 1: Categorization -- The Superset of Transport Features
       A.1.1.  CONNECTION Related Transport Features
       A.1.2.  DATA Transfer Related Transport Features
     A.2.  Step 2: Reduction -- The Reduced Set of Transport Features
       A.2.1.  CONNECTION Related Transport Features
       A.2.2.  DATA Transfer Related Transport Features
     A.3.  Step 3: Discussion
       A.3.1.  Sending Messages, Receiving Bytes
       A.3.2.  Stream Schedulers Without Streams
       A.3.3.  Early Data Transmission
       A.3.4.  Sender Running Dry
       A.3.5.  Capacity Profile
       A.3.6.  Security
       A.3.7.  Packet Size
   Appendix B.  Revision information
   Authors' Addresses

1.  Introduction

   The task of any system that implements TAPS is to offer transport
   services to its applications, i.e. the applications running on top of
   the transport system, without binding them to a particular transport
   protocol. Currently, the set of transport services that most
   applications use is based on TCP and UDP (and protocols that are
   layered on top of them); this limits the ability of the network
   stack to make use of features of other transport protocols. For
   example, if a protocol supports out-of-order message delivery but
   applications always assume that the network provides an ordered
   bytestream, then the network stack cannot immediately deliver a
   message that arrives out-of-order: doing so would break a fundamental
   assumption of the application. The net result is unnecessary
   head-of-line blocking delay.

   By exposing the transport services of multiple transport protocols, a
   TAPS transport system can make it possible to use these services
   without having to statically bind an application to a specific
   transport protocol. The first step towards the design of such a
   system was taken by [RFC8095], which surveys a large number of
   transports, and [RFC8303] as well as [RFC8304], which identify the
   specific transport features that are exposed to applications by the
   protocols TCP, MPTCP, UDP(-Lite) and SCTP as well as the LEDBAT
   congestion control mechanism. This memo is based on these documents
   and follows the same terminology (also listed below). Because the
   considered transport protocols conjointly cover a wide range of
   transport features, there is reason to hope that the resulting set
   (and the reasoning that led to it) will also apply to many aspects of
   other transport protocols.

   The number of transport features of current IETF transports is large,
   and exposing all of them has a number of disadvantages: generally,
   the more functionality is exposed, the less freedom a transport
   system has to automate usage of the various functions of its
   available set of transport protocols. Some functions only exist in
   one particular protocol, and if an application were to use them, this
   would statically tie the application to this protocol, counteracting
   the purpose of TAPS. Also, if the number of exposed features is
   exceedingly large, a transport system might become very difficult to
   use for an application programmer. Taking [RFC8303] as a basis, this
   document therefore develops a minimal set of transport features,
   removing the ones that could be harmful to the purpose of TAPS but
   keeping the ones that must be retained for applications to benefit
   from useful transport functionality.

   Applications use a wide variety of APIs today. The transport
   features in the minimal set in this document must be reflected in
   *all* network APIs in order for the underlying functionality to
   become usable everywhere. For example, it does not help an
   application that talks to a middleware if only the Berkeley Sockets
   API is extended to offer "unordered message delivery", but the
   middleware only offers an ordered bytestream. Both the Berkeley
   Sockets API and the middleware would have to expose the "unordered
   message delivery" transport feature (alternatively, there may be ways
   for certain types of middleware to use this transport feature without
   exposing it, based on knowledge about the applications -- but this is
   not the general case). In most situations, in the interest of being
   as flexible and efficient as possible, the best choice will be for a
   middleware or library to expose at least all of the transport
   features that are recommended as a "minimal set" here.

   This "minimal set" can be implemented one-sided over TCP (or UDP, if
   certain limitations are put in place). This means that a sender-side
   TAPS system implementing it can talk to a non-TAPS TCP (or UDP)
   receiver, and a receiver-side TAPS system implementing it can talk to
   a non-TAPS TCP (or UDP) sender.

2.  Terminology

   The following terms are used throughout this document, and in
   subsequent documents produced by TAPS that describe the composition
   and decomposition of transport services.

   Transport Feature: a specific end-to-end feature that the transport
      layer provides to an application. Examples include
      confidentiality, reliable delivery, ordered delivery,
      message-versus-stream orientation, etc.
   Transport Service: a set of Transport Features, without an
      association to any given framing protocol, which provides a
      complete service to an application.
   Transport Protocol: an implementation that provides one or more
      different transport services using a specific framing and header
      format on the wire.
   Transport Service Instance: an arrangement of transport protocols
      with a selected set of features and configuration parameters that
      implements a single transport service, e.g., a protocol stack (RTP
      over UDP).
   Application: an entity that uses the transport layer for end-to-end
      delivery of data across the network (this may also be an upper
      layer protocol or tunnel encapsulation).
   Application-specific knowledge: knowledge that only applications
      have.
   Endpoint: an entity that communicates with one or more other
      endpoints using a transport protocol.
   Connection: shared state of two or more endpoints that persists
      across messages that are transmitted between these endpoints.
   Socket: the combination of a destination IP address and a
      destination port number.

   Moreover, throughout the document, the protocol name "UDP(-Lite)" is
   used when discussing transport features that are equivalent for UDP
   and UDP-Lite; similarly, the protocol name "TCP" refers to both TCP
   and MPTCP.

3.  The Minimal Set of Transport Features

   Based on the categorization, reduction and discussion in Appendix A,
   this section describes the minimal set of transport features that is
   offered by end systems supporting TAPS. The described transport
   system can be implemented over TCP; elements of the system that may
   prohibit implementation over UDP are marked with "!UDP". To
   implement a transport system that can also work over UDP, these
   marked transport features should be excluded.

   As in Appendix A, Appendix A.2 and [RFC8303], we categorize the
   minimal set of transport features as 1) CONNECTION related
   (ESTABLISHMENT, AVAILABILITY, MAINTENANCE, TERMINATION) and 2) DATA
   Transfer related (Sending Data, Receiving Data, Errors). Here, the
   focus is on connections that the transport system offers, as opposed
   to connections of transport protocols that the transport system uses.

3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION

   A connection must first be "created" to allow for some initial
   configuration to be carried out before the transport system can
   actively or passively establish communication with a remote endpoint.
   All configuration parameters in Section 3.2 can be used initially,
   although some of them may only take effect when a connection has been
   established with a chosen transport protocol. Configuring a
   connection early helps a transport system make the right decisions.
   For example, grouping information can influence the transport system
   to implement a connection as a stream of a multi-streaming protocol's
   existing association or not.

   For ungrouped connections, early configuration is necessary because
   it allows the transport system to know which protocols it should try
   to use (to steer a mechanism such as "Happy Eyeballs"
   [I-D.grinnemo-taps-he]). In particular, a transport system that only
   makes a one-time choice for a particular protocol must know early
   about strict requirements that must be kept, or it can end up in a
   deadlock situation (e.g., having chosen UDP and later being asked to
   support reliable transfer). As one way to handle these cases
   correctly, we provide the following decision tree (this is derived
   from Appendix A.2.1 excluding authentication, as explained in
   Section 7):

   - Will it ever be necessary to offer any of the following?
     * Reliably transfer data
     * Notify the peer of closing/aborting
     * Preserve data ordering

     Yes: SCTP or TCP can be used.
       - Is any of the following useful to the application?
         * Choosing a scheduler to operate between connections
           in a group, with the possibility to configure a priority
           or weight per connection
         * Configurable message reliability
         * Unordered message delivery
         * Request not to delay the acknowledgement (SACK) of a message

         Yes: SCTP is preferred.
         No:
           - Is any of the following useful to the application?
             * Hand over a message to reliably transfer (possibly
               multiple times) before connection establishment
             * Suggest timeout to the peer
             * Notification of Excessive Retransmissions (early
               warning below abortion threshold)
             * Notification of ICMP error message arrival

             Yes: TCP is preferred.
             No: SCTP and TCP are equally preferable.

     No: all protocols can be used.
       - Is any of the following useful to the application?
         * Specify checksum coverage used by the sender
         * Specify minimum checksum coverage required by receiver

         Yes: UDP-Lite is preferred.
         No: UDP is preferred.
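
   The decision tree above can be written down as a small function.
   The following Python fragment is only an illustrative sketch and is
   not part of the draft: the names SelectionHints and select_protocols
   are hypothetical, each boolean stands for one question group of the
   tree, and the function merely names protocols without trying to open
   a connection.

```python
from dataclasses import dataclass
from typing import Tuple

ALL_PROTOCOLS = ("SCTP", "TCP", "UDP", "UDP-Lite")


@dataclass(frozen=True)
class SelectionHints:
    """Hypothetical container: one boolean per question group of the tree."""
    reliability: bool = False                     # reliable transfer, close/abort notification, ordering
    config_msg_prio: bool = False                 # scheduler/priority, message reliability, unordered, no delayed SACK
    earlymsg_timeout_notifications: bool = False  # early message, timeout suggestion, retransmission/ICMP notifications
    checksum_coverage: bool = False               # sender or receiver checksum coverage


def select_protocols(h: SelectionHints) -> Tuple[Tuple[str, ...], Tuple[str, ...]]:
    """Return (preferred, usable) protocol names following the decision tree."""
    if h.reliability:
        usable = ("SCTP", "TCP")
        if h.config_msg_prio:
            preferred = ("SCTP",)
        elif h.earlymsg_timeout_notifications:
            preferred = ("TCP",)
        else:
            preferred = ("SCTP", "TCP")  # equally preferable
        return preferred, usable
    # No reliability-related requirement: all protocols can be used.
    preferred = ("UDP-Lite",) if h.checksum_coverage else ("UDP",)
    return preferred, ALL_PROTOCOLS


if __name__ == "__main__":
    hints = SelectionHints(reliability=True, earlymsg_timeout_notifications=True)
    print(select_protocols(hints))
```

   Returning both the preferred and the usable protocols is a design
   choice of this sketch; it would let a racing mechanism such as Happy
   Eyeballs fall back to another usable protocol when the preferred one
   is unavailable.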

   Note that this decision tree is not optimal for all cases. For
   example, if an application wants to use "Specify checksum coverage
   used by the sender", which is only offered by UDP-Lite, and
   "Configure priority or weight for a scheduler", which is only offered
   by SCTP, the above decision tree will always choose UDP-Lite, making
   it impossible to use SCTP's schedulers with priorities between
   grouped connections. The transport system must know which choice is
   more important for the application in order to make the best
   decision. We caution implementers to be aware of the full set of
   trade-offs, for which we recommend consulting the list in
   Appendix A.2.1 when deciding how to initialize a connection.

   To summarize, the following parameters serve as input for the
   transport system to help it choose and configure a suitable protocol:

   o  Reliability: a boolean that should be set to true when any of the
      following will be useful to the application: reliably transfer
      data; notify the peer of closing/aborting; preserve data ordering.
   o  Checksum_coverage: a boolean to specify whether it will be useful
      to the application to specify checksum coverage when sending or
      receiving.
   o  Config_msg_prio: a boolean that should be set to true when any of
      the following per-message configuration or prioritization
      mechanisms will be useful to the application: choosing a scheduler
      to operate between grouped connections, with the possibility to
      configure a priority or weight per connection; configurable
      message reliability; unordered message delivery; requesting not to
      delay the acknowledgement (SACK) of a message.
   o  Earlymsg_timeout_notifications: a boolean that should be set to
      true when any of the following will be useful to the application:
      hand over a message to reliably transfer (possibly multiple times)
      before connection establishment; suggest timeout to the peer;
      notification of excessive retransmissions (early warning below
      abortion threshold); notification of ICMP error message arrival.

   Once a connection is created, it can be queried for the maximum
   amount of data that an application can possibly expect to have
   reliably transmitted before or during transport connection
   establishment (with zero being a possible answer) (see
   Section 3.2.1). An application can also give the connection a
   message for reliable transmission before or during connection
   establishment (!UDP); the transport system will then try to transmit
   it as early as possible. An application can facilitate sending a
   message particularly early by marking it as "idempotent" (see
   Section 3.3.1); in this case, the receiving application must be
   prepared to potentially receive multiple copies of the message
   (because idempotent messages are reliably transferred, asking for
   idempotence is not necessary for systems that support UDP).

   After creation, a transport system can actively establish
   communication with a peer, or it can passively listen for incoming
   connection requests. Note that active establishment may or may not
   trigger a notification on the listening side. It is possible that
   the first notification on the listening side is the arrival of the
   first data that the active side sends (a receiver-side transport
   system could handle this by continuing to block a "Listen" call,
   immediately followed by issuing "Receive", for example;
   callback-based implementations could simply skip the equivalent of
   "Listen").
   This also means that the active opening side is assumed to be the
   first side sending data.

   A transport system can actively close a connection, i.e. terminate it
   after reliably delivering all remaining data to the peer (if reliable
   data delivery was requested earlier (!UDP)), in which case the peer
   is notified that the connection is closed. Alternatively, a
   connection can be aborted without delivering outstanding data to the
   peer. In case reliable or partially reliable data delivery was
   requested earlier (!UDP), the peer is notified that the connection is
   aborted. A timeout can be configured to abort a connection when data
   could not be delivered for too long (!UDP); however, timeout-based
   abortion does not notify the peer application that the connection has
   been aborted. Because half-closed connections are not supported,
   when a host implementing TAPS receives a notification that the peer
   is closing or aborting the connection (!UDP), its peer may not be
   able to read outstanding data. This means that unacknowledged data
   residing in a transport system's send buffer may have to be dropped
   from that buffer upon arrival of a "close" or "abort" notification
   from the peer.

3.2.  MAINTENANCE

   A transport system must offer means to group connections, but it
   cannot guarantee truly grouping them using the transport protocols
   that it uses (e.g., it cannot be guaranteed that connections become
   multiplexed as streams on a single SCTP association when SCTP may not
   be available). The transport system must therefore ensure that
   group- versus non-group-configurations are handled correctly in some
   way (e.g., by applying the configuration to all grouped connections
   even when they are not multiplexed, or informing the application
   about grouping success or failure).
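
   The two options named in the paragraph above (propagating group
   configuration to every member, and reporting whether grouping really
   succeeded) can be illustrated with a short Python sketch. All names
   here (Connection, ConnectionGroup, multiplexed, timeout_s) are
   hypothetical and chosen for illustration only; a real transport
   system would hold protocol state rather than a plain dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Connection:
    name: str
    config: Dict[str, object] = field(default_factory=dict)


@dataclass
class ConnectionGroup:
    # True only if the underlying protocol really multiplexes the members,
    # e.g. as streams of one association (assumed to be set by the system).
    multiplexed: bool = False
    config: Dict[str, object] = field(default_factory=dict)
    members: List[Connection] = field(default_factory=list)

    def add(self, conn: Connection) -> bool:
        """Add a connection; it inherits the group configuration.

        The return value informs the caller about grouping success.
        """
        conn.config.update(self.config)
        self.members.append(conn)
        return self.multiplexed

    def configure(self, **params: object) -> None:
        """Apply a setting to all members, multiplexed or not."""
        self.config.update(params)
        for conn in self.members:
            conn.config.update(params)


group = ConnectionGroup(multiplexed=False)
first = Connection("first")
truly_grouped = group.add(first)
group.configure(timeout_s=30)
second = Connection("second")
group.add(second)  # a later member still receives the earlier setting
```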

   As a general rule, any configuration described below should be
   carried out as early as possible to aid the transport system's
   decision making.

3.2.1.  Connection groups

   The following transport features and notifications (some directly
   from Appendix A.2, some new or changed, based on the discussion in
   Appendix A.3) automatically apply to all grouped connections:

   (!UDP) Configure a timeout: this can be done with the following
   parameters:

   o  A timeout value for aborting connections, in seconds
   o  A timeout value to be suggested to the peer (if possible), in
      seconds
   o  The number of retransmissions after which the application should
      be notified of "Excessive Retransmissions"

   Configure urgency: this can be done with the following parameters:

   o  A number to identify the type of scheduler that should be used to
      operate between connections in the group (no guarantees given).
      Schedulers are defined in [RFC8260].
   o  A "capacity profile" number to identify how an application wants
      to use its available capacity. Choices can be "lowest possible
      latency at the expense of overhead" (which would disable any
      Nagle-like algorithm), "scavenger", or values that help determine
      the DSCP value for a connection (e.g. similar to table 1 in
      [I-D.ietf-tsvwg-rtcweb-qos]).
   o  A buffer limit (in bytes); when the sender has less than the
      provided limit of bytes in the buffer, the application may be
      notified. Notifications are not guaranteed, and it is optional
      for a transport system to support buffer limit values greater than
      0. Note that this limit and its notification should operate
      across the buffers of the whole transport system, i.e. also any
      potential buffers that the transport system itself may use on top
      of the transport's send buffer.
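
   As an illustration of the parameters listed above, the following
   Python sketch collects them in one record and evaluates the buffer
   limit. It is a sketch under stated assumptions, not an API defined
   by this document: GroupConfig, its field names and drain_due are
   hypothetical, and treating a completely empty buffer as a reason to
   notify follows the draft's "Drain" notification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroupConfig:
    """Hypothetical record of the per-group timeout and urgency parameters."""
    abort_timeout_s: Optional[float] = None        # timeout for aborting connections
    peer_timeout_s: Optional[float] = None         # timeout suggested to the peer
    excessive_retransmissions: Optional[int] = None  # notification threshold
    scheduler_id: Optional[int] = None             # scheduler type, see RFC 8260
    capacity_profile: Optional[int] = None         # e.g. low latency, scavenger
    buffer_limit_bytes: int = 0                    # support for values > 0 is optional


def drain_due(cfg: GroupConfig, transport_buffer: int, system_buffers: int = 0) -> bool:
    """Decide whether a drain notification may be issued.

    The limit is checked against the buffers of the whole transport
    system, not only against the transport protocol's send buffer.
    """
    total = transport_buffer + system_buffers
    return total == 0 or total < cfg.buffer_limit_bytes


cfg = GroupConfig(abort_timeout_s=30.0, buffer_limit_bytes=1000)
print(drain_due(cfg, transport_buffer=400, system_buffers=500))
```

   With the default limit of 0, the function reports a drain only when
   all buffers are empty, which is the closest this sketch comes to a
   "sender dry" condition.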

   Following Appendix A.3.7, these properties can be queried:

   o  The maximum message size that may be sent without fragmentation
      via the configured interface. This is optional for a transport
      system to offer, and may return an error ("not available"). It
      can aid applications implementing Path MTU Discovery.
   o  The maximum transport message size that can be sent, in bytes.
      Irrespective of fragmentation, there is a size limit for the
      messages that can be handed over to SCTP or UDP(-Lite); because
      the service provided by a transport system is independent of the
      transport protocol, it must allow an application to query this
      value -- the maximum size of a message in an
      Application-Framed-Bytestream (see Appendix A.3.1). This may also
      return an error when data is not delimited ("not available").
   o  The maximum transport message size that can be received from the
      configured interface, in bytes (or "not available").
   o  The maximum amount of data that can possibly be sent before or
      during connection establishment, in bytes.

   In addition to the already mentioned closing / aborting notifications
   and possible send errors, the following notifications can occur:

   o  Excessive Retransmissions: the configured (or a default) number of
      retransmissions has been reached, yielding this early warning
      below an abortion threshold.
   o  ICMP Arrival (parameter: ICMP message): an ICMP packet carrying
      the conveyed ICMP message has arrived.
   o  ECN Arrival (parameter: ECN value): a packet carrying the conveyed
      ECN value has arrived. This can be useful for applications
      implementing congestion control.
   o  Timeout (parameter: s seconds): data could not be delivered for s
      seconds.
   o  Drain: the send buffer has either drained below the configured
      buffer limit or it has become completely empty. This is a generic
      notification that tries to enable uniform access to
      "TCP_NOTSENT_LOWAT" as well as the "SENDER DRY" notification (as
      discussed in Appendix A.3.4 -- SCTP's "SENDER DRY" is a special
      case where the threshold (for unsent data) is 0 and there is also
      no more unacknowledged data in the send buffer).

3.2.2.  Individual connections

   Configure priority or weight for a scheduler, as described in
   [RFC8260].

   Configure checksum usage: this can be done with the following
   parameters, but there is no guarantee that any checksum limitations
   will indeed be enforced (the default behavior is "full coverage,
   checksum enabled"):

   o  A boolean to enable / disable usage of a checksum when sending
   o  The desired coverage (in bytes) of the checksum used when sending
   o  A boolean to enable / disable requiring a checksum when receiving
   o  The required minimum coverage (in bytes) of the checksum when
      receiving

3.3.  DATA Transfer

3.3.1.  Sending Data

   When sending a message, no guarantees are given about the
   preservation of message boundaries to the peer; if message boundaries
   are needed, the receiving application at the peer must know about
   them beforehand (or the transport system cannot use TCP). Note that
   an application should already be able to hand over data before the
   transport system establishes a connection with a chosen transport
   protocol. Regarding the message that is being handed over, the
   following parameters can be used:

   o  Reliability: this parameter is used to convey a choice of: fully
      reliable (!UDP), unreliable without congestion control, unreliable
      (!UDP), partially reliable (see [RFC3758] and [RFC7496] for
      details on how to specify partial reliability) (!UDP). The latter
      two choices are optional for a transport system to offer and may
      result in full reliability.
      Note that applications sending unreliable data without congestion
      control should themselves perform congestion control in accordance
      with [RFC2914].
   o  (!UDP) Ordered: this boolean parameter lets an application choose
      between ordered message delivery (true) and possibly unordered,
      potentially faster message delivery (false).
   o  Bundle: a boolean that expresses a preference for allowing
      messages to be bundled (true) or not (false). No guarantees are
      given.
   o  DelAck: a boolean that, if false, lets an application request that
      the peer not delay the acknowledgement for this message.
   o  Fragment: a boolean that expresses a preference for allowing
      messages to be fragmented (true) or not (false), at the IP level.
      No guarantees are given.
   o  (!UDP) Idempotent: a boolean that expresses whether a message is
      idempotent (true) or not (false). Idempotent messages may arrive
      multiple times at the receiver (but they will arrive at least
      once). When data is idempotent it can be used by the receiver
      immediately on a connection establishment attempt. Thus, if data
      is handed over before the transport system establishes a
      connection with a chosen transport protocol, stating that a
      message is idempotent facilitates transmitting it to the peer
      application particularly early.

   An application can be notified of a failure to send a specific
   message. There is no guarantee of such notifications, i.e. send
   failures can also silently occur.

3.3.2.  Receiving Data

   A receiving application obtains an "Application-Framed Bytestream"
   (AFra-Bytestream; this concept is further described in
   Appendix A.3.1). In line with TCP's receiver semantics, an
   AFra-Bytestream is just a stream of bytes to the receiver. If message
   boundaries were specified by the sender, a receiver-side transport
   system implementing only the minimum set of transport services
   defined here will still not inform the receiving application about
   them (this limitation is only needed for transport systems that are
   implemented to directly use TCP).

   Different from TCP's semantics, if the sending application has
   allowed that messages are not fully reliably transferred, or
   delivered out of order, then such re-ordering or unreliability may be
   reflected per message in the arriving data. Messages will always
   stay intact - i.e. if an incomplete message is contained at the end
   of the arriving data block, this message is guaranteed to continue in
   the next arriving data block.

4.  Conclusion

   By decoupling applications from transport protocols, a TAPS transport
   system provides a different abstraction level than the Berkeley
   sockets interface. As with high- vs. low-level programming
   languages, a higher abstraction level allows more freedom for
   automation below the interface, yet it takes some control away from
   the application programmer. This is the design trade-off that a
   transport system developer is facing, and this document provides
   guidance on the design of this abstraction level. Some transport
   features are currently rarely offered by APIs, yet they must be
   offered or they can never be used ("functional" transport features).
   Other transport features are offered by the APIs of the protocols
   covered here, but not exposing them in a TAPS API would allow for
   more freedom to automate protocol usage in a transport system. The
   minimal set presented in this document is an effort to find a middle
   ground that can be recommended for transport systems to implement, on
   the basis of the transport features discussed in [RFC8303].

5.  Acknowledgements

   The authors would like to thank all the participants of the TAPS
   Working Group and the NEAT and MAMI research projects for valuable
   input to this document. We especially thank Michael Tuexen for help
   with connection establishment/teardown and Gorry Fairhurst for his
   suggestions regarding fragmentation and packet sizes. This work has
   received funding from the European Union's Horizon 2020 research and
   innovation programme under grant agreement No. 644334 (NEAT).

6.  IANA Considerations

   XX RFC ED - PLEASE REMOVE THIS SECTION XXX

   This memo includes no request to IANA.

7.  Security Considerations

   Authentication, confidentiality protection, and integrity protection
   are identified as transport features by [RFC8095]. As currently
   deployed in the Internet, these features are generally provided by a
   protocol or layer on top of the transport protocol; no current
   full-featured standards-track transport protocol provides all of
   these transport features on its own. Therefore, these transport
   features are not considered in this document, with the exception of
   native authentication capabilities of TCP and SCTP for which the
   security considerations in [RFC5925] and [RFC4895] apply. Security
   is discussed further in a separate TAPS document
   [I-D.pauly-taps-transport-security].

8.  References

8.1.  Normative References

   [RFC8303]  Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of
              Transport Features Provided by IETF Transport Protocols",
              RFC 8303, DOI 10.17487/RFC8303, February 2018.

8.2.  Informative References

   [COBS]     Cheshire, S. and M. Baker, "Consistent Overhead Byte
              Stuffing", September 1997.

   [I-D.grinnemo-taps-he]
              Grinnemo, K., Brunstrom, A., Hurtig, P., Khademi, N., and
              Z. Bozakov, "Happy Eyeballs for Transport Selection",
              draft-grinnemo-taps-he-03 (work in progress), July 2017.
592 [I-D.ietf-tsvwg-rtcweb-qos] 593 Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP 594 Packet Markings for WebRTC QoS", draft-ietf-tsvwg-rtcweb- 595 qos-18 (work in progress), August 2016. 597 [I-D.pauly-taps-transport-security] 598 Pauly, T., Rose, K., and C. Wood, "A Survey of Transport 599 Security Protocols", draft-pauly-taps-transport- 600 security-01 (work in progress), January 2018. 602 [LBE-draft] 603 Bless, R., "A Lower Effort Per-Hop Behavior (LE PHB)", 604 Internet-draft draft-tsvwg-le-phb-03, February 2018. 606 [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, 607 RFC 2914, DOI 10.17487/RFC2914, September 2000, 608 . 610 [RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P. 611 Conrad, "Stream Control Transmission Protocol (SCTP) 612 Partial Reliability Extension", RFC 3758, 613 DOI 10.17487/RFC3758, May 2004, 614 . 616 [RFC4895] Tuexen, M., Stewart, R., Lei, P., and E. Rescorla, 617 "Authenticated Chunks for the Stream Control Transmission 618 Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895, August 619 2007, . 621 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 622 Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, 623 . 625 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 626 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 627 June 2010, . 629 [RFC7305] Lear, E., Ed., "Report from the IAB Workshop on Internet 630 Technology Adoption and Transition (ITAT)", RFC 7305, 631 DOI 10.17487/RFC7305, July 2014, 632 . 634 [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP 635 Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, 636 . 638 [RFC7496] Tuexen, M., Seggelmann, R., Stewart, R., and S. Loreto, 639 "Additional Policies for the Partially Reliable Stream 640 Control Transmission Protocol Extension", RFC 7496, 641 DOI 10.17487/RFC7496, April 2015, 642 . 644 [RFC8095] Fairhurst, G., Ed., Trammell, B., Ed., and M. 
Kuehlewind, 645 Ed., "Services Provided by IETF Transport Protocols and 646 Congestion Control Mechanisms", RFC 8095, 647 DOI 10.17487/RFC8095, March 2017, 648 . 650 [RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann, 651 "Stream Schedulers and User Message Interleaving for the 652 Stream Control Transmission Protocol", RFC 8260, 653 DOI 10.17487/RFC8260, November 2017, 654 . 656 [RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the 657 User Datagram Protocol (UDP) and Lightweight UDP (UDP- 658 Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018, 659 . 661 [WWDC2015] 662 Lakhera, P. and S. Cheshire, "Your App and Next Generation 663 Networks", Apple Worldwide Developers Conference 2015, San 664 Francisco, USA, June 2015, 665 . 667 Appendix A. Deriving the minimal set 669 We approach the construction of a minimal set of transport features 670 in the following way: 672 1. Categorization: the superset of transport features from [RFC8303] 673 is presented, and transport features are categorized for later 674 reduction. 675 2. Reduction: a shorter list of transport features is derived from 676 the categorization in the first step. This removes all transport 677 features that do not require application-specific knowledge or 678 cannot be implemented with TCP or UDP. 679 3. Discussion: the resulting list shows a number of peculiarities 680 that are discussed, to provide a basis for constructing the 681 minimal set. 682 4. Construction: Based on the reduced set and the discussion of the 683 transport features therein, a minimal set is constructed. 685 The first three steps as well as the underlying rationale for 686 constructing the minimal set are described in this appendix. The 687 minimal set itself is described in Section 3. 689 A.1. Step 1: Categorization -- The Superset of Transport Features 691 Following [RFC8303], we divide the transport features into two main 692 groups as follows: 694 1. 
CONNECTION related transport features 695 - ESTABLISHMENT 696 - AVAILABILITY 697 - MAINTENANCE 698 - TERMINATION 700 2. DATA Transfer related transport features 701 - Sending Data 702 - Receiving Data 703 - Errors 705 We assume that applications have no specific requirements that need 706 knowledge about the network, e.g. regarding the choice of network 707 interface or the end-to-end path. Even with these assumptions, there 708 are certain requirements that are strictly kept by transport 709 protocols today, and these must also be kept by a transport system. 710 Some of these requirements relate to transport features that we call 711 "Functional". 713 Functional transport features provide functionality that cannot be 714 used without the application knowing about them, or else they violate 715 assumptions that might cause the application to fail. For example, 716 ordered message delivery is a functional transport feature: it cannot 717 be configured without the application knowing about it because the 718 application's assumption could be that messages always arrive in 719 order. Failure includes any change of the application behavior that 720 is not performance oriented, e.g. security. 722 "Change DSCP" and "Disable Nagle algorithm" are examples of transport 723 features that we call "Optimizing": if a transport system 724 autonomously decides to enable or disable them, an application will 725 not fail, but a transport system may be able to communicate more 726 efficiently if the application is in control of this optimizing 727 transport feature. These transport features require application- 728 specific knowledge (e.g., about delay/bandwidth requirements or the 729 length of future data blocks that are to be transmitted). 731 The transport features of IETF transport protocols that do not 732 require application-specific knowledge and could therefore be 733 transparently utilized by a transport system are called 734 "Automatable". 
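The three categories can be illustrated with a short sketch. This is not part of the analysis in [RFC8303]; it only assumes that a transport system keeps a table of transport features and decides, per category, how each one is treated at the API. The names Category, FEATURES and api_treatment are hypothetical, and the table holds only a few example entries taken from this appendix.

```python
from enum import Enum


class Category(Enum):
    """Hypothetical labels for the three categories used in this appendix."""
    FUNCTIONAL = "functional"
    OPTIMIZING = "optimizing"
    AUTOMATABLE = "automatable"


# Example entries only; the categorization of each feature follows the text
# of this appendix, the table itself is an assumption made for illustration.
FEATURES = {
    "Ordered message delivery": Category.FUNCTIONAL,
    "Change DSCP": Category.OPTIMIZING,
    "Disable Nagle algorithm": Category.OPTIMIZING,
    "Specify which IP Options must always be used": Category.AUTOMATABLE,
}


def api_treatment(category):
    """Return how a transport system could treat a feature of this category.

    "expose":    the application has to know about the feature, otherwise
                 its assumptions may be violated (functional).
    "optional":  the application will not fail if the transport system
                 decides on its own, but exposing the feature may allow
                 more efficient communication (optimizing).
    "automate":  no application-specific knowledge is needed, so the
                 transport system can use the feature transparently
                 (automatable).
    """
    if category is Category.FUNCTIONAL:
        return "expose"
    if category is Category.OPTIMIZING:
        return "optional"
    return "automate"


def features_with_treatment(treatment):
    """List the example features that receive the given treatment."""
    return sorted(
        name for name, cat in FEATURES.items()
        if api_treatment(cat) == treatment
    )
```

In this sketch, only the "automate" entries disappear from the interface; the "optional" entries remain candidates for it, which mirrors the reduction step that removes features not requiring application-specific knowledge.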
736 Finally, some transport features are aggregated and/or slightly 737 changed in the description below. These transport features are 738 marked as "ADDED". The corresponding transport features are 739 automatable, and they are listed immediately below the "ADDED" 740 transport feature. 742 In this description, transport services are presented following the 743 nomenclature "CATEGORY.[SUBCATEGORY].SERVICENAME.PROTOCOL", 744 equivalent to "pass 2" in [RFC8303]. We also sketch how some of the 745 TAPS transport features can be implemented by a transport system. 746 For all transport features that are categorized as "functional" or 747 "optimizing", and for which no matching TCP and/or UDP primitive 748 exists in "pass 2" of [RFC8303], a brief discussion on how to 749 implement them over TCP and/or UDP is included. 751 We designate some transport features as "automatable" on the basis of 752 a broader decision that affects multiple transport features: 754 o Most transport features that are related to multi-streaming were 755 designated as "automatable". This was done because the decision 756 on whether to use multi-streaming or not does not depend on 757 application-specific knowledge. This means that a connection that 758 is exhibited to an application could be implemented by using a 759 single stream of an SCTP association instead of mapping it to a 760 complete SCTP association or TCP connection. This could be 761 achieved by using more than one stream when an SCTP association is 762 first established (CONNECT.SCTP parameter "outbound stream 763 count"), maintaining an internal stream number, and using this 764 stream number when sending data (SEND.SCTP parameter "stream 765 number"). Closing or aborting a connection could then simply free 766 the stream number for future use. This is discussed further in 767 Appendix A.3.2. 
768 o All transport features that are related to using multiple paths or 769 the choice of the network interface were designated as 770 "automatable". Choosing a path or an interface does not depend on 771 application-specific knowledge. For example, "Listen" could 772 always listen on all available interfaces and "Connect" could use 773 the default interface for the destination IP address. 775 A.1.1. CONNECTION Related Transport Features 777 ESTABLISHMENT: 779 o Connect 780 Protocols: TCP, SCTP, UDP(-Lite) 781 Functional because the notion of a connection is often reflected 782 in applications as an expectation to be able to communicate after 783 a "Connect" succeeded, with a communication sequence relating to 784 this transport feature that is defined by the application 785 protocol. 786 Implementation: via CONNECT.TCP, CONNECT.SCTP or CONNECT.UDP(- 787 Lite). 789 o Specify which IP Options must always be used 790 Protocols: TCP, UDP(-Lite) 791 Automatable because IP Options relate to knowledge about the 792 network, not the application. 794 o Request multiple streams 795 Protocols: SCTP 796 Automatable because using multi-streaming does not require 797 application-specific knowledge. 798 Implementation: see Appendix A.3.2. 800 o Limit the number of inbound streams 801 Protocols: SCTP 802 Automatable because using multi-streaming does not require 803 application-specific knowledge. 804 Implementation: see Appendix A.3.2. 806 o Specify number of attempts and/or timeout for the first 807 establishment message 808 Protocols: TCP, SCTP 809 Functional because this is closely related to potentially assumed 810 reliable data delivery for data that is sent before or during 811 connection establishment. 812 Implementation: Using a parameter of CONNECT.TCP and CONNECT.SCTP. 813 Implementation over UDP: Do nothing (this is irrelevant in case of 814 UDP because there, reliable data delivery is not assumed). 
816 o Obtain multiple sockets 817 Protocols: SCTP 818 Automatable because the usage of multiple paths to communicate to 819 the same end host relates to knowledge about the network, not the 820 application. 822 o Disable MPTCP 823 Protocols: MPTCP 824 Automatable because the usage of multiple paths to communicate to 825 the same end host relates to knowledge about the network, not the 826 application. 827 Implementation: via a boolean parameter in CONNECT.MPTCP. 829 o Configure authentication 830 Protocols: TCP, SCTP 831 Functional because this has a direct influence on security. 832 Implementation: via parameters in CONNECT.TCP and CONNECT.SCTP. 833 Implementation over TCP: With TCP, this allows to configure Master 834 Key Tuples (MKTs) to authenticate complete segments (including the 835 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 836 allows to specify which chunk types must always be authenticated. 837 Authenticating only certain chunk types creates a reduced level of 838 security that is not supported by TCP; to be compatible, this 839 should therefore only allow to authenticate all chunk types. Key 840 material must be provided in a way that is compatible with both 841 [RFC4895] and [RFC5925]. 843 Implementation over UDP: Not possible. 845 o Indicate (and/or obtain upon completion) an Adaptation Layer via 846 an adaptation code point 847 Protocols: SCTP 848 Functional because it allows to send extra data for the sake of 849 identifying an adaptation layer, which by itself is application- 850 specific. 851 Implementation: via a parameter in CONNECT.SCTP. 852 Implementation over TCP: not possible. 853 Implementation over UDP: not possible. 855 o Request to negotiate interleaving of user messages 856 Protocols: SCTP 857 Automatable because it requires using multiple streams, but 858 requesting multiple streams in the CONNECTION.ESTABLISHMENT 859 category is automatable. 860 Implementation: via a parameter in CONNECT.SCTP. 
862 o Hand over a message to reliably transfer (possibly multiple times) 863 before connection establishment 864 Protocols: TCP 865 Functional because this is closely tied to properties of the data 866 that an application sends or expects to receive. 867 Implementation: via a parameter in CONNECT.TCP. 868 Implementation over UDP: not possible. 870 o Hand over a message to reliably transfer during connection 871 establishment 872 Protocols: SCTP 873 Functional because this can only work if the message is limited in 874 size, making it closely tied to properties of the data that an 875 application sends or expects to receive. 876 Implementation: via a parameter in CONNECT.SCTP. 877 Implementation over UDP: not possible. 879 o Enable UDP encapsulation with a specified remote UDP port number 880 Protocols: SCTP 881 Automatable because UDP encapsulation relates to knowledge about 882 the network, not the application. 884 AVAILABILITY: 886 o Listen 887 Protocols: TCP, SCTP, UDP(-Lite) 888 Functional because the notion of accepting connection requests is 889 often reflected in applications as an expectation to be able to 890 communicate after a "Listen" succeeded, with a communication 891 sequence relating to this transport feature that is defined by the 892 application protocol. 893 ADDED. This differs from the 3 automatable transport features 894 below in that it leaves the choice of interfaces for listening 895 open. 896 Implementation: by listening on all interfaces via LISTEN.TCP (not 897 providing a local IP address) or LISTEN.SCTP (providing SCTP port 898 number / address pairs for all local IP addresses). LISTEN.UDP(- 899 Lite) supports both methods. 901 o Listen, 1 specified local interface 902 Protocols: TCP, SCTP, UDP(-Lite) 903 Automatable because decisions about local interfaces relate to 904 knowledge about the network and the Operating System, not the 905 application. 
907 o Listen, N specified local interfaces 908 Protocols: SCTP 909 Automatable because decisions about local interfaces relate to 910 knowledge about the network and the Operating System, not the 911 application. 913 o Listen, all local interfaces 914 Protocols: TCP, SCTP, UDP(-Lite) 915 Automatable because decisions about local interfaces relate to 916 knowledge about the network and the Operating System, not the 917 application. 919 o Specify which IP Options must always be used 920 Protocols: TCP, UDP(-Lite) 921 Automatable because IP Options relate to knowledge about the 922 network, not the application. 924 o Disable MPTCP 925 Protocols: MPTCP 926 Automatable because the usage of multiple paths to communicate to 927 the same end host relates to knowledge about the network, not the 928 application. 930 o Configure authentication 931 Protocols: TCP, SCTP 932 Functional because this has a direct influence on security. 933 Implementation: via parameters in LISTEN.TCP and LISTEN.SCTP. 934 Implementation over TCP: With TCP, this allows to configure Master 935 Key Tuples (MKTs) to authenticate complete segments (including the 936 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 937 allows to specify which chunk types must always be authenticated. 938 Authenticating only certain chunk types creates a reduced level of 939 security that is not supported by TCP; to be compatible, this 940 should therefore only allow to authenticate all chunk types. Key 941 material must be provided in a way that is compatible with both 942 [RFC4895] and [RFC5925]. 943 Implementation over UDP: not possible. 945 o Obtain requested number of streams 946 Protocols: SCTP 947 Automatable because using multi-streaming does not require 948 application-specific knowledge. 949 Implementation: see Appendix A.3.2. 951 o Limit the number of inbound streams 952 Protocols: SCTP 953 Automatable because using multi-streaming does not require 954 application-specific knowledge. 
955 Implementation: see Appendix A.3.2. 957 o Indicate (and/or obtain upon completion) an Adaptation Layer via 958 an adaptation code point 959 Protocols: SCTP 960 Functional because it allows to send extra data for the sake of 961 identifying an adaptation layer, which by itself is application- 962 specific. 963 Implementation: via a parameter in LISTEN.SCTP. 964 Implementation over TCP: not possible. 965 Implementation over UDP: not possible. 967 o Request to negotiate interleaving of user messages 968 Protocols: SCTP 969 Automatable because it requires using multiple streams, but 970 requesting multiple streams in the CONNECTION.ESTABLISHMENT 971 category is automatable. 972 Implementation: via a parameter in LISTEN.SCTP. 974 MAINTENANCE: 976 o Change timeout for aborting connection (using retransmit limit or 977 time value) 978 Protocols: TCP, SCTP 979 Functional because this is closely related to potentially assumed 980 reliable data delivery. 981 Implementation: via CHANGE_TIMEOUT.TCP or CHANGE_TIMEOUT.SCTP. 982 Implementation over UDP: not possible (UDP is unreliable and there 983 is no connection timeout). 985 o Suggest timeout to the peer 986 Protocols: TCP 987 Functional because this is closely related to potentially assumed 988 reliable data delivery. 989 Implementation: via CHANGE_TIMEOUT.TCP. 990 Implementation over UDP: not possible (UDP is unreliable and there 991 is no connection timeout). 993 o Disable Nagle algorithm 994 Protocols: TCP, SCTP 995 Optimizing because this decision depends on knowledge about the 996 size of future data blocks and the delay between them. 997 Implementation: via DISABLE_NAGLE.TCP and DISABLE_NAGLE.SCTP. 998 Implementation over UDP: do nothing (UDP does not implement the 999 Nagle algorithm). 1001 o Request an immediate heartbeat, returning success/failure 1002 Protocols: SCTP 1003 Automatable because this informs about network-specific knowledge. 
1005 o Notification of Excessive Retransmissions (early warning below 1006 abortion threshold) 1007 Protocols: TCP 1008 Optimizing because it is an early warning to the application, 1009 informing it of an impending functional event. 1010 Implementation: via ERROR.TCP. 1011 Implementation over UDP: do nothing (there is no abortion 1012 threshold). 1014 o Add path 1015 Protocols: MPTCP, SCTP 1016 MPTCP Parameters: source-IP; source-Port; destination-IP; 1017 destination-Port 1018 SCTP Parameters: local IP address 1019 Automatable because the usage of multiple paths to communicate to 1020 the same end host relates to knowledge about the network, not the 1021 application. 1023 o Remove path 1024 Protocols: MPTCP, SCTP 1025 MPTCP Parameters: source-IP; source-Port; destination-IP; 1026 destination-Port 1027 SCTP Parameters: local IP address 1028 Automatable because the usage of multiple paths to communicate to 1029 the same end host relates to knowledge about the network, not the 1030 application. 1032 o Set primary path 1033 Protocols: SCTP 1034 Automatable because the usage of multiple paths to communicate to 1035 the same end host relates to knowledge about the network, not the 1036 application. 1038 o Suggest primary path to the peer 1039 Protocols: SCTP 1040 Automatable because the usage of multiple paths to communicate to 1041 the same end host relates to knowledge about the network, not the 1042 application. 1044 o Configure Path Switchover 1045 Protocols: SCTP 1046 Automatable because the usage of multiple paths to communicate to 1047 the same end host relates to knowledge about the network, not the 1048 application. 
1050 o Obtain status (query or notification) 1051 Protocols: SCTP, MPTCP 1052 SCTP parameters: association connection state; destination 1053 transport address list; destination transport address reachability 1054 states; current local and peer receiver window size; current local 1055 congestion window sizes; number of unacknowledged DATA chunks; 1056 number of DATA chunks pending receipt; primary path; most recent 1057 SRTT on primary path; RTO on primary path; SRTT and RTO on other 1058 destination addresses; MTU per path; interleaving supported yes/no 1059 MPTCP parameters: subflow-list (identified by source-IP; source- 1060 Port; destination-IP; destination-Port) 1061 Automatable because these parameters relate to knowledge about the 1062 network, not the application. 1064 o Specify DSCP field 1065 Protocols: TCP, SCTP, UDP(-Lite) 1066 Optimizing because choosing a suitable DSCP value requires 1067 application-specific knowledge. 1068 Implementation: via SET_DSCP.TCP / SET_DSCP.SCTP / SET_DSCP.UDP(- 1069 Lite) 1071 o Notification of ICMP error message arrival 1072 Protocols: TCP, UDP(-Lite) 1073 Optimizing because these messages can inform about success or 1074 failure of functional transport features (e.g., host unreachable 1075 relates to "Connect") 1076 Implementation: via ERROR.TCP or ERROR.UDP(-Lite). 1078 o Obtain information about interleaving support 1079 Protocols: SCTP 1080 Automatable because it requires using multiple streams, but 1081 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1082 category is automatable. 1083 Implementation: via STATUS.SCTP. 1085 o Change authentication parameters 1086 Protocols: TCP, SCTP 1087 Functional because this has a direct influence on security. 1088 Implementation: via SET_AUTH.TCP and SET_AUTH.SCTP. 1089 Implementation over TCP: With SCTP, this allows to adjust key_id, 1090 key, and hmac_id. 
With TCP, this allows to change the preferred 1091 outgoing MKT (current_key) and the preferred incoming MKT 1092 (rnext_key), respectively, for a segment that is sent on the 1093 connection. Key material must be provided in a way that is 1094 compatible with both [RFC4895] and [RFC5925]. 1095 Implementation over UDP: not possible. 1097 o Obtain authentication information 1098 Protocols: SCTP 1099 Functional because authentication decisions may have been made by 1100 the peer, and this has an influence on the necessary application- 1101 level measures to provide a certain level of security. 1102 Implementation: via GET_AUTH.SCTP. 1103 Implementation over TCP: With SCTP, this allows to obtain key_id 1104 and a chunk list. With TCP, this allows to obtain current_key and 1105 rnext_key from a previously received segment. Key material must 1106 be provided in a way that is compatible with both [RFC4895] and 1107 [RFC5925]. 1108 Implementation over UDP: not possible. 1110 o Reset Stream 1111 Protocols: SCTP 1112 Automatable because using multi-streaming does not require 1113 application-specific knowledge. 1114 Implementation: see Appendix A.3.2. 1116 o Notification of Stream Reset 1117 Protocols: SCTP 1118 Automatable because using multi-streaming does not require 1119 application-specific knowledge. 1120 Implementation: see Appendix A.3.2. 1122 o Reset Association 1123 Protocols: SCTP 1124 Automatable because deciding to reset an association does not 1125 require application-specific knowledge. 1126 Implementation: via RESET_ASSOC.SCTP. 1128 o Notification of Association Reset 1129 Protocols: SCTP 1130 Automatable because this notification does not relate to 1131 application-specific knowledge. 1133 o Add Streams 1134 Protocols: SCTP 1135 Automatable because using multi-streaming does not require 1136 application-specific knowledge. 1137 Implementation: see Appendix A.3.2.
1139 o Notification of Added Stream 1140 Protocols: SCTP 1141 Automatable because using multi-streaming does not require 1142 application-specific knowledge. 1143 Implementation: see Appendix A.3.2. 1145 o Choose a scheduler to operate between streams of an association 1146 Protocols: SCTP 1147 Optimizing because the scheduling decision requires application- 1148 specific knowledge. However, if a transport system would not use 1149 this, or wrongly configure it on its own, this would only affect 1150 the performance of data transfers; the outcome would still be 1151 correct within the "best effort" service model. 1152 Implementation: using SET_STREAM_SCHEDULER.SCTP. 1153 Implementation over TCP: do nothing. 1154 Implementation over UDP: do nothing. 1156 o Configure priority or weight for a scheduler 1157 Protocols: SCTP 1158 Optimizing because the priority or weight requires application- 1159 specific knowledge. However, if a transport system would not use 1160 this, or wrongly configure it on its own, this would only affect 1161 the performance of data transfers; the outcome would still be 1162 correct within the "best effort" service model. 1163 Implementation: using CONFIGURE_STREAM_SCHEDULER.SCTP. 1164 Implementation over TCP: do nothing. 1165 Implementation over UDP: do nothing. 1167 o Configure send buffer size 1168 Protocols: SCTP 1169 Automatable because this decision relates to knowledge about the 1170 network and the Operating System, not the application (see also 1171 the discussion in Appendix A.3.4). 1173 o Configure receive buffer (and rwnd) size 1174 Protocols: SCTP 1175 Automatable because this decision relates to knowledge about the 1176 network and the Operating System, not the application. 1178 o Configure message fragmentation 1179 Protocols: SCTP 1180 Automatable because fragmentation relates to knowledge about the 1181 network and the Operating System, not the application.
1182 Implementation: by always enabling it with 1183 CONFIG_FRAGMENTATION.SCTP and auto-setting the fragmentation size 1184 based on network or Operating System conditions. 1186 o Configure PMTUD 1187 Protocols: SCTP 1188 Automatable because Path MTU Discovery relates to knowledge about 1189 the network, not the application. 1191 o Configure delayed SACK timer 1192 Protocols: SCTP 1193 Automatable because the receiver-side decision to delay sending 1194 SACKs relates to knowledge about the network, not the application 1195 (it can be relevant for a sending application to request not to 1196 delay the SACK of a message, but this is a different transport 1197 feature). 1199 o Set Cookie life value 1200 Protocols: SCTP 1201 Functional because it relates to security (possibly weakened by 1202 keeping a cookie very long) versus the time between connection 1203 establishment attempts. Knowledge about both issues can be 1204 application-specific. 1206 Implementation over TCP: the closest specified TCP functionality 1207 is the cookie in TCP Fast Open; for this, [RFC7413] states that 1208 the server "can expire the cookie at any time to enhance security" 1209 and section 4.1.2 describes an example implementation where 1210 updating the key on the server side causes the cookie to expire. 1211 Alternatively, for implementations that do not support TCP Fast 1212 Open, this transport feature could also affect the validity of SYN 1213 cookies (see Section 3.6 of [RFC4987]). 1214 Implementation over UDP: do nothing. 1216 o Set maximum burst 1217 Protocols: SCTP 1218 Automatable because it relates to knowledge about the network, not 1219 the application. 1221 o Configure size where messages are broken up for partial delivery 1222 Protocols: SCTP 1223 Functional because this is closely tied to properties of the data 1224 that an application sends or expects to receive. 1225 Implementation over TCP: not possible. 1226 Implementation over UDP: not possible. 
1228 o Disable checksum when sending 1229 Protocols: UDP 1230 Functional because application-specific knowledge is necessary to 1231 decide whether it can be acceptable to lose data integrity. 1232 Implementation: via SET_CHECKSUM_ENABLED.UDP. 1233 Implementation over TCP: do nothing. 1235 o Disable checksum requirement when receiving 1236 Protocols: UDP 1237 Functional because application-specific knowledge is necessary to 1238 decide whether it can be acceptable to lose data integrity. 1239 Implementation: via SET_CHECKSUM_REQUIRED.UDP. 1240 Implementation over TCP: do nothing. 1242 o Specify checksum coverage used by the sender 1243 Protocols: UDP-Lite 1244 Functional because application-specific knowledge is necessary to 1245 decide for which parts of the data it can be acceptable to lose 1246 data integrity. 1247 Implementation: via SET_CHECKSUM_COVERAGE.UDP-Lite. 1248 Implementation over TCP: do nothing. 1250 o Specify minimum checksum coverage required by receiver 1251 Protocols: UDP-Lite 1252 Functional because application-specific knowledge is necessary to 1253 decide for which parts of the data it can be acceptable to lose 1254 data integrity. 1255 Implementation: via SET_MIN_CHECKSUM_COVERAGE.UDP-Lite. 1256 Implementation over TCP: do nothing. 1258 o Specify DF field 1259 Protocols: UDP(-Lite) 1260 Optimizing because the DF field can be used to carry out Path MTU 1261 Discovery, which can lead an application to choose message sizes 1262 that can be transmitted more efficiently. 1263 Implementation: via MAINTENANCE.SET_DF.UDP(-Lite) and 1264 SEND_FAILURE.UDP(-Lite). 1265 Implementation over TCP: do nothing. With TCP the sender is not 1266 in control of transport message sizes, making this functionality 1267 irrelevant. 1269 o Get max. 
transport-message size that may be sent using a non- 1270 fragmented IP packet from the configured interface 1271 Protocols: UDP(-Lite) 1272 Optimizing because this can lead an application to choose message 1273 sizes that can be transmitted more efficiently. 1274 Implementation over TCP: do nothing: this information is not 1275 available with TCP. 1277 o Get max. transport-message size that may be received from the 1278 configured interface 1279 Protocols: UDP(-Lite) 1280 Optimizing because this can, for example, influence an 1281 application's memory management. 1282 Implementation over TCP: do nothing: this information is not 1283 available with TCP. 1285 o Specify TTL/Hop count field 1286 Protocols: UDP(-Lite) 1287 Automatable because a transport system can use a large enough 1288 system default to avoid communication failures. Allowing an 1289 application to configure it differently can produce notifications 1290 of ICMP error message arrivals that yield information which only 1291 relates to knowledge about the network, not the application. 1293 o Obtain TTL/Hop count field 1294 Protocols: UDP(-Lite) 1295 Automatable because the TTL/Hop count field relates to knowledge 1296 about the network, not the application. 1298 o Specify ECN field 1299 Protocols: UDP(-Lite) 1300 Automatable because the ECN field relates to knowledge about the 1301 network, not the application. 1303 o Obtain ECN field 1304 Protocols: UDP(-Lite) 1305 Optimizing because this information can be used by an application 1306 to better carry out congestion control (this is relevant when 1307 choosing a data transmission transport service that does not 1308 already do congestion control). 1309 Implementation over TCP: do nothing: this information is not 1310 available with TCP. 1312 o Specify IP Options 1313 Protocols: UDP(-Lite) 1314 Automatable because IP Options relate to knowledge about the 1315 network, not the application. 
1317 o Obtain IP Options 1318 Protocols: UDP(-Lite) 1319 Automatable because IP Options relate to knowledge about the 1320 network, not the application. 1322 o Enable and configure a "Low Extra Delay Background Transfer" 1323 Protocols: A protocol implementing the LEDBAT congestion control 1324 mechanism 1325 Optimizing because whether this service is appropriate or not 1326 depends on application-specific knowledge. However, wrongly using 1327 this will only affect the speed of data transfers (albeit 1328 including other transfers that may compete with the transport 1329 system's transfer in the network), so it is still correct within 1330 the "best effort" service model. 1331 Implementation: via CONFIGURE.LEDBAT and/or SET_DSCP.TCP / 1332 SET_DSCP.SCTP / SET_DSCP.UDP(-Lite) [LBE-draft]. 1333 Implementation over TCP: do nothing. 1334 Implementation over UDP: do nothing. 1336 TERMINATION: 1338 o Close after reliably delivering all remaining data, causing an 1339 event informing the application on the other side 1340 Protocols: TCP, SCTP 1341 Functional because the notion of a connection is often reflected 1342 in applications as an expectation to have all outstanding data 1343 delivered and no longer be able to communicate after a "Close" 1344 succeeded, with a communication sequence relating to this 1345 transport feature that is defined by the application protocol. 1346 Implementation: via CLOSE.TCP and CLOSE.SCTP. 1347 Implementation over UDP: not possible. 1349 o Abort without delivering remaining data, causing an event 1350 informing the application on the other side 1351 Protocols: TCP, SCTP 1352 Functional because the notion of a connection is often reflected 1353 in applications as an expectation to potentially not have all 1354 outstanding data delivered and no longer be able to communicate 1355 after an "Abort" succeeded. 
      On both sides of a connection, an
      application protocol may define a communication sequence relating
      to this transport feature.
      Implementation: via ABORT.TCP and ABORT.SCTP.
      Implementation over UDP: not possible.

   o  Abort without delivering remaining data, not causing an event
      informing the application on the other side
      Protocols: UDP(-Lite)
      Functional because the notion of a connection is often reflected
      in applications as an expectation to potentially not have all
      outstanding data delivered and no longer be able to communicate
      after an "Abort" succeeded. On both sides of a connection, an
      application protocol may define a communication sequence relating
      to this transport feature.
      Implementation: via ABORT.UDP(-Lite).
      Implementation over TCP: stop using the connection, wait for a
      timeout.

   o  Timeout event when data could not be delivered for too long
      Protocols: TCP, SCTP
      Functional because this notifies that potentially assumed reliable
      data delivery is no longer provided.
      Implementation: via TIMEOUT.TCP and TIMEOUT.SCTP.
      Implementation over UDP: do nothing: this event will not occur
      with UDP.

A.1.2. DATA Transfer Related Transport Features

A.1.2.1. Sending Data

   o  Reliably transfer data, with congestion control
      Protocols: TCP, SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.TCP and SEND.SCTP.
      Implementation over UDP: not possible.

   o  Reliably transfer a message, with congestion control
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.SCTP.
      Implementation over TCP: via SEND.TCP. With SEND.TCP, messages
      will not be identifiable by the receiver.
      Implementation over UDP: not possible.

   o  Unreliably transfer a message
      Protocols: SCTP, UDP(-Lite)
      Optimizing because only applications know about the time
      criticality of their communication, and reliably transferring a
      message is never incorrect for the receiver of a potentially
      unreliable data transfer; it is just slower.
      ADDED. This differs from the two automatable transport features
      below in that it leaves the choice of congestion control open.
      Implementation: via SEND.SCTP or SEND.UDP(-Lite).
      Implementation over TCP: use SEND.TCP. With SEND.TCP, messages
      will be sent reliably, and they will not be identifiable by the
      receiver.

   o  Unreliably transfer a message, with congestion control
      Protocols: SCTP
      Automatable because congestion control relates to knowledge about
      the network, not the application.

   o  Unreliably transfer a message, without congestion control
      Protocols: UDP(-Lite)
      Automatable because congestion control relates to knowledge about
      the network, not the application.

   o  Configurable Message Reliability
      Protocols: SCTP
      Optimizing because only applications know about the time
      criticality of their communication, and reliably transferring a
      message is never incorrect for the receiver of a potentially
      unreliable data transfer; it is just slower.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP and ignoring this
      configuration: based on the assumption of the best-effort service
      model, unnecessarily delivering data does not violate application
      expectations. Moreover, it is not possible to associate the
      requested reliability to a "message" in TCP anyway.
      Implementation over UDP: not possible.

   o  Choice of stream
      Protocols: SCTP
      Automatable because it requires using multiple streams, but
      requesting multiple streams in the CONNECTION.ESTABLISHMENT
      category is automatable.
      Implementation: see Appendix A.3.2.

   o  Choice of path (destination address)
      Protocols: SCTP
      Automatable because it requires using multiple sockets, but
      obtaining multiple sockets in the CONNECTION.ESTABLISHMENT
      category is automatable.

   o  Ordered message delivery (potentially slower than unordered)
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP. With SEND.TCP,
      messages will not be identifiable by the receiver.
      Implementation over UDP: not possible.

   o  Unordered message delivery (potentially faster than ordered)
      Protocols: SCTP, UDP(-Lite)
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP and always sending data
      ordered: based on the assumption of the best-effort service model,
      ordered delivery may just be slower and does not violate
      application expectations. Moreover, it is not possible to
      associate the requested delivery order to a "message" in TCP
      anyway.

   o  Request not to bundle messages
      Protocols: SCTP
      Optimizing because this decision depends on knowledge about the
      size of future data blocks and the delay between them.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP and DISABLE_NAGLE.TCP
      to disable the Nagle algorithm when the request is made and enable
      it again when the request is no longer made.
      Note that this is
      not fully equivalent because it relates to the time of issuing the
      request rather than a specific message.
      Implementation over UDP: do nothing (UDP never bundles messages).

   o  Specifying a "payload protocol-id" (handed over as such by the
      receiver)
      Protocols: SCTP
      Functional because it allows extra application data to be sent
      with every message, for the sake of identification of data, which
      by itself is application-specific.
      Implementation: SEND.SCTP.
      Implementation over TCP: not possible.
      Implementation over UDP: not possible.

   o  Specifying a key id to be used to authenticate a message
      Protocols: SCTP
      Functional because this has a direct influence on security.
      Implementation: via a parameter in SEND.SCTP.
      Implementation over TCP: This could be emulated by using
      SET_AUTH.TCP before and after the message is sent. Note that this
      is not fully equivalent because it relates to the time of issuing
      the request rather than a specific message.
      Implementation over UDP: not possible.

   o  Request not to delay the acknowledgement (SACK) of a message
      Protocols: SCTP
      Optimizing because only an application knows for which message it
      wants to quickly be informed about success / failure of its
      delivery.
      Implementation over TCP: do nothing.
      Implementation over UDP: do nothing.

A.1.2.2. Receiving Data

   o  Receive data (with no message delimiting)
      Protocols: TCP
      Functional because a transport system must be able to send and
      receive data.
      Implementation: via RECEIVE.TCP.
      Implementation over UDP: do nothing (hand over a message, let the
      application ignore message boundaries).

   o  Receive a message
      Protocols: SCTP, UDP(-Lite)
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via RECEIVE.SCTP and RECEIVE.UDP(-Lite).
      Implementation over TCP: not possible.

   o  Choice of stream to receive from
      Protocols: SCTP
      Automatable because it requires using multiple streams, but
      requesting multiple streams in the CONNECTION.ESTABLISHMENT
      category is automatable.
      Implementation: see Appendix A.3.2.

   o  Information about partial message arrival
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via RECEIVE.SCTP.
      Implementation over TCP: do nothing: this information is not
      available with TCP.
      Implementation over UDP: do nothing: this information is not
      available with UDP.

A.1.2.3. Errors

   This section describes sending failures that are associated with a
   specific call in the "Sending Data" category (Appendix A.1.2.1).

   o  Notification of send failures
      Protocols: SCTP, UDP(-Lite)
      Functional because this notifies that potentially assumed reliable
      data delivery is no longer provided.
      ADDED. This differs from the two automatable transport features
      below in that it does not distinguish between unsent and
      unacknowledged messages.
      Implementation: via SENDFAILURE-EVENT.SCTP and
      SEND_FAILURE.UDP(-Lite).
      Implementation over TCP: do nothing: this notification is not
      available and will therefore not occur with TCP.

   o  Notification of an unsent (part of a) message
      Protocols: SCTP, UDP(-Lite)
      Automatable because the distinction between unsent and
      unacknowledged is network-specific.

   o  Notification of an unacknowledged (part of a) message
      Protocols: SCTP
      Automatable because the distinction between unsent and
      unacknowledged is network-specific.

   o  Notification that the stack has no more user data to send
      Protocols: SCTP
      Optimizing because reacting to this notification requires the
      application to be involved, and ensuring that the stack does not
      run dry of data (for too long) can improve performance.
      Implementation over TCP: do nothing. See also the discussion in
      Appendix A.3.4.
      Implementation over UDP: do nothing. This notification is not
      available and will therefore not occur with UDP.

   o  Notification to a receiver that a partial message delivery has
      been aborted
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation over TCP: do nothing. This notification is not
      available and will therefore not occur with TCP.
      Implementation over UDP: do nothing. This notification is not
      available and will therefore not occur with UDP.

A.2. Step 2: Reduction -- The Reduced Set of Transport Features

   By hiding automatable transport features from the application, a
   transport system can gain opportunities to automate the usage of
   network-related functionality. This can make the transport system
   easier to use for the application programmer, and it allows for
   optimizations that may not be possible for an application. For
   instance, system-wide configurations regarding the usage of multiple
   interfaces can better be exploited if the choice of the interface is
   not entirely up to the application. Therefore, since they are not
   strictly necessary to expose in a transport system, we do not include
   automatable transport features in the reduced set of transport
   features. This leaves us with only the transport features that are
   either optimizing or functional.

   A transport system should be able to communicate via TCP or UDP if
   alternative transport protocols are found not to work.
   For many
   transport features, this is possible -- often by simply not doing
   anything when a specific request is made. For some transport
   features, however, it was identified that direct usage of neither TCP
   nor UDP is possible: in these cases, even doing nothing would
   result in semantically incorrect behavior. Whenever an application
   would make use of one of these transport features, this would
   eliminate the possibility of using TCP or UDP. Thus, we only keep the
   functional and optimizing transport features for which an
   implementation over either TCP or UDP is possible in our reduced set.

   In the following list, we precede a transport feature with "T:" if an
   implementation over TCP is possible, "U:" if an implementation over
   UDP is possible, and "T,U:" if an implementation over either TCP or
   UDP is possible.

A.2.1. CONNECTION Related Transport Features

   ESTABLISHMENT:

   o  T,U: Connect
   o  T,U: Specify number of attempts and/or timeout for the first
      establishment message
   o  T: Configure authentication
   o  T: Hand over a message to reliably transfer (possibly multiple
      times) before connection establishment
   o  T: Hand over a message to reliably transfer during connection
      establishment

   AVAILABILITY:

   o  T,U: Listen
   o  T: Configure authentication

   MAINTENANCE:

   o  T: Change timeout for aborting connection (using retransmit limit
      or time value)
   o  T: Suggest timeout to the peer
   o  T,U: Disable Nagle algorithm
   o  T,U: Notification of Excessive Retransmissions (early warning
      below abortion threshold)
   o  T,U: Specify DSCP field
   o  T,U: Notification of ICMP error message arrival
   o  T: Change authentication parameters
   o  T: Obtain authentication information
   o  T,U: Set Cookie life value
   o  T,U: Choose a scheduler to operate between streams of an
      association
   o  T,U: Configure priority or weight for a
      scheduler
   o  T,U: Disable checksum when sending
   o  T,U: Disable checksum requirement when receiving
   o  T,U: Specify checksum coverage used by the sender
   o  T,U: Specify minimum checksum coverage required by receiver
   o  T,U: Specify DF field
   o  T,U: Get max. transport-message size that may be sent using a
      non-fragmented IP packet from the configured interface
   o  T,U: Get max. transport-message size that may be received from the
      configured interface
   o  T,U: Obtain ECN field
   o  T,U: Enable and configure a "Low Extra Delay Background Transfer"

   TERMINATION:

   o  T: Close after reliably delivering all remaining data, causing an
      event informing the application on the other side
   o  T: Abort without delivering remaining data, causing an event
      informing the application on the other side
   o  T,U: Abort without delivering remaining data, not causing an event
      informing the application on the other side
   o  T,U: Timeout event when data could not be delivered for too long

A.2.2. DATA Transfer Related Transport Features

A.2.2.1. Sending Data

   o  T: Reliably transfer data, with congestion control
   o  T: Reliably transfer a message, with congestion control
   o  T,U: Unreliably transfer a message
   o  T: Configurable Message Reliability
   o  T: Ordered message delivery (potentially slower than unordered)
   o  T,U: Unordered message delivery (potentially faster than ordered)
   o  T,U: Request not to bundle messages
   o  T: Specifying a key id to be used to authenticate a message
   o  T,U: Request not to delay the acknowledgement (SACK) of a message

A.2.2.2. Receiving Data

   o  T,U: Receive data (with no message delimiting)
   o  U: Receive a message
   o  T,U: Information about partial message arrival

A.2.2.3. Errors

   This section describes sending failures that are associated with a
   specific call in the "Sending Data" category (Appendix A.1.2.1).

   o  T,U: Notification of send failures
   o  T,U: Notification that the stack has no more user data to send
   o  T,U: Notification to a receiver that a partial message delivery
      has been aborted

A.3. Step 3: Discussion

   The reduced set in the previous section exhibits a number of
   peculiarities, which we will discuss in the following. This section
   focuses on TCP because, with the exception of one particular
   transport feature ("Receive a message" -- we will discuss this in
   Appendix A.3.1), the list shows that UDP is strictly a subset of TCP.
   We can first try to understand how to build a transport system that
   can run over TCP, and then narrow down the result further to allow
   that the system can always run over either TCP or UDP (which
   effectively means removing everything related to reliability,
   ordering, authentication and closing/aborting with a notification to
   the peer).

   Note that, because the functional transport features of UDP are --
   with the exception of "Receive a message" -- a subset of TCP, TCP can
   be used as a replacement for UDP whenever an application does not
   need message delimiting (e.g., because the application-layer protocol
   already does it). This has been recognized by many applications that
   already do this in practice, by trying to communicate with UDP at
   first, and falling back to TCP in case of a connection failure.

A.3.1. Sending Messages, Receiving Bytes

   For implementing a transport system over TCP, there are several
   transport features related to sending, but only a single transport
   feature related to receiving: "Receive data (with no message
   delimiting)" (and, strangely, "information about partial message
   arrival"). Notably, the transport feature "Receive a message" is
   also the only non-automatable transport feature of UDP(-Lite) for
   which no implementation over TCP is possible.
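
   As a minimal sketch of the case where the application-layer protocol
   does the message delimiting itself, the following Python fragment
   frames each message with a length prefix and rebuilds messages from
   an undelimited byte stream. The 4-byte big-endian prefix and the
   names frame() and Deframer are illustrative assumptions; they are not
   defined by this document or by RFC 8303.

```python
import struct
from typing import List

# Hypothetical wire format: 4-byte big-endian length, then the message.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Prepend the length so a byte-stream receiver can find the boundary."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Rebuilds messages from arbitrarily split chunks of a byte stream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes; return every message that is now complete."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the message body has not fully arrived yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages
```

   A receiver would pass whatever a normal TCP receive call returns to
   feed(); how the bytes were split in transit does not affect the
   messages that come out.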

   To support these TCP receiver semantics, we define an
   "Application-Framed Bytestream" (AFra-Bytestream). AFra-Bytestreams
   allow senders to operate on messages while minimizing changes to the
   TCP socket API. In particular, nothing changes on the receiver side -
   data can be accepted via a normal TCP socket.

   In an AFra-Bytestream, the sending application can optionally inform
   the transport about message boundaries and required properties per
   message (configurable order and reliability, or embedding a request
   not to delay the acknowledgement of a message). Whenever the sending
   application specifies per-message properties that relax the notion of
   reliable in-order delivery of bytes, it must assume that the
   receiving application is 1) able to determine message boundaries,
   provided that messages are always kept intact, and 2) able to accept
   these relaxed per-message properties. Any signaling of such
   information to the peer is up to an application-layer protocol and
   considered out of scope of this document.

   For example, if an application requests to transfer fixed-size
   messages of 100 bytes with partial reliability, this requires the
   receiving application to be prepared to accept data in chunks of 100
   bytes. If, then, some of these 100-byte messages are missing (e.g.,
   if SCTP with Configurable Reliability is used), this is the expected
   application behavior. With TCP, no messages would be missing, but
   this is also correct for the application, and the possible
   retransmission delay is acceptable within the best effort service
   model [RFC7305]. Still, the receiving application would separate the
   byte stream into 100-byte chunks.

   Note that this usage of messages does not require all messages to be
   equal in size. Many application protocols use some form of
   Type-Length-Value (TLV) encoding, e.g.
   by defining a header including
   length fields; another alternative is the use of byte stuffing
   methods such as COBS [COBS]. If an application needs message
   numbers, e.g. to restore the correct sequence of messages, these must
   also be encoded by the application itself, as the sequence number
   related transport features of SCTP are not provided by the "minimum
   set" (in the interest of enabling usage of TCP).

A.3.2. Stream Schedulers Without Streams

   We have already stated that multi-streaming does not require
   application-specific knowledge. Potential benefits or disadvantages
   of, e.g., using two streams of an SCTP association versus using two
   separate SCTP associations or TCP connections are related to
   knowledge about the network and the particular transport protocol in
   use, not the application. However, the transport features "Choose a
   scheduler to operate between streams of an association" and
   "Configure priority or weight for a scheduler" operate on streams.
   Here, streams identify communication channels between which a
   scheduler operates, and they can be assigned a priority. Moreover,
   the transport features in the MAINTENANCE category all operate on
   associations in case of SCTP, i.e. they apply to all streams in that
   association.

   With only these semantics necessary to represent, the interface to a
   transport system becomes easier if we assume that connections may be
   a transport protocol's connection or association, but could also be a
   stream of an existing SCTP association, for example. We only need to
   allow for a way to define a possible grouping of connections. Then,
   all MAINTENANCE transport features can be said to operate on
   connection groups, not connections, and a scheduler operates on the
   connections within a group.
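
   The grouping idea can be sketched as follows, under stated
   assumptions: the class names, the clone() call, the property
   dictionary and the strict-priority rule are all hypothetical and only
   illustrate that MAINTENANCE settings attach to the group while a
   scheduler chooses among the group's connections. Whether a member
   maps to an SCTP stream or to a separate TCP connection is left to the
   transport system.

```python
from typing import Dict, List, Optional


class ConnectionGroup:
    """Hypothetical group; MAINTENANCE settings are shared by all members."""

    def __init__(self) -> None:
        self.properties: Dict[str, object] = {}
        self.members: List["Connection"] = []

    def set_property(self, name: str, value: object) -> None:
        # Applies to every connection in the group, in the same way that a
        # MAINTENANCE setting applies to all streams of an SCTP association.
        self.properties[name] = value

    def next_to_send(self) -> Optional["Connection"]:
        # Toy strict-priority scheduler: among members with queued data,
        # the lowest priority number is served first.
        ready = [c for c in self.members if c.queue]
        if not ready:
            return None
        return min(ready, key=lambda c: c.priority)


class Connection:
    """Hypothetical connection; it always belongs to exactly one group."""

    def __init__(self, group: ConnectionGroup, priority: int = 0) -> None:
        self.group = group
        self.priority = priority
        self.queue: List[bytes] = []
        group.members.append(self)

    def clone(self, priority: int = 0) -> "Connection":
        # A new connection in the same group; underneath, this could be a
        # new stream of an existing association or a new TCP connection.
        return Connection(self.group, priority)

    def get_property(self, name: str) -> object:
        return self.group.properties.get(name)
```

   With this shape, an application never names a stream; it only
   creates connections, optionally within an existing group, and sets
   priorities on them.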

   To be compatible with multiple transport protocols and uniformly
   allow access to both transport connections and streams of a
   multi-streaming protocol, the semantics of opening and closing need
   to be the most restrictive subset of all of the underlying options.
   For example, TCP's support of half-closed connections can be seen as
   a feature on top of the more restrictive "ABORT"; this feature cannot
   be supported because not all protocols used by a transport system
   (including streams of an association) support half-closed
   connections.

A.3.3. Early Data Transmission

   There are two transport features related to transferring a message
   early: "Hand over a message to reliably transfer (possibly multiple
   times) before connection establishment", which relates to TCP Fast
   Open [RFC7413], and "Hand over a message to reliably transfer during
   connection establishment", which relates to SCTP's ability to
   transfer data together with the COOKIE-Echo chunk. Even without TCP
   Fast Open, TCP can transfer data during the handshake, together with
   the SYN packet -- however, the receiver of this data may not hand it
   over to the application until the handshake has completed. Also,
   unlike with TCP Fast Open, this data is not delimited as a message
   by TCP (thus, not visible as a "message"). This functionality is
   commonly available in TCP and supported in several implementations,
   even though the TCP specification does not explain how to provide it
   to applications.

   A transport system could differentiate between the cases of
   transmitting data "before" (possibly multiple times) or "during" the
   handshake.
   Alternatively, it could also assume that data that are
   handed over early will be transmitted as early as possible, and
   "before" the handshake would only be used for messages that are
   explicitly marked as "idempotent" (i.e., it would be acceptable to
   transfer them multiple times).

   The amount of data that can successfully be transmitted before or
   during the handshake depends on various factors: the transport
   protocol, the use of header options, the choice of IPv4 and IPv6 and
   the Path MTU. A transport system should therefore allow a sending
   application to query the maximum amount of data it can possibly
   transmit before (or, if exposed, during) connection establishment.

A.3.4. Sender Running Dry

   The transport feature "Notification that the stack has no more user
   data to send" relates to SCTP's "SENDER DRY" notification. Such
   notifications can, in principle, be used to avoid having an
   unnecessarily large send buffer, yet ensure that the transport sender
   always has data available when it has an opportunity to transmit it.
   This has been found to be very beneficial for some applications
   [WWDC2015]. However, "SENDER DRY" truly means that the entire send
   buffer (including both unsent and unacknowledged data) has emptied --
   i.e., when it notifies the sender, it is already too late, the
   transport protocol already missed an opportunity to send data. Some
   modern TCP implementations now include the unspecified
   "TCP_NOTSENT_LOWAT" socket option that was proposed in [WWDC2015],
   which limits the amount of unsent data that TCP can keep in the
   socket buffer; this makes it possible to specify at which buffer
   filling level the socket becomes writable, rather than waiting for
   the buffer to run empty.
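
   As a hedged illustration, the fragment below tries to set this option
   on a TCP socket from Python. It assumes a platform and Python build
   that export socket.TCP_NOTSENT_LOWAT; where the constant is missing
   or the stack rejects the option, the helper simply reports failure.
   The helper name limit_unsent_data() and the 16384-byte threshold in
   the usage note are arbitrary choices made for this sketch.

```python
import socket


def limit_unsent_data(sock: socket.socket, nbytes: int) -> bool:
    """Try to cap the unsent data kept in the socket buffer.

    Returns False when the option is not available on this platform.
    """
    option = getattr(socket, "TCP_NOTSENT_LOWAT", None)
    if option is None:
        return False  # this Python build does not export the constant
    try:
        sock.setsockopt(socket.IPPROTO_TCP, option, nbytes)
    except OSError:
        return False  # the TCP stack rejected the option
    return True
```

   An application could call limit_unsent_data(sock, 16384) after
   creating the socket and then rely on its usual writability
   notification (for example via select or poll) to learn when more data
   should be handed over.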

   SCTP also allows the sender-side buffer to be configured: the
   automatable Transport Feature "Configure send buffer size" provides
   this functionality, but only for the complete buffer, which includes
   both unsent and unacknowledged data. SCTP does not allow these two
   sizes to be controlled separately. It therefore makes sense for a
   transport system to allow for uniform access to "TCP_NOTSENT_LOWAT"
   as well as the "SENDER DRY" notification.

A.3.5. Capacity Profile

   The transport features:

   o  Disable Nagle algorithm
   o  Enable and configure a "Low Extra Delay Background Transfer"
   o  Specify DSCP field

   all relate to a QoS-like application need such as "low latency" or
   "scavenger". In the interest of flexibility of a transport system,
   they could therefore be offered in a uniform, more abstract way,
   where a transport system could e.g. decide by itself how to use
   combinations of LEDBAT-like congestion control and certain DSCP
   values, and an application would only specify a general "capacity
   profile" (a description of how it wants to use the available
   capacity). A need for "lowest possible latency at the expense of
   overhead" could then translate into automatically disabling the Nagle
   algorithm.

   In some cases, the Nagle algorithm is best controlled directly by the
   application because it is not only related to a general profile but
   also to knowledge about the size of future messages. For fine-grain
   control over Nagle-like functionality, the "Request not to bundle
   messages" is available.

A.3.6. Security

   Both TCP and SCTP offer authentication. TCP authenticates complete
   segments. SCTP allows configuring which of SCTP's chunk types must
   always be authenticated -- if this is exposed as such, it creates an
   undesirable dependency on the transport protocol.
   For compatibility
   with TCP, a transport system should only allow authentication to be
   configured for complete transport layer packets, including headers,
   IP pseudo-header (if any) and payload.

   Security is discussed in a separate TAPS document
   [I-D.pauly-taps-transport-security]. The minimal set presented in
   the present document therefore excludes all security related
   transport features: "Configure authentication", "Change
   authentication parameters", "Obtain authentication information" and
   "Set Cookie life value" as well as "Specifying a key id to be
   used to authenticate a message".

A.3.7. Packet Size

   UDP(-Lite) has a transport feature called "Specify DF field". This
   yields an error message in case of sending a message that exceeds the
   Path MTU, which is necessary for a UDP-based application to be able
   to implement Path MTU Discovery (a function that UDP-based
   applications must do by themselves). The "Get max. transport-message
   size that may be sent using a non-fragmented IP packet from the
   configured interface" transport feature yields an upper limit for the
   Path MTU (minus headers) and can therefore help to implement Path MTU
   Discovery more efficiently.

Appendix B. Revision information

   XXX RFC-Ed please remove this section prior to publication.

   -02: implementation suggestions added, discussion section added,
   terminology extended, DELETED category removed, various other fixes;
   list of Transport Features adjusted to -01 version of [RFC8303]
   except that MPTCP is not included.

   -03: updated to be consistent with -02 version of [RFC8303].

   -04: updated to be consistent with -03 version of [RFC8303].
   Reorganized document, rewrote intro and conclusion, and made a first
   stab at creating a real "minimal set".

   -05: updated to be consistent with -05 version of [RFC8303] (minor
   changes). Fixed a mistake regarding Cookie Life value.
   Exclusion of
   security related transport features (to be covered in a separate
   document). Reorganized the document (now begins with the minset,
   derivation is in the appendix). First stab at an abstract API for
   the minset.

   draft-ietf-taps-minset-00: updated to be consistent with -08 version
   of [RFC8303] ("obtain message delivery number" was removed, as this
   has also been removed in [RFC8303] because it was a mistake in
   RFC4960. This led to the removal of two more transport features that
   were only designated as functional because they affected "obtain
   message delivery number"). Fall-back to UDP incorporated (this was
   requested at IETF-99); this also affected the transport feature
   "Choice between unordered (potentially faster) or ordered delivery of
   messages" because this is a boolean which is always true for one
   fall-back protocol, and always false for the other one. This was
   therefore now divided into two features, one for ordered, one for
   unordered delivery. The word "reliably" was added to the transport
   features "Hand over a message to reliably transfer (possibly multiple
   times) before connection establishment" and "Hand over a message to
   reliably transfer during connection establishment" to make it clearer
   why this is not supported by UDP. Clarified that the "minset
   abstract interface" is not proposing a specific API for all TAPS
   systems to implement, but it is just a way to describe the minimum
   set. Author order changed.

   WG -01: "fall-back to" (TCP or UDP) replaced (mostly with
   "implementation over"). References to post-sockets removed (these
   were statements that assumed that post-sockets requires two-sided
   implementation). Replaced "flow" with "TAPS Connection" and "frame"
   with "message" to avoid introducing new terminology.
   Made sections 3
   and 4 in line with the categorization that is already used in the
   appendix and [RFC8303], and changed style of section 4 to be even
   shorter and less interface-like. Updated reference
   draft-ietf-tsvwg-sctp-ndata to RFC8260.

   WG -02: rephrased "the TAPS system" and "TAPS connection" etc. to
   more generally talk about transport after the intro (mostly replacing
   "TAPS system" with "transport system" and "TAPS connection" with
   "connection"). Merged sections 3 and 4 to form a new section 3.

Authors' Addresses

   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo N-0316
   Norway

   Phone: +47 22 85 24 20
   Email: michawe@ifi.uio.no

   Stein Gjessing
   University of Oslo
   PO Box 1080 Blindern
   Oslo N-0316
   Norway

   Phone: +47 22 85 24 44
   Email: steing@ifi.uio.no