TAPS                                                            M. Welzl
Internet-Draft                                               S. Gjessing
Intended status: Informational                        University of Oslo
Expires: December 7, 2018                                   June 5, 2018

          A Minimal Set of Transport Services for End Systems
                       draft-ietf-taps-minset-04

Abstract

   This draft recommends a minimal set of Transport Services offered by
   end systems, and gives guidance on choosing among the available
   mechanisms and protocols. It is based on the set of transport
   features in RFC 8303.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 7, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. The Minimal Set of Transport Features
     3.1. ESTABLISHMENT, AVAILABILITY and TERMINATION
     3.2. MAINTENANCE
       3.2.1. Connection groups
       3.2.2. Individual connections
     3.3. DATA Transfer
       3.3.1. Sending Data
       3.3.2. Receiving Data
   4. Conclusion
   5. Acknowledgements
   6. IANA Considerations
   7. Security Considerations
   8. References
     8.1. Normative References
     8.2. Informative References
   Appendix A. Deriving the minimal set
     A.1. Step 1: Categorization -- The Superset of Transport Features
       A.1.1. CONNECTION Related Transport Features
       A.1.2. DATA Transfer Related Transport Features
     A.2. Step 2: Reduction -- The Reduced Set of Transport Features
       A.2.1. CONNECTION Related Transport Features
       A.2.2. DATA Transfer Related Transport Features
     A.3. Step 3: Discussion
       A.3.1. Sending Messages, Receiving Bytes
       A.3.2. Stream Schedulers Without Streams
       A.3.3. Early Data Transmission
       A.3.4. Sender Running Dry
       A.3.5. Capacity Profile
       A.3.6. Security
       A.3.7. Packet Size
   Appendix B. Revision information
   Authors' Addresses

1. Introduction

   The task of a transport system is to offer transport services to its
   applications, i.e.
   the applications running on top of the transport system. Ideally, it
   does so without statically binding applications to particular
   transport protocols. Currently, the set of transport services that
   most applications use is based on TCP and UDP (and protocols that are
   layered on top of them); this limits the ability of the network stack
   to make use of features of other transport protocols. For example, if
   a protocol supports out-of-order message delivery but applications
   always assume that the network provides an ordered bytestream, then
   the network stack cannot immediately deliver a message that arrives
   out-of-order: doing so would break a fundamental assumption of the
   application. The net result is unnecessary head-of-line blocking
   delay.

   By exposing the transport services of multiple transport protocols, a
   transport system can make it possible to use these services without
   having to statically bind an application to a specific transport
   protocol. The first step towards the design of such a system was
   taken by [RFC8095], which surveys a large number of transports, and
   [RFC8303] as well as [RFC8304], which identify the specific transport
   features that are exposed to applications by the protocols TCP,
   MPTCP, UDP(-Lite) and SCTP as well as the LEDBAT congestion control
   mechanism. This memo is based on these documents and follows the
   same terminology (also listed below). Because the considered
   transport protocols together cover a wide range of transport
   features, there is reason to hope that the resulting set (and the
   reasoning that led to it) will also apply to many aspects of other
   transport protocols.
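
   The head-of-line blocking delay mentioned at the start of this
   section can be illustrated with a small sketch. It is not part of
   any protocol specification: the function names and the convention
   that messages carry sequence numbers starting at 1 are assumptions
   made for illustration only. The sketch compares the step at which
   each message could be handed to the application under ordered and
   unordered delivery.

```python
from typing import List, Tuple


def ordered_delivery(arrivals: List[int]) -> List[Tuple[int, int]]:
    """Release messages strictly in sequence order (1, 2, 3, ...).

    `arrivals` lists sequence numbers in the order they reach the
    receiver (hypothetical model, one message per step).
    Returns (step, seq) pairs: the arrival step at which each message
    could be handed to the application.
    """
    waiting = set()      # messages that arrived but cannot be released yet
    next_seq = 1         # next sequence number the application expects
    released = []
    for step, seq in enumerate(arrivals, start=1):
        waiting.add(seq)
        # Release every message that is now in order.
        while next_seq in waiting:
            waiting.remove(next_seq)
            released.append((step, next_seq))
            next_seq += 1
    return released


def unordered_delivery(arrivals: List[int]) -> List[Tuple[int, int]]:
    """Hand every message to the application as soon as it arrives."""
    return [(step, seq) for step, seq in enumerate(arrivals, start=1)]


if __name__ == "__main__":
    arrivals = [2, 3, 1, 4]  # message 1 is delayed in the network
    print(ordered_delivery(arrivals))    # [(3, 1), (3, 2), (3, 3), (4, 4)]
    print(unordered_delivery(arrivals))  # [(1, 2), (2, 3), (3, 1), (4, 4)]
```

   In this example, messages 2 and 3 wait until step 3 under ordered
   delivery although they arrived at steps 1 and 2; an application that
   can accept unordered messages avoids that wait.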

   The number of transport features of current IETF transports is large,
   and exposing all of them has a number of disadvantages: generally,
   the more functionality is exposed, the less freedom a transport
   system has to automate usage of the various functions of its
   available set of transport protocols. Some functions only exist in
   one particular protocol, and if an application used them, this would
   statically tie the application to this protocol, limiting the
   flexibility of the transport system. Also, if the number of exposed
   features is exceedingly large, a transport system might become very
   difficult to use for an application programmer. Taking [RFC8303] as
   a basis, this document therefore develops a minimal set of transport
   features, removing the ones that could get in the way of transport
   flexibility but keeping the ones that must be retained for
   applications to benefit from useful transport functionality.

   Applications use a wide variety of APIs today. The transport
   features in the minimal set in this document must be reflected in
   *all* network APIs in order for the underlying functionality to
   become usable everywhere. For example, it does not help an
   application that talks to a library which offers its own
   communication interface if only the underlying Berkeley Sockets API
   is extended to offer "unordered message delivery", but the library
   only exposes an ordered bytestream. Both the Berkeley Sockets API
   and the library would have to expose the "unordered message delivery"
   transport feature (alternatively, there may be ways for certain types
   of libraries to use this transport feature without exposing it, based
   on knowledge about the applications -- but this is not the general
   case).
   In most situations, in the interest of being as flexible and
   efficient as possible, the best choice will be for a library to
   expose at least all of the transport features that are recommended as
   a "minimal set" here.

   This "minimal set" can be implemented one-sided over TCP (or UDP, if
   certain limitations are put in place). This means that a sender-side
   transport system can talk to a standard TCP (or UDP) receiver, and a
   receiver-side transport system can talk to a standard TCP (or UDP)
   sender.

2. Terminology

   Transport Feature: a specific end-to-end feature that the transport
      layer provides to an application. Examples include
      confidentiality, reliable delivery, ordered delivery, message-
      versus-stream orientation, etc.
   Transport Service: a set of Transport Features, without an
      association to any given framing protocol, which provides a
      complete service to an application.
   Transport Protocol: an implementation that provides one or more
      different transport services using a specific framing and header
      format on the wire.
   Transport Service Instance: an arrangement of transport protocols
      with a selected set of features and configuration parameters that
      implements a single transport service, e.g., a protocol stack (RTP
      over UDP).
   Application: an entity that uses the transport layer for end-to-end
      delivery of data across the network (this may also be an upper
      layer protocol or tunnel encapsulation).
   Application-specific knowledge: knowledge that only applications
      have.
   Endpoint: an entity that communicates with one or more other
      endpoints using a transport protocol.
   Connection: shared state of two or more endpoints that persists
      across messages that are transmitted between these endpoints.
   Socket: the combination of a destination IP address and a
      destination port number.

   Moreover, throughout the document, the protocol name "UDP(-Lite)" is
   used when discussing transport features that are equivalent for UDP
   and UDP-Lite; similarly, the protocol name "TCP" refers to both TCP
   and MPTCP.

3. The Minimal Set of Transport Features

   Based on the categorization, reduction, and discussion in Appendix A,
   this section describes a minimal set of transport features that end
   systems should offer. The described transport system can be
   implemented over TCP. Elements of the system that are not marked
   with "!UDP" can also be implemented over UDP.

   As in Appendix A, Appendix A.2 and [RFC8303], we categorize the
   minimal set of transport features as 1) CONNECTION related
   (ESTABLISHMENT, AVAILABILITY, MAINTENANCE, TERMINATION) and 2) DATA
   Transfer related (Sending Data, Receiving Data, Errors). Here, the
   focus is on connections that the transport system offers as an
   abstraction to the application, as opposed to connections of
   transport protocols that the transport system uses.

3.1. ESTABLISHMENT, AVAILABILITY and TERMINATION

   A connection must first be "created" to allow for some initial
   configuration to be carried out before the transport system can
   actively or passively establish communication with a remote endpoint.
   All configuration parameters in Section 3.2 can be used initially,
   although some of them may only take effect when a connection has been
   established with a chosen transport protocol. Configuring a
   connection early helps a transport system make the right decisions.
   For example, grouping information can influence the transport system
   to implement a connection as a stream of a multi-streaming protocol's
   existing association or not.

   For ungrouped connections, early configuration is necessary because
   it allows the transport system to know which protocols it should try
   to use.
   In particular, a transport system that only makes a one-time
   choice for a particular protocol must know early about strict
   requirements that must be kept, or it can end up in a deadlock
   situation (e.g., having chosen UDP and later being asked to support
   reliable transfer). As an example description of how to correctly
   handle these cases, we provide the following decision tree (this is
   derived from Appendix A.2.1 excluding authentication, as explained in
   Section 7):

   - Will it ever be necessary to offer any of the following?
     * Reliably transfer data
     * Notify the peer of closing/aborting
     * Preserve data ordering

     Yes: SCTP or TCP can be used.
     - Is any of the following useful to the application?
       * Choosing a scheduler to operate between connections
         in a group, with the possibility to configure a priority
         or weight per connection
       * Configurable message reliability
       * Unordered message delivery
       * Request not to delay the acknowledgement (SACK) of a message

       Yes: SCTP is preferred.
       No:
       - Is any of the following useful to the application?
         * Hand over a message to reliably transfer (possibly
           multiple times) before connection establishment
         * Suggest timeout to the peer
         * Notification of Excessive Retransmissions (early
           warning below abortion threshold)
         * Notification of ICMP error message arrival

         Yes: TCP is preferred.
         No: SCTP and TCP are equally preferable.

     No: all protocols can be used.
     - Is any of the following useful to the application?
       * Specify checksum coverage used by the sender
       * Specify minimum checksum coverage required by receiver

       Yes: UDP-Lite is preferred.
       No: UDP is preferred.

   Note that this decision tree is not optimal for all cases.
   For example, if an application wants to use "Specify checksum
   coverage used by the sender", which is only offered by UDP-Lite, and
   "Configure priority or weight for a scheduler", which is only offered
   by SCTP, the above decision tree will always choose UDP-Lite, making
   it impossible to use SCTP's schedulers with priorities between
   grouped connections. The transport system must know which choice is
   more important for the application in order to make the best
   decision. We caution implementers to be aware of the full set of
   trade-offs, for which we recommend consulting the list in
   Appendix A.2.1 when deciding how to initialize a connection.

   To summarize, the following parameters serve as input for the
   transport system to help it choose and configure a suitable protocol:

   o  Reliability: a boolean that should be set to true when any of the
      following will be useful to the application: reliably transfer
      data; notify the peer of closing/aborting; preserve data ordering.
   o  Checksum coverage: a boolean to specify whether it will be useful
      to the application to specify checksum coverage when sending or
      receiving.
   o  Configure message priority: a boolean that should be set to true
      when any of the following per-message configuration or
      prioritization mechanisms will be useful to the application:
      choosing a scheduler to operate between grouped connections, with
      the possibility to configure a priority or weight per connection;
      configurable message reliability; unordered message delivery;
      requesting not to delay the acknowledgement (SACK) of a message.
   o  Early message timeout notifications: a boolean that should be set
      to true when any of the following will be useful to the
      application: hand over a message to reliably transfer (possibly
      multiple times) before connection establishment; suggest timeout
      to the peer; notification of excessive retransmissions (early
      warning below abortion threshold); notification of ICMP error
      message arrival.

   Once a connection is created, it can be queried for the maximum
   amount of data that an application can possibly expect to have
   reliably transmitted before or during transport connection
   establishment (with zero being a possible answer) (see
   Section 3.2.1). An application can also give the connection a
   message for reliable transmission before or during connection
   establishment (!UDP); the transport system will then try to transmit
   it as early as possible. An application can facilitate sending a
   message particularly early by marking it as "idempotent" (see
   Section 3.3.1); in this case, the receiving application must be
   prepared to potentially receive multiple copies of the message
   (because idempotent messages are reliably transferred, asking for
   idempotence is not necessary for systems that support UDP).

   After creation, a transport system can actively establish
   communication with a peer, or it can passively listen for incoming
   connection requests. Note that active establishment may or may not
   trigger a notification on the listening side. It is possible that
   the first notification on the listening side is the arrival of the
   first data that the active side sends (a receiver-side transport
   system could handle this by continuing to block a "Listen" call,
   immediately followed by issuing "Receive", for example; callback-
   based implementations could simply skip the equivalent of "Listen").
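
   Returning to protocol selection, the decision tree and the four
   boolean parameters summarized earlier in this section could be
   sketched as follows. This is an illustrative sketch only: the names
   "SelectionHints", "Choice" and "choose_protocol" are hypothetical and
   not defined by this document, and a real transport system would also
   have to weigh the trade-offs listed in Appendix A.2.1.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SelectionHints:
    """The four boolean inputs from Section 3.1 (field names are ours)."""
    reliability: bool = False
    checksum_coverage: bool = False
    configure_message_priority: bool = False
    early_message_timeout_notifications: bool = False


@dataclass(frozen=True)
class Choice:
    preferred: Tuple[str, ...]  # protocol(s) the tree prefers; ties allowed
    usable: Tuple[str, ...]     # protocols the tree says can be used


def choose_protocol(h: SelectionHints) -> Choice:
    """Walk the decision tree from Section 3.1 top to bottom."""
    if h.reliability:
        # "Yes: SCTP or TCP can be used."
        usable = ("SCTP", "TCP")
        if h.configure_message_priority:
            return Choice(("SCTP",), usable)
        if h.early_message_timeout_notifications:
            return Choice(("TCP",), usable)
        return Choice(("SCTP", "TCP"), usable)  # equally preferable
    # "No: all protocols can be used."
    usable = ("SCTP", "TCP", "UDP", "UDP-Lite")
    if h.checksum_coverage:
        return Choice(("UDP-Lite",), usable)
    return Choice(("UDP",), usable)


if __name__ == "__main__":
    # The non-optimal case noted in the text: checksum coverage wins over
    # scheduler priorities when reliability is not requested.
    hints = SelectionHints(checksum_coverage=True,
                           configure_message_priority=True)
    print(choose_protocol(hints).preferred)  # ('UDP-Lite',)
```

   The last example reproduces the limitation noted above: the tree, as
   written, cannot express which of two conflicting wishes matters more
   to the application.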

   This also means that the active opening side is assumed to be the
   first side sending data.

   A transport system can actively close a connection, i.e. terminate it
   after reliably delivering all remaining data to the peer (if reliable
   data delivery was requested earlier (!UDP)), in which case the peer
   is notified that the connection is closed. Alternatively, a
   connection can be aborted without delivering outstanding data to the
   peer. In case reliable or partially reliable data delivery was
   requested earlier (!UDP), the peer is notified that the connection is
   aborted. A timeout can be configured to abort a connection when data
   could not be delivered for too long (!UDP); however, timeout-based
   abortion does not notify the peer application that the connection has
   been aborted. Because half-closed connections are not supported,
   when a host implementing a transport system receives a notification
   that the peer is closing or aborting the connection (!UDP), its peer
   may not be able to read outstanding data. This means that
   unacknowledged data residing in a transport system's send buffer may
   have to be dropped from that buffer upon arrival of a "close" or
   "abort" notification from the peer.

3.2. MAINTENANCE

   A transport system must offer means to group connections, but it
   cannot guarantee truly grouping them using the transport protocols
   that it uses (e.g., it cannot be guaranteed that connections become
   multiplexed as streams on a single SCTP association when SCTP may not
   be available). The transport system must therefore ensure that
   group- versus non-group-configurations are handled correctly in some
   way (e.g., by applying the configuration to all grouped connections
   even when they are not multiplexed, or informing the application
   about grouping success or failure).

   As a general rule, any configuration described below should be
   carried out as early as possible to aid the transport system's
   decision making.

3.2.1. Connection groups

   The following transport features and notifications (some directly
   from Appendix A.2, some new or changed, based on the discussion in
   Appendix A.3) automatically apply to all grouped connections:

   (!UDP) Configure a timeout: this can be done with the following
   parameters:

   o  A timeout value for aborting connections, in seconds
   o  A timeout value to be suggested to the peer (if possible), in
      seconds
   o  The number of retransmissions after which the application should
      be notified of "Excessive Retransmissions"

   Configure urgency: this can be done with the following parameters:

   o  A number to identify the type of scheduler that should be used to
      operate between connections in the group (no guarantees given).
      Schedulers are defined in [RFC8260].
   o  A "capacity profile" number to identify how an application wants
      to use its available capacity. Choices can be "lowest possible
      latency at the expense of overhead" (which would disable any
      Nagle-like algorithm), "scavenger", or values that help determine
      the DSCP value for a connection (e.g. similar to table 1 in
      [I-D.ietf-tsvwg-rtcweb-qos]).
   o  A buffer limit (in bytes); when the sender has less than the
      provided limit of bytes in the buffer, the application may be
      notified. Notifications are not guaranteed, and it is optional
      for a transport system to support buffer limit values greater than
      0. Note that this limit and its notification should operate
      across the buffers of the whole transport system, i.e. also any
      potential buffers that the transport system itself may use on top
      of the transport's send buffer.
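
   The timeout and urgency parameters above could be collected in a
   single per-group configuration record, as in the following sketch.
   All names ("GroupConfig", "needs_non_udp", "drain_due") are
   hypothetical and chosen for illustration; this document does not
   define a concrete API. The drain check follows the buffer limit
   description above together with the "Drain" notification listed
   later in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroupConfig:
    # (!UDP) timeout parameters; None means "not configured".
    abort_timeout_s: Optional[float] = None
    suggested_peer_timeout_s: Optional[float] = None
    excessive_retransmissions: Optional[int] = None
    # Urgency parameters.
    scheduler_id: Optional[int] = None      # scheduler type, see [RFC8260]
    capacity_profile: Optional[int] = None  # e.g. low latency, scavenger
    buffer_limit_bytes: int = 0             # support for values > 0 is optional

    def needs_non_udp(self) -> bool:
        """True if any parameter marked (!UDP) has been configured."""
        return any(v is not None for v in (
            self.abort_timeout_s,
            self.suggested_peer_timeout_s,
            self.excessive_retransmissions,
        ))


def drain_due(buffered_bytes: int, cfg: GroupConfig) -> bool:
    """Decide whether a Drain notification may be raised.

    `buffered_bytes` is assumed to count data across all send buffers of
    the transport system, not only the transport protocol's own buffer.
    """
    return buffered_bytes == 0 or buffered_bytes < cfg.buffer_limit_bytes


if __name__ == "__main__":
    cfg = GroupConfig(abort_timeout_s=30.0, buffer_limit_bytes=4096)
    print(cfg.needs_non_udp())   # True
    print(drain_due(1024, cfg))  # True
    print(drain_due(8192, cfg))  # False
```

   Because notifications are not guaranteed, an application should
   treat a positive result as a hint rather than as a promise that a
   notification will be delivered.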

   Following Appendix A.3.7, these properties can be queried:

   o  The maximum message size that may be sent without fragmentation
      via the configured interface. This is optional for a transport
      system to offer, and may return an error ("not available"). It
      can aid applications implementing Path MTU Discovery.
   o  The maximum transport message size that can be sent, in bytes.
      Irrespective of fragmentation, there is a size limit for the
      messages that can be handed over to SCTP or UDP(-Lite); because
      the service provided by a transport system is independent of the
      transport protocol, it must allow an application to query this
      value -- the maximum size of a message in an Application-Framed-
      Bytestream (see Appendix A.3.1). This may also return an error
      when data is not delimited ("not available").
   o  The maximum transport message size that can be received from the
      configured interface, in bytes (or "not available").
   o  The maximum amount of data that can possibly be sent before or
      during connection establishment, in bytes.

   In addition to the already mentioned closing / aborting notifications
   and possible send errors, the following notifications can occur:

   o  Excessive Retransmissions: the configured (or a default) number of
      retransmissions has been reached, yielding this early warning
      below an abortion threshold.
   o  ICMP Arrival (parameter: ICMP message): an ICMP packet carrying
      the conveyed ICMP message has arrived.
   o  ECN Arrival (parameter: ECN value): a packet carrying the conveyed
      ECN value has arrived. This can be useful for applications
      implementing congestion control.
   o  Timeout (parameter: s seconds): data could not be delivered for s
      seconds.
   o  Drain: the send buffer has either drained below the configured
      buffer limit or it has become completely empty.
      This is a generic
      notification that tries to enable uniform access to
      "TCP_NOTSENT_LOWAT" as well as the "SENDER DRY" notification (as
      discussed in Appendix A.3.4 -- SCTP's "SENDER DRY" is a special
      case where the threshold (for unsent data) is 0 and there is also
      no more unacknowledged data in the send buffer).

3.2.2. Individual connections

   Configure priority or weight for a scheduler, as described in
   [RFC8260].

   Configure checksum usage: this can be done with the following
   parameters, but there is no guarantee that any checksum limitations
   will indeed be enforced (the default behavior is "full coverage,
   checksum enabled"):

   o  A boolean to enable / disable usage of a checksum when sending
   o  The desired coverage (in bytes) of the checksum used when sending
   o  A boolean to enable / disable requiring a checksum when receiving
   o  The required minimum coverage (in bytes) of the checksum when
      receiving

3.3. DATA Transfer

3.3.1. Sending Data

   When sending a message, no guarantees are given about the
   preservation of message boundaries to the peer; if message boundaries
   are needed, the receiving application at the peer must know about
   them beforehand (or the transport system cannot use TCP). Note that
   an application should already be able to hand over data before the
   transport system establishes a connection with a chosen transport
   protocol. Regarding the message that is being handed over, the
   following parameters can be used:

   o  Reliability: this parameter is used to convey a choice of: fully
      reliable (!UDP), unreliable without congestion control, unreliable
      (!UDP), partially reliable (see [RFC3758] and [RFC7496] for
      details on how to specify partial reliability) (!UDP). The latter
      two choices are optional for a transport system to offer and may
      result in full reliability.
      Note that applications sending
      unreliable data without congestion control should themselves
      perform congestion control in accordance with [RFC2914].
   o  (!UDP) Ordered: this boolean parameter lets an application choose
      between ordered message delivery (true) and possibly unordered,
      potentially faster message delivery (false).
   o  Bundle: a boolean that expresses a preference for allowing
      messages to be bundled (true) or not (false). No guarantees are
      given.
   o  DelAck: a boolean that, if false, lets an application request that
      the peer not delay the acknowledgement for this message.
   o  Fragment: a boolean that expresses a preference for allowing
      messages to be fragmented (true) or not (false), at the IP level.
      No guarantees are given.
   o  (!UDP) Idempotent: a boolean that expresses whether a message is
      idempotent (true) or not (false). Idempotent messages may arrive
      multiple times at the receiver (but they will arrive at least
      once). When data is idempotent it can be used by the receiver
      immediately on a connection establishment attempt. Thus, if data
      is handed over before the transport system establishes a
      connection with a chosen transport protocol, stating that a
      message is idempotent facilitates transmitting it to the peer
      application particularly early.

   An application can be notified of a failure to send a specific
   message. There is no guarantee of such notifications, i.e. send
   failures can also silently occur.

3.3.2. Receiving Data

   A receiving application obtains an "Application-Framed Bytestream"
   (AFra-Bytestream; this concept is further described in
   Appendix A.3.1). In line with TCP's receiver semantics, an AFra-
   Bytestream is just a stream of bytes to the receiver.
   If message
   boundaries were specified by the sender, a receiver-side transport
   system implementing only the minimum set of transport services
   defined here will still not inform the receiving application about
   them (this limitation is only needed for transport systems that are
   implemented to directly use TCP).

   Different from TCP's semantics, if the sending application has
   allowed that messages are not fully reliably transferred, or
   delivered out of order, then such re-ordering or unreliability may be
   reflected per message in the arriving data. Messages will always
   stay intact - i.e. if an incomplete message is contained at the end
   of the arriving data block, this message is guaranteed to continue in
   the next arriving data block.

4. Conclusion

   By decoupling applications from transport protocols, a transport
   system provides a different abstraction level than the Berkeley
   sockets interface. As with high- vs. low-level programming
   languages, a higher abstraction level allows more freedom for
   automation below the interface, yet it takes some control away from
   the application programmer. This is the design trade-off that a
   transport system developer is facing, and this document provides
   guidance on the design of this abstraction level. Some transport
   features are currently rarely offered by APIs, yet they must be
   offered or they can never be used ("functional" transport features).
   Other transport features are offered by the APIs of the protocols
   covered here, but not exposing them in an API would allow for more
   freedom to automate protocol usage in a transport system. The
   minimal set presented in this document is an effort to find a middle
   ground that can be recommended for transport systems to implement, on
   the basis of the transport features discussed in [RFC8303].

5. Acknowledgements

   The authors would like to thank all the participants of the TAPS
   Working Group and the NEAT and MAMI research projects for valuable
   input to this document. We especially thank Michael Tuexen for help
   with connection establishment/teardown and Gorry Fairhurst
   for his suggestions regarding fragmentation and packet sizes. This
   work has received funding from the European Union's Horizon 2020
   research and innovation programme under grant agreement No. 644334
   (NEAT).

6. IANA Considerations

   XX RFC ED - PLEASE REMOVE THIS SECTION XXX

   This memo includes no request to IANA.

7. Security Considerations

   Authentication, confidentiality protection, and integrity protection
   are identified as transport features by [RFC8095]. As currently
   deployed in the Internet, these features are generally provided by a
   protocol or layer on top of the transport protocol; no current full-
   featured standards-track transport protocol provides all of these
   transport features on its own. Therefore, these transport features
   are not considered in this document, with the exception of native
   authentication capabilities of TCP and SCTP for which the security
   considerations in [RFC5925] and [RFC4895] apply. The minimum
   security requirements for a transport system are discussed in a
   separate document [I-D.ietf-taps-transport-security].

8. References

8.1. Normative References

   [RFC8303]  Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of
              Transport Features Provided by IETF Transport Protocols",
              RFC 8303, DOI 10.17487/RFC8303, February 2018.

8.2. Informative References

   [COBS]     Cheshire, S. and M. Baker, "Consistent Overhead Byte
              Stuffing", September 1997.

   [I-D.ietf-taps-transport-security]
              Pauly, T., Perkins, C., Rose, K., and C.
Wood, "A Survey 585 of Transport Security Protocols", draft-ietf-taps- 586 transport-security-01 (work in progress), May 2018. 588 [I-D.ietf-tsvwg-rtcweb-qos] 589 Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP 590 Packet Markings for WebRTC QoS", draft-ietf-tsvwg-rtcweb- 591 qos-18 (work in progress), August 2016. 593 [LBE-draft] 594 Bless, R., "A Lower Effort Per-Hop Behavior (LE PHB)", 595 Internet-draft draft-tsvwg-le-phb-03, February 2018. 597 [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, 598 RFC 2914, DOI 10.17487/RFC2914, September 2000, 599 . 601 [RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P. 602 Conrad, "Stream Control Transmission Protocol (SCTP) 603 Partial Reliability Extension", RFC 3758, 604 DOI 10.17487/RFC3758, May 2004, 605 . 607 [RFC4895] Tuexen, M., Stewart, R., Lei, P., and E. Rescorla, 608 "Authenticated Chunks for the Stream Control Transmission 609 Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895, August 610 2007, . 612 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 613 Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, 614 . 616 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 617 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 618 June 2010, . 620 [RFC7305] Lear, E., Ed., "Report from the IAB Workshop on Internet 621 Technology Adoption and Transition (ITAT)", RFC 7305, 622 DOI 10.17487/RFC7305, July 2014, 623 . 625 [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP 626 Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, 627 . 629 [RFC7496] Tuexen, M., Seggelmann, R., Stewart, R., and S. Loreto, 630 "Additional Policies for the Partially Reliable Stream 631 Control Transmission Protocol Extension", RFC 7496, 632 DOI 10.17487/RFC7496, April 2015, 633 . 635 [RFC8095] Fairhurst, G., Ed., Trammell, B., Ed., and M. 
Kuehlewind, 636 Ed., "Services Provided by IETF Transport Protocols and 637 Congestion Control Mechanisms", RFC 8095, 638 DOI 10.17487/RFC8095, March 2017, 639 . 641 [RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann, 642 "Stream Schedulers and User Message Interleaving for the 643 Stream Control Transmission Protocol", RFC 8260, 644 DOI 10.17487/RFC8260, November 2017, 645 . 647 [RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the 648 User Datagram Protocol (UDP) and Lightweight UDP (UDP- 649 Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018, 650 . 652 [WWDC2015] 653 Lakhera, P. and S. Cheshire, "Your App and Next Generation 654 Networks", Apple Worldwide Developers Conference 2015, San 655 Francisco, USA, June 2015, 656 . 658 Appendix A. Deriving the minimal set 660 We approach the construction of a minimal set of transport features 661 in the following way: 663 1. Categorization: the superset of transport features from [RFC8303] 664 is presented, and transport features are categorized for later 665 reduction. 666 2. Reduction: a shorter list of transport features is derived from 667 the categorization in the first step. This removes all transport 668 features that do not require application-specific knowledge or 669 cannot be implemented with TCP or UDP. 670 3. Discussion: the resulting list shows a number of peculiarities 671 that are discussed, to provide a basis for constructing the 672 minimal set. 673 4. Construction: Based on the reduced set and the discussion of the 674 transport features therein, a minimal set is constructed. 676 The first three steps as well as the underlying rationale for 677 constructing the minimal set are described in this appendix. The 678 minimal set itself is described in Section 3. 680 A.1. Step 1: Categorization -- The Superset of Transport Features 682 Following [RFC8303], we divide the transport features into two main 683 groups as follows: 685 1. 
CONNECTION related transport features 686 - ESTABLISHMENT 687 - AVAILABILITY 688 - MAINTENANCE 689 - TERMINATION 691 2. DATA Transfer related transport features 692 - Sending Data 693 - Receiving Data 694 - Errors 696 We assume that applications have no specific requirements that need 697 knowledge about the network, e.g. regarding the choice of network 698 interface or the end-to-end path. Even with these assumptions, there 699 are certain requirements that are strictly kept by transport 700 protocols today, and these must also be kept by a transport system. 701 Some of these requirements relate to transport features that we call 702 "Functional". 704 Functional transport features provide functionality that cannot be 705 used without the application knowing about them, or else they violate 706 assumptions that might cause the application to fail. For example, 707 ordered message delivery is a functional transport feature: it cannot 708 be configured without the application knowing about it because the 709 application's assumption could be that messages always arrive in 710 order. Failure includes any change of the application behavior that 711 is not performance oriented, e.g. security. 713 "Change DSCP" and "Disable Nagle algorithm" are examples of transport 714 features that we call "Optimizing": if a transport system 715 autonomously decides to enable or disable them, an application will 716 not fail, but a transport system may be able to communicate more 717 efficiently if the application is in control of this optimizing 718 transport feature. These transport features require application- 719 specific knowledge (e.g., about delay/bandwidth requirements or the 720 length of future data blocks that are to be transmitted). 722 The transport features of IETF transport protocols that do not 723 require application-specific knowledge and could therefore be 724 transparently utilized by a transport system are called 725 "Automatable". 
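As a non-normative illustration, the three categories above can be read as a simple decision rule. The sketch below is a minimal Python rendering under our own simplifying assumption that a transport feature can be summarized by two yes/no properties. The names Category, categorize, and the two boolean parameters are hypothetical and do not appear in [RFC8303]; the example features are the ones named in the preceding paragraphs.

```python
from enum import Enum


class Category(Enum):
    FUNCTIONAL = "Functional"
    OPTIMIZING = "Optimizing"
    AUTOMATABLE = "Automatable"


def categorize(needs_app_knowledge: bool, app_may_fail_if_unaware: bool) -> Category:
    """Illustrative decision rule (an assumption, not part of the draft).

    - If using the feature without the application knowing about it could
      make the application fail, the feature is Functional.
    - Otherwise, if a good setting requires application-specific knowledge,
      the feature is Optimizing.
    - Otherwise the transport system can handle it alone: Automatable.
    """
    if app_may_fail_if_unaware:
        return Category.FUNCTIONAL
    if needs_app_knowledge:
        return Category.OPTIMIZING
    return Category.AUTOMATABLE


# Examples named in the text above, with our reading of their properties.
EXAMPLES = {
    "Ordered message delivery": categorize(True, True),
    "Change DSCP": categorize(True, False),
    "Disable Nagle algorithm": categorize(True, False),
}

if __name__ == "__main__":
    for feature, category in EXAMPLES.items():
        print(f"{feature}: {category.value}")
```

The order of the checks matters in this sketch: a possible application failure takes precedence over performance considerations, which mirrors the statement that failure includes any change of application behavior that is not performance oriented.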
727 Finally, some transport features are aggregated and/or slightly 728 changed in the description below. These transport features are 729 marked as "ADDED". The corresponding transport features are 730 automatable, and they are listed immediately below the "ADDED" 731 transport feature. 733 In this description, transport services are presented following the 734 nomenclature "CATEGORY.[SUBCATEGORY].SERVICENAME.PROTOCOL", 735 equivalent to "pass 2" in [RFC8303]. We also sketch how some of the 736 transport features can be implemented by a transport system. For all 737 transport features that are categorized as "functional" or 738 "optimizing", and for which no matching TCP and/or UDP primitive 739 exists in "pass 2" of [RFC8303], a brief discussion on how to 740 implement them over TCP and/or UDP is included. 742 We designate some transport features as "automatable" on the basis of 743 a broader decision that affects multiple transport features: 745 o Most transport features that are related to multi-streaming were 746 designated as "automatable". This was done because the decision 747 on whether to use multi-streaming or not does not depend on 748 application-specific knowledge. This means that a connection that 749 is exhibited to an application could be implemented by using a 750 single stream of an SCTP association instead of mapping it to a 751 complete SCTP association or TCP connection. This could be 752 achieved by using more than one stream when an SCTP association is 753 first established (CONNECT.SCTP parameter "outbound stream 754 count"), maintaining an internal stream number, and using this 755 stream number when sending data (SEND.SCTP parameter "stream 756 number"). Closing or aborting a connection could then simply free 757 the stream number for future use. This is discussed further in 758 Appendix A.3.2. 
759 o All transport features that are related to using multiple paths or 760 the choice of the network interface were designated as 761 "automatable". Choosing a path or an interface does not depend on 762 application-specific knowledge. For example, "Listen" could 763 always listen on all available interfaces and "Connect" could use 764 the default interface for the destination IP address. 766 A.1.1. CONNECTION Related Transport Features 768 ESTABLISHMENT: 770 o Connect 771 Protocols: TCP, SCTP, UDP(-Lite) 772 Functional because the notion of a connection is often reflected 773 in applications as an expectation to be able to communicate after 774 a "Connect" succeeded, with a communication sequence relating to 775 this transport feature that is defined by the application 776 protocol. 777 Implementation: via CONNECT.TCP, CONNECT.SCTP or CONNECT.UDP(- 778 Lite). 780 o Specify which IP Options must always be used 781 Protocols: TCP, UDP(-Lite) 782 Automatable because IP Options relate to knowledge about the 783 network, not the application. 785 o Request multiple streams 786 Protocols: SCTP 787 Automatable because using multi-streaming does not require 788 application-specific knowledge. 789 Implementation: see Appendix A.3.2. 791 o Limit the number of inbound streams 792 Protocols: SCTP 793 Automatable because using multi-streaming does not require 794 application-specific knowledge. 795 Implementation: see Appendix A.3.2. 797 o Specify number of attempts and/or timeout for the first 798 establishment message 799 Protocols: TCP, SCTP 800 Functional because this is closely related to potentially assumed 801 reliable data delivery for data that is sent before or during 802 connection establishment. 803 Implementation: Using a parameter of CONNECT.TCP and CONNECT.SCTP. 804 Implementation over UDP: Do nothing (this is irrelevant in case of 805 UDP because there, reliable data delivery is not assumed). 
807 o Obtain multiple sockets 808 Protocols: SCTP 809 Automatable because the usage of multiple paths to communicate to 810 the same end host relates to knowledge about the network, not the 811 application. 813 o Disable MPTCP 814 Protocols: MPTCP 815 Automatable because the usage of multiple paths to communicate to 816 the same end host relates to knowledge about the network, not the 817 application. 818 Implementation: via a boolean parameter in CONNECT.MPTCP. 820 o Configure authentication 821 Protocols: TCP, SCTP 822 Functional because this has a direct influence on security. 823 Implementation: via parameters in CONNECT.TCP and CONNECT.SCTP. 824 Implementation over TCP: With TCP, this allows to configure Master 825 Key Tuples (MKTs) to authenticate complete segments (including the 826 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 827 allows to specify which chunk types must always be authenticated. 828 Authenticating only certain chunk types creates a reduced level of 829 security that is not supported by TCP; to be compatible, this 830 should therefore only allow to authenticate all chunk types. Key 831 material must be provided in a way that is compatible with both 832 [RFC4895] and [RFC5925]. 834 Implementation over UDP: Not possible. 836 o Indicate (and/or obtain upon completion) an Adaptation Layer via 837 an adaptation code point 838 Protocols: SCTP 839 Functional because it allows to send extra data for the sake of 840 identifying an adaptation layer, which by itself is application- 841 specific. 842 Implementation: via a parameter in CONNECT.SCTP. 843 Implementation over TCP: not possible. 844 Implementation over UDP: not possible. 846 o Request to negotiate interleaving of user messages 847 Protocols: SCTP 848 Automatable because it requires using multiple streams, but 849 requesting multiple streams in the CONNECTION.ESTABLISHMENT 850 category is automatable. 851 Implementation: via a parameter in CONNECT.SCTP. 
853 o Hand over a message to reliably transfer (possibly multiple times) 854 before connection establishment 855 Protocols: TCP 856 Functional because this is closely tied to properties of the data 857 that an application sends or expects to receive. 858 Implementation: via a parameter in CONNECT.TCP. 859 Implementation over UDP: not possible. 861 o Hand over a message to reliably transfer during connection 862 establishment 863 Protocols: SCTP 864 Functional because this can only work if the message is limited in 865 size, making it closely tied to properties of the data that an 866 application sends or expects to receive. 867 Implementation: via a parameter in CONNECT.SCTP. 868 Implementation over UDP: not possible. 870 o Enable UDP encapsulation with a specified remote UDP port number 871 Protocols: SCTP 872 Automatable because UDP encapsulation relates to knowledge about 873 the network, not the application. 875 AVAILABILITY: 877 o Listen 878 Protocols: TCP, SCTP, UDP(-Lite) 879 Functional because the notion of accepting connection requests is 880 often reflected in applications as an expectation to be able to 881 communicate after a "Listen" succeeded, with a communication 882 sequence relating to this transport feature that is defined by the 883 application protocol. 884 ADDED. This differs from the 3 automatable transport features 885 below in that it leaves the choice of interfaces for listening 886 open. 887 Implementation: by listening on all interfaces via LISTEN.TCP (not 888 providing a local IP address) or LISTEN.SCTP (providing SCTP port 889 number / address pairs for all local IP addresses). LISTEN.UDP(- 890 Lite) supports both methods. 892 o Listen, 1 specified local interface 893 Protocols: TCP, SCTP, UDP(-Lite) 894 Automatable because decisions about local interfaces relate to 895 knowledge about the network and the Operating System, not the 896 application. 
898 o Listen, N specified local interfaces 899 Protocols: SCTP 900 Automatable because decisions about local interfaces relate to 901 knowledge about the network and the Operating System, not the 902 application. 904 o Listen, all local interfaces 905 Protocols: TCP, SCTP, UDP(-Lite) 906 Automatable because decisions about local interfaces relate to 907 knowledge about the network and the Operating System, not the 908 application. 910 o Specify which IP Options must always be used 911 Protocols: TCP, UDP(-Lite) 912 Automatable because IP Options relate to knowledge about the 913 network, not the application. 915 o Disable MPTCP 916 Protocols: MPTCP 917 Automatable because the usage of multiple paths to communicate to 918 the same end host relates to knowledge about the network, not the 919 application. 921 o Configure authentication 922 Protocols: TCP, SCTP 923 Functional because this has a direct influence on security. 924 Implementation: via parameters in LISTEN.TCP and LISTEN.SCTP. 925 Implementation over TCP: With TCP, this allows to configure Master 926 Key Tuples (MKTs) to authenticate complete segments (including the 927 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 928 allows to specify which chunk types must always be authenticated. 929 Authenticating only certain chunk types creates a reduced level of 930 security that is not supported by TCP; to be compatible, this 931 should therefore only allow to authenticate all chunk types. Key 932 material must be provided in a way that is compatible with both 933 [RFC4895] and [RFC5925]. 934 Implementation over UDP: not possible. 936 o Obtain requested number of streams 937 Protocols: SCTP 938 Automatable because using multi-streaming does not require 939 application-specific knowledge. 940 Implementation: see Appendix A.3.2. 942 o Limit the number of inbound streams 943 Protocols: SCTP 944 Automatable because using multi-streaming does not require 945 application-specific knowledge. 
946 Implementation: see Appendix A.3.2. 948 o Indicate (and/or obtain upon completion) an Adaptation Layer via 949 an adaptation code point 950 Protocols: SCTP 951 Functional because it allows to send extra data for the sake of 952 identifying an adaptation layer, which by itself is application- 953 specific. 954 Implementation: via a parameter in LISTEN.SCTP. 955 Implementation over TCP: not possible. 956 Implementation over UDP: not possible. 958 o Request to negotiate interleaving of user messages 959 Protocols: SCTP 960 Automatable because it requires using multiple streams, but 961 requesting multiple streams in the CONNECTION.ESTABLISHMENT 962 category is automatable. 963 Implementation: via a parameter in LISTEN.SCTP. 965 MAINTENANCE: 967 o Change timeout for aborting connection (using retransmit limit or 968 time value) 969 Protocols: TCP, SCTP 970 Functional because this is closely related to potentially assumed 971 reliable data delivery. 972 Implementation: via CHANGE_TIMEOUT.TCP or CHANGE_TIMEOUT.SCTP. 973 Implementation over UDP: not possible (UDP is unreliable and there 974 is no connection timeout). 976 o Suggest timeout to the peer 977 Protocols: TCP 978 Functional because this is closely related to potentially assumed 979 reliable data delivery. 980 Implementation: via CHANGE_TIMEOUT.TCP. 981 Implementation over UDP: not possible (UDP is unreliable and there 982 is no connection timeout). 984 o Disable Nagle algorithm 985 Protocols: TCP, SCTP 986 Optimizing because this decision depends on knowledge about the 987 size of future data blocks and the delay between them. 988 Implementation: via DISABLE_NAGLE.TCP and DISABLE_NAGLE.SCTP. 989 Implementation over UDP: do nothing (UDP does not implement the 990 Nagle algorithm). 992 o Request an immediate heartbeat, returning success/failure 993 Protocols: SCTP 994 Automatable because this informs about network-specific knowledge. 
996 o Notification of Excessive Retransmissions (early warning below 997 abortion threshold) 998 Protocols: TCP 999 Optimizing because it is an early warning to the application, 1000 informing it of an impending functional event. 1001 Implementation: via ERROR.TCP. 1002 Implementation over UDP: do nothing (there is no abortion 1003 threshold). 1005 o Add path 1006 Protocols: MPTCP, SCTP 1007 MPTCP Parameters: source-IP; source-Port; destination-IP; 1008 destination-Port 1009 SCTP Parameters: local IP address 1010 Automatable because the usage of multiple paths to communicate to 1011 the same end host relates to knowledge about the network, not the 1012 application. 1014 o Remove path 1015 Protocols: MPTCP, SCTP 1016 MPTCP Parameters: source-IP; source-Port; destination-IP; 1017 destination-Port 1018 SCTP Parameters: local IP address 1019 Automatable because the usage of multiple paths to communicate to 1020 the same end host relates to knowledge about the network, not the 1021 application. 1023 o Set primary path 1024 Protocols: SCTP 1025 Automatable because the usage of multiple paths to communicate to 1026 the same end host relates to knowledge about the network, not the 1027 application. 1029 o Suggest primary path to the peer 1030 Protocols: SCTP 1031 Automatable because the usage of multiple paths to communicate to 1032 the same end host relates to knowledge about the network, not the 1033 application. 1035 o Configure Path Switchover 1036 Protocols: SCTP 1037 Automatable because the usage of multiple paths to communicate to 1038 the same end host relates to knowledge about the network, not the 1039 application. 
1041 o Obtain status (query or notification) 1042 Protocols: SCTP, MPTCP 1043 SCTP parameters: association connection state; destination 1044 transport address list; destination transport address reachability 1045 states; current local and peer receiver window size; current local 1046 congestion window sizes; number of unacknowledged DATA chunks; 1047 number of DATA chunks pending receipt; primary path; most recent 1048 SRTT on primary path; RTO on primary path; SRTT and RTO on other 1049 destination addresses; MTU per path; interleaving supported yes/no 1050 MPTCP parameters: subflow-list (identified by source-IP; source- 1051 Port; destination-IP; destination-Port) 1052 Automatable because these parameters relate to knowledge about the 1053 network, not the application. 1055 o Specify DSCP field 1056 Protocols: TCP, SCTP, UDP(-Lite) 1057 Optimizing because choosing a suitable DSCP value requires 1058 application-specific knowledge. 1059 Implementation: via SET_DSCP.TCP / SET_DSCP.SCTP / SET_DSCP.UDP(- 1060 Lite) 1062 o Notification of ICMP error message arrival 1063 Protocols: TCP, UDP(-Lite) 1064 Optimizing because these messages can inform about success or 1065 failure of functional transport features (e.g., host unreachable 1066 relates to "Connect") 1067 Implementation: via ERROR.TCP or ERROR.UDP(-Lite). 1069 o Obtain information about interleaving support 1070 Protocols: SCTP 1071 Automatable because it requires using multiple streams, but 1072 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1073 category is automatable. 1074 Implementation: via STATUS.SCTP. 1076 o Change authentication parameters 1077 Protocols: TCP, SCTP 1078 Functional because this has a direct influence on security. 1079 Implementation: via SET_AUTH.TCP and SET_AUTH.SCTP. 1080 Implementation over TCP: With SCTP, this allows to adjust key_id, 1081 key, and hmac_id. 
With TCP, this allows to change the preferred 1082 outgoing MKT (current_key) and the preferred incoming MKT 1083 (rnext_key), respectively, for a segment that is sent on the 1084 connection. Key material must be provided in a way that is 1085 compatible with both [RFC4895] and [RFC5925]. 1086 Implementation over UDP: not possible. 1088 o Obtain authentication information 1089 Protocols: SCTP 1090 Functional because authentication decisions may have been made by 1091 the peer, and this has an influence on the necessary application- 1092 level measures to provide a certain level of security. 1093 Implementation: via GET_AUTH.SCTP. 1094 Implementation over TCP: With SCTP, this allows to obtain key_id 1095 and a chunk list. With TCP, this allows to obtain current_key and 1096 rnext_key from a previously received segment. Key material must 1097 be provided in a way that is compatible with both [RFC4895] and 1098 [RFC5925]. 1099 Implementation over UDP: not possible. 1101 o Reset Stream 1102 Protocols: SCTP 1103 Automatable because using multi-streaming does not require 1104 application-specific knowledge. 1105 Implementation: see Appendix A.3.2. 1107 o Notification of Stream Reset 1108 Protocols: SCTP 1109 Automatable because using multi-streaming does not require 1110 application-specific knowledge. 1111 Implementation: see Appendix A.3.2. 1113 o Reset Association 1114 Protocols: SCTP 1115 Automatable because deciding to reset an association does not 1116 require application-specific knowledge. 1117 Implementation: via RESET_ASSOC.SCTP. 1119 o Notification of Association Reset 1120 Protocols: SCTP 1121 Automatable because this notification does not relate to 1122 application-specific knowledge. 1124 o Add Streams 1125 Protocols: SCTP 1126 Automatable because using multi-streaming does not require 1127 application-specific knowledge. 1128 Implementation: see Appendix A.3.2.
1130 o Notification of Added Stream 1131 Protocols: SCTP 1132 Automatable because using multi-streaming does not require 1133 application-specific knowledge. 1134 Implementation: see Appendix A.3.2. 1136 o Choose a scheduler to operate between streams of an association 1137 Protocols: SCTP 1138 Optimizing because the scheduling decision requires application- 1139 specific knowledge. However, if a transport system would not use 1140 this, or wrongly configure it on its own, this would only affect 1141 the performance of data transfers; the outcome would still be 1142 correct within the "best effort" service model. 1143 Implementation: using SET_STREAM_SCHEDULER.SCTP. 1144 Implementation over TCP: do nothing. 1145 Implementation over UDP: do nothing. 1147 o Configure priority or weight for a scheduler 1148 Protocols: SCTP 1149 Optimizing because the priority or weight requires application- 1150 specific knowledge. However, if a transport system would not use 1151 this, or wrongly configure it on its own, this would only affect 1152 the performance of data transfers; the outcome would still be 1153 correct within the "best effort" service model. 1154 Implementation: using CONFIGURE_STREAM_SCHEDULER.SCTP. 1155 Implementation over TCP: do nothing. 1156 Implementation over UDP: do nothing. 1158 o Configure send buffer size 1159 Protocols: SCTP 1160 Automatable because this decision relates to knowledge about the 1161 network and the Operating System, not the application (see also 1162 the discussion in Appendix A.3.4). 1164 o Configure receive buffer (and rwnd) size 1165 Protocols: SCTP 1166 Automatable because this decision relates to knowledge about the 1167 network and the Operating System, not the application. 1169 o Configure message fragmentation 1170 Protocols: SCTP 1171 Automatable because fragmentation relates to knowledge about the 1172 network and the Operating System, not the application.
1173 Implementation: by always enabling it with 1174 CONFIG_FRAGMENTATION.SCTP and auto-setting the fragmentation size 1175 based on network or Operating System conditions. 1177 o Configure PMTUD 1178 Protocols: SCTP 1179 Automatable because Path MTU Discovery relates to knowledge about 1180 the network, not the application. 1182 o Configure delayed SACK timer 1183 Protocols: SCTP 1184 Automatable because the receiver-side decision to delay sending 1185 SACKs relates to knowledge about the network, not the application 1186 (it can be relevant for a sending application to request not to 1187 delay the SACK of a message, but this is a different transport 1188 feature). 1190 o Set Cookie life value 1191 Protocols: SCTP 1192 Functional because it relates to security (possibly weakened by 1193 keeping a cookie very long) versus the time between connection 1194 establishment attempts. Knowledge about both issues can be 1195 application-specific. 1197 Implementation over TCP: the closest specified TCP functionality 1198 is the cookie in TCP Fast Open; for this, [RFC7413] states that 1199 the server "can expire the cookie at any time to enhance security" 1200 and section 4.1.2 describes an example implementation where 1201 updating the key on the server side causes the cookie to expire. 1202 Alternatively, for implementations that do not support TCP Fast 1203 Open, this transport feature could also affect the validity of SYN 1204 cookies (see Section 3.6 of [RFC4987]). 1205 Implementation over UDP: do nothing. 1207 o Set maximum burst 1208 Protocols: SCTP 1209 Automatable because it relates to knowledge about the network, not 1210 the application. 1212 o Configure size where messages are broken up for partial delivery 1213 Protocols: SCTP 1214 Functional because this is closely tied to properties of the data 1215 that an application sends or expects to receive. 1216 Implementation over TCP: not possible. 1217 Implementation over UDP: not possible. 
1219 o Disable checksum when sending 1220 Protocols: UDP 1221 Functional because application-specific knowledge is necessary to 1222 decide whether it can be acceptable to lose data integrity. 1223 Implementation: via SET_CHECKSUM_ENABLED.UDP. 1224 Implementation over TCP: do nothing. 1226 o Disable checksum requirement when receiving 1227 Protocols: UDP 1228 Functional because application-specific knowledge is necessary to 1229 decide whether it can be acceptable to lose data integrity. 1230 Implementation: via SET_CHECKSUM_REQUIRED.UDP. 1231 Implementation over TCP: do nothing. 1233 o Specify checksum coverage used by the sender 1234 Protocols: UDP-Lite 1235 Functional because application-specific knowledge is necessary to 1236 decide for which parts of the data it can be acceptable to lose 1237 data integrity. 1238 Implementation: via SET_CHECKSUM_COVERAGE.UDP-Lite. 1239 Implementation over TCP: do nothing. 1241 o Specify minimum checksum coverage required by receiver 1242 Protocols: UDP-Lite 1243 Functional because application-specific knowledge is necessary to 1244 decide for which parts of the data it can be acceptable to lose 1245 data integrity. 1246 Implementation: via SET_MIN_CHECKSUM_COVERAGE.UDP-Lite. 1247 Implementation over TCP: do nothing. 1249 o Specify DF field 1250 Protocols: UDP(-Lite) 1251 Optimizing because the DF field can be used to carry out Path MTU 1252 Discovery, which can lead an application to choose message sizes 1253 that can be transmitted more efficiently. 1254 Implementation: via MAINTENANCE.SET_DF.UDP(-Lite) and 1255 SEND_FAILURE.UDP(-Lite). 1256 Implementation over TCP: do nothing. With TCP the sender is not 1257 in control of transport message sizes, making this functionality 1258 irrelevant. 1260 o Get max. 
transport-message size that may be sent using a non- 1261 fragmented IP packet from the configured interface 1262 Protocols: UDP(-Lite) 1263 Optimizing because this can lead an application to choose message 1264 sizes that can be transmitted more efficiently. 1265 Implementation over TCP: do nothing: this information is not 1266 available with TCP. 1268 o Get max. transport-message size that may be received from the 1269 configured interface 1270 Protocols: UDP(-Lite) 1271 Optimizing because this can, for example, influence an 1272 application's memory management. 1273 Implementation over TCP: do nothing: this information is not 1274 available with TCP. 1276 o Specify TTL/Hop count field 1277 Protocols: UDP(-Lite) 1278 Automatable because a transport system can use a large enough 1279 system default to avoid communication failures. Allowing an 1280 application to configure it differently can produce notifications 1281 of ICMP error message arrivals that yield information which only 1282 relates to knowledge about the network, not the application. 1284 o Obtain TTL/Hop count field 1285 Protocols: UDP(-Lite) 1286 Automatable because the TTL/Hop count field relates to knowledge 1287 about the network, not the application. 1289 o Specify ECN field 1290 Protocols: UDP(-Lite) 1291 Automatable because the ECN field relates to knowledge about the 1292 network, not the application. 1294 o Obtain ECN field 1295 Protocols: UDP(-Lite) 1296 Optimizing because this information can be used by an application 1297 to better carry out congestion control (this is relevant when 1298 choosing a data transmission transport service that does not 1299 already do congestion control). 1300 Implementation over TCP: do nothing: this information is not 1301 available with TCP. 1303 o Specify IP Options 1304 Protocols: UDP(-Lite) 1305 Automatable because IP Options relate to knowledge about the 1306 network, not the application. 
1308 o Obtain IP Options 1309 Protocols: UDP(-Lite) 1310 Automatable because IP Options relate to knowledge about the 1311 network, not the application. 1313 o Enable and configure a "Low Extra Delay Background Transfer" 1314 Protocols: A protocol implementing the LEDBAT congestion control 1315 mechanism 1316 Optimizing because whether this service is appropriate or not 1317 depends on application-specific knowledge. However, wrongly using 1318 this will only affect the speed of data transfers (albeit 1319 including other transfers that may compete with the transport 1320 system's transfer in the network), so it is still correct within 1321 the "best effort" service model. 1322 Implementation: via CONFIGURE.LEDBAT and/or SET_DSCP.TCP / 1323 SET_DSCP.SCTP / SET_DSCP.UDP(-Lite) [LBE-draft]. 1324 Implementation over TCP: do nothing. 1325 Implementation over UDP: do nothing. 1327 TERMINATION: 1329 o Close after reliably delivering all remaining data, causing an 1330 event informing the application on the other side 1331 Protocols: TCP, SCTP 1332 Functional because the notion of a connection is often reflected 1333 in applications as an expectation to have all outstanding data 1334 delivered and no longer be able to communicate after a "Close" 1335 succeeded, with a communication sequence relating to this 1336 transport feature that is defined by the application protocol. 1337 Implementation: via CLOSE.TCP and CLOSE.SCTP. 1338 Implementation over UDP: not possible. 1340 o Abort without delivering remaining data, causing an event 1341 informing the application on the other side 1342 Protocols: TCP, SCTP 1343 Functional because the notion of a connection is often reflected 1344 in applications as an expectation to potentially not have all 1345 outstanding data delivered and no longer be able to communicate 1346 after an "Abort" succeeded. 
On both sides of a connection, an 1347 application protocol may define a communication sequence relating 1348 to this transport feature. 1349 Implementation: via ABORT.TCP and ABORT.SCTP. 1350 Implementation over UDP: not possible. 1352 o Abort without delivering remaining data, not causing an event 1353 informing the application on the other side 1354 Protocols: UDP(-Lite) 1355 Functional because the notion of a connection is often reflected 1356 in applications as an expectation to potentially not have all 1357 outstanding data delivered and no longer be able to communicate 1358 after an "Abort" succeeded. On both sides of a connection, an 1359 application protocol may define a communication sequence relating 1360 to this transport feature. 1361 Implementation: via ABORT.UDP(-Lite). 1362 Implementation over TCP: stop using the connection, wait for a 1363 timeout. 1365 o Timeout event when data could not be delivered for too long 1366 Protocols: TCP, SCTP 1367 Functional because this notifies that potentially assumed reliable 1368 data delivery is no longer provided. 1369 Implementation: via TIMEOUT.TCP and TIMEOUT.SCTP. 1370 Implementation over UDP: do nothing: this event will not occur 1371 with UDP. 1373 A.1.2. DATA Transfer Related Transport Features 1375 A.1.2.1. Sending Data 1377 o Reliably transfer data, with congestion control 1378 Protocols: TCP, SCTP 1379 Functional because this is closely tied to properties of the data 1380 that an application sends or expects to receive. 1381 Implementation: via SEND.TCP and SEND.SCTP. 1382 Implementation over UDP: not possible. 1384 o Reliably transfer a message, with congestion control 1385 Protocols: SCTP 1386 Functional because this is closely tied to properties of the data 1387 that an application sends or expects to receive. 1388 Implementation: via SEND.SCTP. 1389 Implementation over TCP: via SEND.TCP. With SEND.TCP, messages 1390 will not be identifiable by the receiver. 
1391 Implementation over UDP: not possible. 1393 o Unreliably transfer a message 1394 Protocols: SCTP, UDP(-Lite) 1395 Optimizing because only applications know about the time 1396 criticality of their communication, and reliably transferring a 1397 message is never incorrect for the receiver of a potentially 1398 unreliable data transfer; it is just slower. 1399 ADDED. This differs from the 2 automatable transport features 1400 below in that it leaves the choice of congestion control open. 1401 Implementation: via SEND.SCTP or SEND.UDP(-Lite). 1402 Implementation over TCP: use SEND.TCP. With SEND.TCP, messages 1403 will be sent reliably, and they will not be identifiable by the 1404 receiver. 1406 o Unreliably transfer a message, with congestion control 1407 Protocols: SCTP 1408 Automatable because congestion control relates to knowledge about 1409 the network, not the application. 1411 o Unreliably transfer a message, without congestion control 1412 Protocols: UDP(-Lite) 1413 Automatable because congestion control relates to knowledge about 1414 the network, not the application. 1416 o Configurable Message Reliability 1417 Protocols: SCTP 1418 Optimizing because only applications know about the time 1419 criticality of their communication, and reliably transferring a 1420 message is never incorrect for the receiver of a potentially 1421 unreliable data transfer; it is just slower. 1422 Implementation: via SEND.SCTP. 1423 Implementation over TCP: By using SEND.TCP and ignoring this 1424 configuration: based on the assumption of the best-effort service 1425 model, unnecessarily delivering data does not violate application 1426 expectations. Moreover, it is not possible to associate the 1427 requested reliability with a "message" in TCP anyway. 1428 Implementation over UDP: not possible.
1430 o Choice of stream 1431 Protocols: SCTP 1432 Automatable because it requires using multiple streams, but 1433 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1434 category is automatable. Implementation: see Appendix A.3.2. 1436 o Choice of path (destination address) 1437 Protocols: SCTP 1438 Automatable because it requires using multiple sockets, but 1439 obtaining multiple sockets in the CONNECTION.ESTABLISHMENT 1440 category is automatable. 1442 o Ordered message delivery (potentially slower than unordered) 1443 Protocols: SCTP 1444 Functional because this is closely tied to properties of the data 1445 that an application sends or expects to receive. 1446 Implementation: via SEND.SCTP. 1447 Implementation over TCP: By using SEND.TCP. With SEND.TCP, 1448 messages will not be identifiable by the receiver. 1449 Implementation over UDP: not possible. 1451 o Unordered message delivery (potentially faster than ordered) 1452 Protocols: SCTP, UDP(-Lite) 1453 Functional because this is closely tied to properties of the data 1454 that an application sends or expects to receive. 1455 Implementation: via SEND.SCTP. 1456 Implementation over TCP: By using SEND.TCP and always sending data 1457 ordered: based on the assumption of the best-effort service model, 1458 ordered delivery may just be slower and does not violate 1459 application expectations. Moreover, it is not possible to 1460 associate the requested delivery order to a "message" in TCP 1461 anyway. 1463 o Request not to bundle messages 1464 Protocols: SCTP 1465 Optimizing because this decision depends on knowledge about the 1466 size of future data blocks and the delay between them. 1467 Implementation: via SEND.SCTP. 1468 Implementation over TCP: By using SEND.TCP and DISABLE_NAGLE.TCP 1469 to disable the Nagle algorithm when the request is made and enable 1470 it again when the request is no longer made. 
Note that this is 1471 not fully equivalent because it relates to the time of issuing the 1472 request rather than a specific message. 1474 Implementation over UDP: do nothing (UDP never bundles messages). 1476 o Specifying a "payload protocol-id" (handed over as such by the 1477 receiver) 1478 Protocols: SCTP 1479 Functional because it allows to send extra application data with 1480 every message, for the sake of identification of data, which by 1481 itself is application-specific. 1482 Implementation: SEND.SCTP. 1483 Implementation over TCP: not possible. 1484 Implementation over UDP: not possible. 1486 o Specifying a key id to be used to authenticate a message 1487 Protocols: SCTP 1488 Functional because this has a direct influence on security. 1489 Implementation: via a parameter in SEND.SCTP. 1490 Implementation over TCP: This could be emulated by using 1491 SET_AUTH.TCP before and after the message is sent. Note that this 1492 is not fully equivalent because it relates to the time of issuing 1493 the request rather than a specific message. 1494 Implementation over UDP: not possible. 1496 o Request not to delay the acknowledgement (SACK) of a message 1497 Protocols: SCTP 1498 Optimizing because only an application knows for which message it 1499 wants to quickly be informed about success / failure of its 1500 delivery. 1501 Implementation over TCP: do nothing. 1502 Implementation over UDP: do nothing. 1504 A.1.2.2. Receiving Data 1506 o Receive data (with no message delimiting) 1507 Protocols: TCP 1508 Functional because a transport system must be able to send and 1509 receive data. 1510 Implementation: via RECEIVE.TCP. 1511 Implementation over UDP: do nothing (hand over a message, let the 1512 application ignore message boundaries). 1514 o Receive a message 1515 Protocols: SCTP, UDP(-Lite) 1516 Functional because this is closely tied to properties of the data 1517 that an application sends or expects to receive. 
1518 Implementation: via RECEIVE.SCTP and RECEIVE.UDP(-Lite). 1519 Implementation over TCP: not possible. 1521 o Choice of stream to receive from 1522 Protocols: SCTP 1523 Automatable because it requires using multiple streams, but 1524 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1525 category is automatable. 1526 Implementation: see Appendix A.3.2. 1528 o Information about partial message arrival 1529 Protocols: SCTP 1530 Functional because this is closely tied to properties of the data 1531 that an application sends or expects to receive. 1532 Implementation: via RECEIVE.SCTP. 1533 Implementation over TCP: do nothing: this information is not 1534 available with TCP. 1535 Implementation over UDP: do nothing: this information is not 1536 available with UDP. 1538 A.1.2.3. Errors 1540 This section describes sending failures that are associated with a 1541 specific call in the "Sending Data" category (Appendix A.1.2.1). 1543 o Notification of send failures 1544 Protocols: SCTP, UDP(-Lite) 1545 Functional because this notifies that potentially assumed reliable 1546 data delivery is no longer provided. 1547 ADDED. This differs from the 2 automatable transport features 1548 below in that it does not distinguish between unsent and 1549 unacknowledged messages. 1550 Implementation: via SENDFAILURE-EVENT.SCTP and 1551 SEND_FAILURE.UDP(-Lite). 1552 Implementation over TCP: do nothing: this notification is not 1553 available and will therefore not occur with TCP. 1555 o Notification of an unsent (part of a) message 1556 Protocols: SCTP, UDP(-Lite) 1557 Automatable because the distinction between unsent and 1558 unacknowledged is network-specific. 1560 o Notification of an unacknowledged (part of a) message 1561 Protocols: SCTP 1562 Automatable because the distinction between unsent and 1563 unacknowledged is network-specific.
1565 o Notification that the stack has no more user data to send 1566 Protocols: SCTP 1567 Optimizing because reacting to this notification requires the 1568 application to be involved, and ensuring that the stack does not 1569 run dry of data (for too long) can improve performance. 1570 Implementation over TCP: do nothing. See also the discussion in 1571 Appendix A.3.4. 1572 Implementation over UDP: do nothing. This notification is not 1573 available and will therefore not occur with UDP. 1575 o Notification to a receiver that a partial message delivery has 1576 been aborted 1577 Protocols: SCTP 1578 Functional because this is closely tied to properties of the data 1579 that an application sends or expects to receive. 1580 Implementation over TCP: do nothing. This notification is not 1581 available and will therefore not occur with TCP. 1582 Implementation over UDP: do nothing. This notification is not 1583 available and will therefore not occur with UDP. 1585 A.2. Step 2: Reduction -- The Reduced Set of Transport Features 1587 By hiding automatable transport features from the application, a 1588 transport system can gain opportunities to automate the usage of 1589 network-related functionality. This can facilitate using the 1590 transport system for the application programmer and it allows for 1591 optimizations that may not be possible for an application. For 1592 instance, system-wide configurations regarding the usage of multiple 1593 interfaces can better be exploited if the choice of the interface is 1594 not entirely up to the application. Therefore, since they are not 1595 strictly necessary to expose in a transport system, we do not include 1596 automatable transport features in the reduced set of transport 1597 features. This leaves us with only the transport features that are 1598 either optimizing or functional. 1600 A transport system should be able to communicate via TCP or UDP if 1601 alternative transport protocols are found not to work. 
For many 1602 transport features, this is possible -- often by simply not doing 1603 anything when a specific request is made. For some transport 1604 features, however, it was identified that direct usage of neither TCP 1605 nor UDP is possible: in these cases, even not doing anything would 1606 incur semantically incorrect behavior. Whenever an application would 1607 make use of one of these transport features, this would eliminate the 1608 possibility to use TCP or UDP. Thus, we only keep the functional and 1609 optimizing transport features for which an implementation over either 1610 TCP or UDP is possible in our reduced set. 1612 In the following list, we precede a transport feature with "T:" if an 1613 implementation over TCP is possible, "U:" if an implementation over 1614 UDP is possible, and "T,U:" if an implementation over both TCP and 1615 UDP is possible. 1617 A.2.1. CONNECTION Related Transport Features 1619 ESTABLISHMENT: 1621 o T,U: Connect 1622 o T,U: Specify number of attempts and/or timeout for the first 1623 establishment message 1624 o T: Configure authentication 1625 o T: Hand over a message to reliably transfer (possibly multiple 1626 times) before connection establishment 1627 o T: Hand over a message to reliably transfer during connection 1628 establishment 1630 AVAILABILITY: 1632 o T,U: Listen 1633 o T: Configure authentication 1635 MAINTENANCE: 1637 o T: Change timeout for aborting connection (using retransmit limit 1638 or time value) 1639 o T: Suggest timeout to the peer 1640 o T,U: Disable Nagle algorithm 1641 o T,U: Notification of Excessive Retransmissions (early warning 1642 below abortion threshold) 1643 o T,U: Specify DSCP field 1644 o T,U: Notification of ICMP error message arrival 1645 o T: Change authentication parameters 1646 o T: Obtain authentication information 1647 o T,U: Set Cookie life value 1648 o T,U: Choose a scheduler to operate between streams of an 1649 association 1650 o T,U: Configure priority or weight for a
scheduler 1651 o T,U: Disable checksum when sending 1652 o T,U: Disable checksum requirement when receiving 1653 o T,U: Specify checksum coverage used by the sender 1654 o T,U: Specify minimum checksum coverage required by receiver 1655 o T,U: Specify DF field 1656 o T,U: Get max. transport-message size that may be sent using a 1657 non-fragmented IP packet from the configured interface 1658 o T,U: Get max. transport-message size that may be received from the 1659 configured interface 1660 o T,U: Obtain ECN field 1661 o T,U: Enable and configure a "Low Extra Delay Background Transfer" 1663 TERMINATION: 1665 o T: Close after reliably delivering all remaining data, causing an 1666 event informing the application on the other side 1667 o T: Abort without delivering remaining data, causing an event 1668 informing the application on the other side 1669 o T,U: Abort without delivering remaining data, not causing an event 1670 informing the application on the other side 1671 o T,U: Timeout event when data could not be delivered for too long 1673 A.2.2. DATA Transfer Related Transport Features 1675 A.2.2.1. Sending Data 1677 o T: Reliably transfer data, with congestion control 1678 o T: Reliably transfer a message, with congestion control 1679 o T,U: Unreliably transfer a message 1680 o T: Configurable Message Reliability 1681 o T: Ordered message delivery (potentially slower than unordered) 1682 o T,U: Unordered message delivery (potentially faster than ordered) 1683 o T,U: Request not to bundle messages 1684 o T: Specifying a key id to be used to authenticate a message 1685 o T,U: Request not to delay the acknowledgement (SACK) of a message 1687 A.2.2.2. Receiving Data 1689 o T,U: Receive data (with no message delimiting) 1690 o U: Receive a message 1691 o T,U: Information about partial message arrival 1693 A.2.2.3. Errors 1695 This section describes sending failures that are associated with a 1696 specific call in the "Sending Data" category (Appendix A.1.2.1).
1698 o T,U: Notification of send failures 1699 o T,U: Notification that the stack has no more user data to send 1700 o T,U: Notification to a receiver that a partial message delivery 1701 has been aborted 1703 A.3. Step 3: Discussion 1705 The reduced set in the previous section exhibits a number of 1706 peculiarities, which we will discuss in the following. This section 1707 focuses on TCP because, with the exception of one particular 1708 transport feature ("Receive a message" -- we will discuss this in 1709 Appendix A.3.1), the list shows that UDP is strictly a subset of TCP. 1710 We can first try to understand how to build a transport system that 1711 can run over TCP, and then narrow down the result further to allow 1712 that the system can always run over either TCP or UDP (which 1713 effectively means removing everything related to reliability, 1714 ordering, authentication and closing/aborting with a notification to 1715 the peer). 1717 Note that, because the functional transport features of UDP are -- 1718 with the exception of "Receive a message" -- a subset of TCP, TCP can 1719 be used as a replacement for UDP whenever an application does not 1720 need message delimiting (e.g., because the application-layer protocol 1721 already does it). This has been recognized by many applications that 1722 already do this in practice, by trying to communicate with UDP at 1723 first, and falling back to TCP in case of a connection failure. 1725 A.3.1. Sending Messages, Receiving Bytes 1727 For implementing a transport system over TCP, there are several 1728 transport features related to sending, but only a single transport 1729 feature related to receiving: "Receive data (with no message 1730 delimiting)" (and, strangely, "information about partial message 1731 arrival"). Notably, the transport feature "Receive a message" is 1732 also the only non-automatable transport feature of UDP(-Lite) for 1733 which no implementation over TCP is possible. 
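Since a TCP receiver only ever sees a byte stream, an application that wants messages has to recover their boundaries itself. The following is a minimal sketch of one way to do this, assuming a hypothetical 4-byte big-endian length prefix in front of every message. The wire format and the names frame and Deframer are illustrative assumptions; they are not defined by this document or by TCP.

```python
import struct
from typing import List

# Hypothetical wire format: 4-byte big-endian length, then the payload.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Prefix one message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Rebuilds messages from arbitrarily sized chunks of a byte stream."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes; return every message that is now complete."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # payload not fully received yet
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return messages
```

The receiver can call feed() with whatever a normal TCP receive call returns; chunk boundaries do not need to line up with message boundaries.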
1735 To support these TCP receiver semantics, we define an "Application- 1736 Framed Bytestream" (AFra-Bytestream). AFra-Bytestreams allow senders 1737 to operate on messages while minimizing changes to the TCP socket 1738 API. In particular, nothing changes on the receiver side - data can 1739 be accepted via a normal TCP socket. 1741 In an AFra-Bytestream, the sending application can optionally inform 1742 the transport about message boundaries and required properties per 1743 message (configurable order and reliability, or embedding a request 1744 not to delay the acknowledgement of a message). Whenever the sending 1745 application specifies per-message properties that relax the notion of 1746 reliable in-order delivery of bytes, it must assume that the 1747 receiving application is 1) able to determine message boundaries, 1748 provided that messages are always kept intact, and 2) able to accept 1749 these relaxed per-message properties. Any signaling of such 1750 information to the peer is up to an application-layer protocol and 1751 considered out of scope of this document. 1753 For example, if an application requests to transfer fixed-size 1754 messages of 100 bytes with partial reliability, this needs the 1755 receiving application to be prepared to accept data in chunks of 100 1756 bytes. If, then, some of these 100-byte messages are missing (e.g., 1757 if SCTP with Configurable Reliability is used), this is the expected 1758 application behavior. With TCP, no messages would be missing, but 1759 this is also correct for the application, and the possible 1760 retransmission delay is acceptable within the best effort service 1761 model [RFC7305]. Still, the receiving application would separate the 1762 byte stream into 100-byte chunks. 1764 Note that this usage of messages does not require all messages to be 1765 equal in size. Many application protocols use some form of Type- 1766 Length-Value (TLV) encoding, e.g. 
by defining a header including 1767 length fields; another alternative is the use of byte stuffing 1768 methods such as COBS [COBS]. If an application needs message 1769 numbers, e.g. to restore the correct sequence of messages, these must 1770 also be encoded by the application itself, as the sequence number 1771 related transport features of SCTP are not provided by the "minimum 1772 set" (in the interest of enabling usage of TCP). 1774 A.3.2. Stream Schedulers Without Streams 1776 We have already stated that multi-streaming does not require 1777 application-specific knowledge. Potential benefits or disadvantages 1778 of, e.g., using two streams of an SCTP association versus using two 1779 separate SCTP associations or TCP connections are related to 1780 knowledge about the network and the particular transport protocol in 1781 use, not the application. However, the transport features "Choose a 1782 scheduler to operate between streams of an association" and 1783 "Configure priority or weight for a scheduler" operate on streams. 1784 Here, streams identify communication channels between which a 1785 scheduler operates, and they can be assigned a priority. Moreover, 1786 the transport features in the MAINTENANCE category all operate on 1787 associations in case of SCTP, i.e. they apply to all streams in that 1788 association. 1790 Since only these semantics need to be represented, the interface to a 1791 transport system becomes simpler if we assume that connections may be 1792 a transport protocol's connection or association, but could also be a 1793 stream of an existing SCTP association, for example. We only need to 1794 allow for a way to define a possible grouping of connections. Then, 1795 all MAINTENANCE transport features can be said to operate on 1796 connection groups, not connections, and a scheduler operates on the 1797 connections within a group.
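The grouping idea can be illustrated with a small data-structure sketch. This is only one possible shape, under stated assumptions: the class names, the "dscp" setting key, and the rule that a lower priority number is served first are all hypothetical and are not prescribed by this document.

```python
class ConnectionGroup:
    """Hypothetical group: MAINTENANCE settings are stored once per group."""

    def __init__(self):
        self.settings = {}      # shared by every connection in the group
        self.connections = []   # what the scheduler operates on

    def open_connection(self, name, priority=0):
        """Create a connection (e.g. a stream or a TCP connection) in this group."""
        conn = Connection(name, priority, self)
        self.connections.append(conn)
        return conn

    def configure(self, key, value):
        """Apply a MAINTENANCE setting to the whole group."""
        self.settings[key] = value

    def next_to_send(self):
        """Strict-priority scheduler (assumption: lower number is served first)."""
        ready = [c for c in self.connections if c.queue]
        if not ready:
            return None
        return min(ready, key=lambda c: c.priority)


class Connection:
    """One member of a group; it has no MAINTENANCE state of its own."""

    def __init__(self, name, priority, group):
        self.name = name
        self.priority = priority
        self.group = group
        self.queue = []  # messages waiting to be sent

    def setting(self, key):
        """Look up a setting on the group, since settings are group-wide."""
        return self.group.settings.get(key)
```

A transport system built this way could map a group onto one SCTP association with several streams, or onto several TCP connections, without the application seeing the difference.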
1799 To be compatible with multiple transport protocols and uniformly 1800 allow access to both transport connections and streams of a 1801 multi-streaming protocol, the semantics of opening and closing need to be 1802 the most restrictive subset of all of the underlying options. For 1803 example, TCP's support of half-closed connections can be seen as a 1804 feature on top of the more restrictive "ABORT"; this feature cannot 1805 be supported because not all protocols used by a transport system 1806 (including streams of an association) support half-closed 1807 connections. 1809 A.3.3. Early Data Transmission 1811 There are two transport features related to transferring a message 1812 early: "Hand over a message to reliably transfer (possibly multiple 1813 times) before connection establishment", which relates to TCP Fast 1814 Open [RFC7413], and "Hand over a message to reliably transfer during 1815 connection establishment", which relates to SCTP's ability to 1816 transfer data together with the COOKIE-Echo chunk. Even without TCP 1817 Fast Open, TCP can transfer data during the handshake, together with 1818 the SYN packet -- however, the receiver of this data may not hand it 1819 over to the application until the handshake has completed. Also, 1820 unlike with TCP Fast Open, this data is not delimited as a message 1821 by TCP (thus, not visible as a "message"). This functionality is 1822 commonly available in TCP and supported in several implementations, 1823 even though the TCP specification does not explain how to provide it 1824 to applications. 1826 A transport system could differentiate between the cases of 1827 transmitting data "before" (possibly multiple times) or "during" the 1828 handshake.
Alternatively, it could also assume that data that are 1829 handed over early will be transmitted as early as possible, and 1830 "before" the handshake would only be used for messages that are 1831 explicitly marked as "idempotent" (i.e., it would be acceptable to 1832 transfer them multiple times). 1834 The amount of data that can successfully be transmitted before or 1835 during the handshake depends on various factors: the transport 1836 protocol, the use of header options, the choice of IPv4 and IPv6 and 1837 the Path MTU. A transport system should therefore allow a sending 1838 application to query the maximum amount of data it can possibly 1839 transmit before (or, if exposed, during) connection establishment. 1841 A.3.4. Sender Running Dry 1843 The transport feature "Notification that the stack has no more user 1844 data to send" relates to SCTP's "SENDER DRY" notification. Such 1845 notifications can, in principle, be used to avoid having an 1846 unnecessarily large send buffer, yet ensure that the transport sender 1847 always has data available when it has an opportunity to transmit it. 1848 This has been found to be very beneficial for some applications 1849 [WWDC2015]. However, "SENDER DRY" truly means that the entire send 1850 buffer (including both unsent and unacknowledged data) has emptied -- 1851 i.e., when it notifies the sender, it is already too late, the 1852 transport protocol already missed an opportunity to send data. Some 1853 modern TCP implementations now include the unspecified 1854 "TCP_NOTSENT_LOWAT" socket option that was proposed in [WWDC2015], 1855 which limits the amount of unsent data that TCP can keep in the 1856 socket buffer; this allows to specify at which buffer filling level 1857 the socket becomes writable, rather than waiting for the buffer to 1858 run empty. 
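As an illustration of the socket option mentioned above, the following minimal sketch requests a not-sent low-water mark on a TCP socket. It assumes that the platform and the Python build expose the option as socket.TCP_NOTSENT_LOWAT, which is not guaranteed; the helper name set_notsent_lowat and the 16 KiB value in the usage note are illustrative assumptions only.

```python
import socket


def set_notsent_lowat(sock: socket.socket, threshold_bytes: int) -> bool:
    """Try to cap the unsent data kept in the socket buffer.

    Returns True if the option was applied, False if this platform or
    Python build does not offer it (the option is not standardized).
    """
    option = getattr(socket, "TCP_NOTSENT_LOWAT", None)
    if option is None:
        return False  # constant not exposed here
    try:
        sock.setsockopt(socket.IPPROTO_TCP, option, threshold_bytes)
    except OSError:
        return False  # kernel rejected or does not support the option
    return True
```

An application might call set_notsent_lowat(sock, 16 * 1024) right after creating the socket and then rely on its usual writability notification; where the helper returns False, it would have to fall back to ordinary send-buffer behavior.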
1860 SCTP allows to configure the sender-side buffer too: the automatable 1861 Transport Feature "Configure send buffer size" provides this 1862 functionality, but only for the complete buffer, which includes both 1863 unsent and unacknowledged data. SCTP does not allow to control these 1864 two sizes separately. It therefore makes sense for a transport 1865 system to allow for uniform access to "TCP_NOTSENT_LOWAT" as well as 1866 the "SENDER DRY" notification. 1868 A.3.5. Capacity Profile 1870 The transport features: 1872 o Disable Nagle algorithm 1873 o Enable and configure a "Low Extra Delay Background Transfer" 1874 o Specify DSCP field 1876 all relate to a QoS-like application need such as "low latency" or 1877 "scavenger". In the interest of flexibility of a transport system, 1878 they could therefore be offered in a uniform, more abstract way, 1879 where a transport system could e.g. decide by itself how to use 1880 combinations of LEDBAT-like congestion control and certain DSCP 1881 values, and an application would only specify a general "capacity 1882 profile" (a description of how it wants to use the available 1883 capacity). A need for "lowest possible latency at the expense of 1884 overhead" could then translate into automatically disabling the Nagle 1885 algorithm. 1887 In some cases, the Nagle algorithm is best controlled directly by the 1888 application because it is not only related to a general profile but 1889 also to knowledge about the size of future messages. For fine-grain 1890 control over Nagle-like functionality, the "Request not to bundle 1891 messages" is available. 1893 A.3.6. Security 1895 Both TCP and SCTP offer authentication. TCP authenticates complete 1896 segments. SCTP allows to configure which of SCTP's chunk types must 1897 always be authenticated -- if this is exposed as such, it creates an 1898 undesirable dependency on the transport protocol. 
For compatibility 1899 with TCP, a transport system should only allow authentication to be configured for complete 1900 transport layer packets, including headers, IP pseudo-header (if any) 1901 and payload. 1903 Security is discussed in a separate document 1904 [I-D.ietf-taps-transport-security]. The minimal set presented in the 1905 present document therefore excludes all security related transport 1906 features: "Configure authentication", "Change authentication 1907 parameters", "Obtain authentication information" and "Set Cookie 1908 life value" as well as "Specifying a key id to be used to 1909 authenticate a message". 1911 A.3.7. Packet Size 1913 UDP(-Lite) has a transport feature called "Specify DF field". This 1914 yields an error message in case of sending a message that exceeds the 1915 Path MTU, which is necessary for a UDP-based application to be able 1916 to implement Path MTU Discovery (a function that UDP-based 1917 applications must do by themselves). The "Get max. transport-message 1918 size that may be sent using a non-fragmented IP packet from the 1919 configured interface" transport feature yields an upper limit for the 1920 Path MTU (minus headers) and can therefore help to implement Path MTU 1921 Discovery more efficiently. 1923 Appendix B. Revision information 1925 XXX RFC-Ed please remove this section prior to publication. 1927 -02: implementation suggestions added, discussion section added, 1928 terminology extended, DELETED category removed, various other fixes; 1929 list of Transport Features adjusted to -01 version of [RFC8303] 1930 except that MPTCP is not included. 1932 -03: updated to be consistent with -02 version of [RFC8303]. 1934 -04: updated to be consistent with -03 version of [RFC8303]. 1935 Reorganized document, rewrote intro and conclusion, and made a first 1936 stab at creating a real "minimal set". 1938 -05: updated to be consistent with -05 version of [RFC8303] (minor 1939 changes). Fixed a mistake regarding Cookie Life value.
Exclusion of 1940 security related transport features (to be covered in a separate 1941 document). Reorganized the document (now begins with the minset, 1942 derivation is in the appendix). First stab at an abstract API for 1943 the minset. 1945 draft-ietf-taps-minset-00: updated to be consistent with -08 version 1946 of [RFC8303] ("obtain message delivery number" was removed, as this 1947 has also been removed in [RFC8303] because it was a mistake in 1948 RFC4960. This led to the removal of two more transport features that 1949 were only designated as functional because they affected "obtain 1950 message delivery number"). Fall-back to UDP incorporated (this was 1951 requested at IETF-99); this also affected the transport feature 1952 "Choice between unordered (potentially faster) or ordered delivery of 1953 messages" because this is a boolean which is always true for one 1954 fall-back protocol, and always false for the other one. This was 1955 therefore divided into two features, one for ordered, one for 1956 unordered delivery. The word "reliably" was added to the transport 1957 features "Hand over a message to reliably transfer (possibly multiple 1958 times) before connection establishment" and "Hand over a message to 1959 reliably transfer during connection establishment" to make it clearer 1960 why this is not supported by UDP. Clarified that the "minset 1961 abstract interface" is not proposing a specific API for all TAPS 1962 systems to implement, but it is just a way to describe the minimum 1963 set. Author order changed. 1965 WG -01: "fall-back to" (TCP or UDP) replaced (mostly with 1966 "implementation over"). References to post-sockets removed (these 1967 were statements that assumed that post-sockets requires two-sided 1968 implementation). Replaced "flow" with "TAPS Connection" and "frame" 1969 with "message" to avoid introducing new terminology.
Made sections 3 1970 and 4 in line with the categorization that is already used in the 1971 appendix and [RFC8303], and changed style of section 4 to be even 1972 shorter and less interface-like. Updated reference 1973 draft-ietf-tsvwg-sctp-ndata to RFC8260. 1975 WG -02: rephrased "the TAPS system" and "TAPS connection" etc. to 1976 more generally talk about transport after the intro (mostly replacing 1977 "TAPS system" with "transport system" and "TAPS connection" with 1978 "connection"). Merged sections 3 and 4 to form a new section 3. 1980 WG -03: updated sentence referencing 1981 [I-D.ietf-taps-transport-security] to say that "the minimum security 1982 requirements for a taps system are discussed in a separate security 1983 document", wrote "example" in the paragraph introducing the decision 1984 tree. Removed reference draft-grinnemo-taps-he-03 and the sentence 1985 that referred to it. 1987 WG -04: addressed comments from Theresa Enghardt and Tommy Pauly. As 1988 part of that, removed "TAPS" as a term everywhere (abstract, intro, 1989 ...). 1991 Authors' Addresses 1993 Michael Welzl 1994 University of Oslo 1995 PO Box 1080 Blindern 1996 Oslo N-0316 1997 Norway 1999 Phone: +47 22 85 24 20 2000 Email: michawe@ifi.uio.no 2002 Stein Gjessing 2003 University of Oslo 2004 PO Box 1080 Blindern 2005 Oslo N-0316 2006 Norway 2008 Phone: +47 22 85 24 44 2009 Email: steing@ifi.uio.no