TAPS                                                            M. Welzl
Internet-Draft                                               S. Gjessing
Intended status: Informational                        University of Oslo
Expires: September 22, 2018                               March 21, 2018

          A Minimal Set of Transport Services for TAPS Systems
                       draft-ietf-taps-minset-03

Abstract

   This draft recommends a minimal set of IETF Transport Services
   offered by end systems supporting TAPS, and gives guidance on
   choosing among the available mechanisms and protocols. It is based
   on the set of transport features in RFC 8303.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 22, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Minimal Set of Transport Features
     3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION
     3.2.  MAINTENANCE
       3.2.1.  Connection groups
       3.2.2.  Individual connections
     3.3.  DATA Transfer
       3.3.1.  Sending Data
       3.3.2.  Receiving Data
   4.  Conclusion
   5.  Acknowledgements
   6.  IANA Considerations
   7.  Security Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Deriving the minimal set
     A.1.  Step 1: Categorization -- The Superset of Transport
           Features
       A.1.1.  CONNECTION Related Transport Features
       A.1.2.  DATA Transfer Related Transport Features
     A.2.  Step 2: Reduction -- The Reduced Set of Transport
           Features
       A.2.1.  CONNECTION Related Transport Features
       A.2.2.  DATA Transfer Related Transport Features
     A.3.  Step 3: Discussion
       A.3.1.  Sending Messages, Receiving Bytes
       A.3.2.  Stream Schedulers Without Streams
       A.3.3.  Early Data Transmission
       A.3.4.  Sender Running Dry
       A.3.5.  Capacity Profile
       A.3.6.  Security
       A.3.7.  Packet Size
   Appendix B.  Revision information
   Authors' Addresses

1.  Introduction

   The task of any system that implements TAPS is to offer transport
   services to its applications, i.e.
   the applications running on top of
   the transport system, without binding them to a particular transport
   protocol. Currently, the set of transport services that most
   applications use is based on TCP and UDP (and protocols that are
   layered on top of them); this limits the ability for the network
   stack to make use of features of other transport protocols. For
   example, if a protocol supports out-of-order message delivery but
   applications always assume that the network provides an ordered
   bytestream, then the network stack cannot immediately deliver a
   message that arrives out-of-order: doing so would break a fundamental
   assumption of the application. The net result is unnecessary
   head-of-line blocking delay.

   By exposing the transport services of multiple transport protocols, a
   TAPS transport system can make it possible to use these services
   without having to statically bind an application to a specific
   transport protocol. The first step towards the design of such a
   system was taken by [RFC8095], which surveys a large number of
   transports, and [RFC8303] as well as [RFC8304], which identify the
   specific transport features that are exposed to applications by the
   protocols TCP, MPTCP, UDP(-Lite) and SCTP as well as the LEDBAT
   congestion control mechanism. This memo is based on these documents
   and follows the same terminology (also listed below). Because the
   considered transport protocols conjointly cover a wide range of
   transport features, there is reason to hope that the resulting set
   (and the reasoning that led to it) will also apply to many aspects of
   other transport protocols.

   The number of transport features of current IETF transports is large,
   and exposing all of them has a number of disadvantages: generally,
   the more functionality is exposed, the less freedom a transport
   system has to automate usage of the various functions of its
   available set of transport protocols. Some functions only exist in
   one particular protocol, and if an application used them, this
   would statically tie the application to this protocol, counteracting
   the purpose of TAPS. Also, if the number of exposed features is
   exceedingly large, a transport system might become very difficult to
   use for an application programmer. Taking [RFC8303] as a basis, this
   document therefore develops a minimal set of transport features,
   removing the ones that could be harmful to the purpose of TAPS but
   keeping the ones that must be retained for applications to benefit
   from useful transport functionality.

   Applications use a wide variety of APIs today. The transport
   features in the minimal set in this document must be reflected in
   *all* network APIs in order for the underlying functionality to
   become usable everywhere. For example, it does not help an
   application that talks to a middleware if only the Berkeley Sockets
   API is extended to offer "unordered message delivery", but the
   middleware only offers an ordered bytestream. Both the Berkeley
   Sockets API and the middleware would have to expose the "unordered
   message delivery" transport feature (alternatively, there may be ways
   for certain types of middleware to use this transport feature without
   exposing it, based on knowledge about the applications -- but this is
   not the general case). In most situations, in the interest of being
   as flexible and efficient as possible, the best choice will be for a
   middleware or library to expose at least all of the transport
   features that are recommended as a "minimal set" here.

   This "minimal set" can be implemented one-sided over TCP (or UDP, if
   certain limitations are put in place). This means that a sender-side
   TAPS system implementing it can talk to a non-TAPS TCP (or UDP)
   receiver, and a receiver-side TAPS system implementing it can talk to
   a non-TAPS TCP (or UDP) sender.

2.  Terminology

   The following terms are used throughout this document, and in
   subsequent documents produced by TAPS that describe the composition
   and decomposition of transport services.

   Transport Feature: a specific end-to-end feature that the transport
      layer provides to an application. Examples include
      confidentiality, reliable delivery, ordered delivery,
      message-versus-stream orientation, etc.
   Transport Service: a set of Transport Features, without an
      association to any given framing protocol, which provides a
      complete service to an application.
   Transport Protocol: an implementation that provides one or more
      different transport services using a specific framing and header
      format on the wire.
   Transport Service Instance: an arrangement of transport protocols
      with a selected set of features and configuration parameters that
      implements a single transport service, e.g., a protocol stack (RTP
      over UDP).
   Application: an entity that uses the transport layer for end-to-end
      delivery of data across the network (this may also be an upper
      layer protocol or tunnel encapsulation).
   Application-specific knowledge: knowledge that only applications
      have.
   Endpoint: an entity that communicates with one or more other
      endpoints using a transport protocol.
   Connection: shared state of two or more endpoints that persists
      across messages that are transmitted between these endpoints.
   Socket: the combination of a destination IP address and a
      destination port number.

   Moreover, throughout the document, the protocol name "UDP(-Lite)" is
   used when discussing transport features that are equivalent for UDP
   and UDP-Lite; similarly, the protocol name "TCP" refers to both TCP
   and MPTCP.

3.  The Minimal Set of Transport Features

   Based on the categorization, reduction and discussion in Appendix A,
   this section describes the minimal set of transport features that is
   offered by end systems supporting TAPS. The described transport
   system can be implemented over TCP; elements of the system that may
   prohibit implementation over UDP are marked with "!UDP". To
   implement a transport system that can also work over UDP, these
   marked transport features should be excluded.

   As in Appendix A, Appendix A.2 and [RFC8303], we categorize the
   minimal set of transport features as 1) CONNECTION related
   (ESTABLISHMENT, AVAILABILITY, MAINTENANCE, TERMINATION) and 2) DATA
   Transfer related (Sending Data, Receiving Data, Errors). Here, the
   focus is on connections that the transport system offers, as opposed
   to connections of transport protocols that the transport system uses.

3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION

   A connection must first be "created" to allow for some initial
   configuration to be carried out before the transport system can
   actively or passively establish communication with a remote endpoint.
   All configuration parameters in Section 3.2 can be used initially,
   although some of them may only take effect when a connection has been
   established with a chosen transport protocol. Configuring a
   connection early helps a transport system make the right decisions.
   For example, grouping information can influence the transport system
   to implement a connection as a stream of a multi-streaming protocol's
   existing association or not.

   For ungrouped connections, early configuration is necessary because
   it allows the transport system to know which protocols it should try
   to use. In particular, a transport system that only makes a one-time
   choice for a particular protocol must know early about strict
   requirements that must be kept, or it can end up in a deadlock
   situation (e.g., having chosen UDP and later being asked to support
   reliable transfer). As an example description of how to correctly
   handle these cases, we provide the following decision tree (this is
   derived from Appendix A.2.1 excluding authentication, as explained in
   Section 7):

   - Will it ever be necessary to offer any of the following?
     * Reliably transfer data
     * Notify the peer of closing/aborting
     * Preserve data ordering

     Yes: SCTP or TCP can be used.
       - Is any of the following useful to the application?
         * Choosing a scheduler to operate between connections
           in a group, with the possibility to configure a priority
           or weight per connection
         * Configurable message reliability
         * Unordered message delivery
         * Request not to delay the acknowledgement (SACK) of a message

         Yes: SCTP is preferred.
         No:
           - Is any of the following useful to the application?
             * Hand over a message to reliably transfer (possibly
               multiple times) before connection establishment
             * Suggest timeout to the peer
             * Notification of Excessive Retransmissions (early
               warning below abortion threshold)
             * Notification of ICMP error message arrival

             Yes: TCP is preferred.
             No: SCTP and TCP are equally preferable.

     No: all protocols can be used.
       - Is any of the following useful to the application?
         * Specify checksum coverage used by the sender
         * Specify minimum checksum coverage required by receiver

         Yes: UDP-Lite is preferred.
         No: UDP is preferred.

   Note that this decision tree is not optimal for all cases.
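
   The decision tree above can be sketched in code. The following is a
   minimal, non-normative illustration in Python. The names
   SelectionHints and choose_protocol are hypothetical; the four boolean
   fields mirror the summary parameters given in this section
   (Reliability, Checksum_coverage, Config_msg_prio and
   Earlymsg_timeout_notifications).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SelectionHints:
    """Hypothetical container for an application's early requirements."""
    reliability: bool = False        # reliable transfer, close/abort notice, ordering
    checksum_coverage: bool = False  # sender or receiver checksum coverage
    config_msg_prio: bool = False    # scheduler, message reliability, unordered, no delayed SACK
    earlymsg_timeout_notifications: bool = False  # early data, timeouts, retransmission/ICMP notices


def choose_protocol(hints: SelectionHints) -> Tuple[str, ...]:
    """Return the preferred protocol(s) according to the decision tree."""
    if hints.reliability:
        # "Yes" branch: only SCTP or TCP can be used.
        if hints.config_msg_prio:
            return ("SCTP",)
        if hints.earlymsg_timeout_notifications:
            return ("TCP",)
        return ("SCTP", "TCP")  # equally preferable
    # "No" branch: all protocols can be used; prefer the lightest one.
    if hints.checksum_coverage:
        return ("UDP-Lite",)
    return ("UDP",)
```

   In this sketch, a request that sets both checksum_coverage and
   config_msg_prio without reliability yields UDP-Lite, which is the
   kind of non-optimal outcome that the text in this section cautions
   about.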
   For
   example, if an application wants to use "Specify checksum coverage
   used by the sender", which is only offered by UDP-Lite, and
   "Configure priority or weight for a scheduler", which is only offered
   by SCTP, the above decision tree will always choose UDP-Lite, making
   it impossible to use SCTP's schedulers with priorities between
   grouped connections. The transport system must know which choice is
   more important for the application in order to make the best
   decision. We caution implementers to be aware of the full set of
   trade-offs, for which we recommend consulting the list in
   Appendix A.2.1 when deciding how to initialize a connection.

   To summarize, the following parameters serve as input for the
   transport system to help it choose and configure a suitable protocol:

   o  Reliability: a boolean that should be set to true when any of the
      following will be useful to the application: reliably transfer
      data; notify the peer of closing/aborting; preserve data ordering.
   o  Checksum_coverage: a boolean to specify whether it will be useful
      to the application to specify checksum coverage when sending or
      receiving.
   o  Config_msg_prio: a boolean that should be set to true when any of
      the following per-message configuration or prioritization
      mechanisms will be useful to the application: choosing a scheduler
      to operate between grouped connections, with the possibility to
      configure a priority or weight per connection; configurable
      message reliability; unordered message delivery; requesting not to
      delay the acknowledgement (SACK) of a message.
   o  Earlymsg_timeout_notifications: a boolean that should be set to
      true when any of the following will be useful to the application:
      hand over a message to reliably transfer (possibly multiple times)
      before connection establishment; suggest timeout to the peer;
      notification of excessive retransmissions (early warning below
      abortion threshold); notification of ICMP error message arrival.

   Once a connection is created, it can be queried for the maximum
   amount of data that an application can possibly expect to have
   reliably transmitted before or during transport connection
   establishment (with zero being a possible answer) (see
   Section 3.2.1). An application can also give the connection a
   message for reliable transmission before or during connection
   establishment (!UDP); the transport system will then try to transmit
   it as early as possible. An application can facilitate sending a
   message particularly early by marking it as "idempotent" (see
   Section 3.3.1); in this case, the receiving application must be
   prepared to potentially receive multiple copies of the message
   (because idempotent messages are reliably transferred, asking for
   idempotence is not necessary for systems that support UDP).

   After creation, a transport system can actively establish
   communication with a peer, or it can passively listen for incoming
   connection requests. Note that active establishment may or may not
   trigger a notification on the listening side. It is possible that
   the first notification on the listening side is the arrival of the
   first data that the active side sends (a receiver-side transport
   system could handle this by continuing to block a "Listen" call,
   immediately followed by issuing "Receive", for example;
   callback-based implementations could simply skip the equivalent of
   "Listen").
   This also means that the active opening side is assumed to be the
   first side sending data.

   A transport system can actively close a connection, i.e. terminate it
   after reliably delivering all remaining data to the peer (if reliable
   data delivery was requested earlier (!UDP)), in which case the peer
   is notified that the connection is closed. Alternatively, a
   connection can be aborted without delivering outstanding data to the
   peer. In case reliable or partially reliable data delivery was
   requested earlier (!UDP), the peer is notified that the connection is
   aborted. A timeout can be configured to abort a connection when data
   could not be delivered for too long (!UDP); however, timeout-based
   abortion does not notify the peer application that the connection has
   been aborted. Because half-closed connections are not supported,
   when a host implementing TAPS receives a notification that the peer
   is closing or aborting the connection (!UDP), its peer may not be
   able to read outstanding data. This means that unacknowledged data
   residing in a transport system's send buffer may have to be dropped
   from that buffer upon arrival of a "close" or "abort" notification
   from the peer.

3.2.  MAINTENANCE

   A transport system must offer means to group connections, but it
   cannot guarantee truly grouping them using the transport protocols
   that it uses (e.g., it cannot be guaranteed that connections become
   multiplexed as streams on a single SCTP association when SCTP may not
   be available). The transport system must therefore ensure that
   group- versus non-group-configurations are handled correctly in some
   way (e.g., by applying the configuration to all grouped connections
   even when they are not multiplexed, or informing the application
   about grouping success or failure).

   As a general rule, any configuration described below should be
   carried out as early as possible to aid the transport system's
   decision making.

3.2.1.  Connection groups

   The following transport features and notifications (some directly
   from Appendix A.2, some new or changed, based on the discussion in
   Appendix A.3) automatically apply to all grouped connections:

   (!UDP) Configure a timeout: this can be done with the following
   parameters:

   o  A timeout value for aborting connections, in seconds
   o  A timeout value to be suggested to the peer (if possible), in
      seconds
   o  The number of retransmissions after which the application should
      be notified of "Excessive Retransmissions"

   Configure urgency: this can be done with the following parameters:

   o  A number to identify the type of scheduler that should be used to
      operate between connections in the group (no guarantees given).
      Schedulers are defined in [RFC8260].
   o  A "capacity profile" number to identify how an application wants
      to use its available capacity. Choices can be "lowest possible
      latency at the expense of overhead" (which would disable any
      Nagle-like algorithm), "scavenger", or values that help determine
      the DSCP value for a connection (e.g. similar to table 1 in
      [I-D.ietf-tsvwg-rtcweb-qos]).
   o  A buffer limit (in bytes); when the sender has less than the
      provided limit of bytes in the buffer, the application may be
      notified. Notifications are not guaranteed, and it is optional
      for a transport system to support buffer limit values greater than
      0. Note that this limit and its notification should operate
      across the buffers of the whole transport system, i.e. also any
      potential buffers that the transport system itself may use on top
      of the transport's send buffer.
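
   To make the timeout and urgency parameters above more concrete, the
   following non-normative Python sketch collects them in one structure.
   The class name GroupConfig, its field names and the numbering of
   scheduler types and capacity profiles are assumptions made for
   illustration; this document does not define them. The method
   drain_due follows the description of the "Drain" notification in
   this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroupConfig:
    """Hypothetical holder for settings shared by all grouped connections."""
    abort_timeout_s: Optional[float] = None           # (!UDP) abort after this many seconds
    suggested_peer_timeout_s: Optional[float] = None  # (!UDP) value suggested to the peer
    excessive_retransmissions: Optional[int] = None   # (!UDP) early-warning threshold
    scheduler: int = 0            # scheduler type (numbering is an assumption)
    capacity_profile: int = 0     # capacity profile (numbering is an assumption)
    buffer_limit_bytes: int = 0   # 0: notify only when the buffer is empty

    def __post_init__(self) -> None:
        # Reject values that cannot be meaningful for any protocol.
        for name in ("abort_timeout_s", "suggested_peer_timeout_s",
                     "excessive_retransmissions"):
            value = getattr(self, name)
            if value is not None and value < 0:
                raise ValueError(f"{name} must not be negative")
        if self.buffer_limit_bytes < 0:
            raise ValueError("buffer_limit_bytes must not be negative")

    def usable_over_udp(self) -> bool:
        """True if none of the timeout settings marked (!UDP) is in use."""
        return (self.abort_timeout_s is None
                and self.suggested_peer_timeout_s is None
                and self.excessive_retransmissions is None)

    def drain_due(self, buffered_bytes: int) -> bool:
        """True if the send buffer is below the limit or completely empty."""
        return buffered_bytes < self.buffer_limit_bytes or buffered_bytes == 0
```

   A transport system that must also run over UDP could, for example,
   call usable_over_udp() before committing to a protocol choice.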

   Following Appendix A.3.7, these properties can be queried:

   o  The maximum message size that may be sent without fragmentation
      via the configured interface. This is optional for a transport
      system to offer, and may return an error ("not available"). It
      can aid applications implementing Path MTU Discovery.
   o  The maximum transport message size that can be sent, in bytes.
      Irrespective of fragmentation, there is a size limit for the
      messages that can be handed over to SCTP or UDP(-Lite); because
      the service provided by a transport system is independent of the
      transport protocol, it must allow an application to query this
      value -- the maximum size of a message in an
      Application-Framed-Bytestream (see Appendix A.3.1). This may also
      return an error when data is not delimited ("not available").
   o  The maximum transport message size that can be received from the
      configured interface, in bytes (or "not available").
   o  The maximum amount of data that can possibly be sent before or
      during connection establishment, in bytes.

   In addition to the already mentioned closing / aborting notifications
   and possible send errors, the following notifications can occur:

   o  Excessive Retransmissions: the configured (or a default) number of
      retransmissions has been reached, yielding this early warning
      below an abortion threshold.
   o  ICMP Arrival (parameter: ICMP message): an ICMP packet carrying
      the conveyed ICMP message has arrived.
   o  ECN Arrival (parameter: ECN value): a packet carrying the conveyed
      ECN value has arrived. This can be useful for applications
      implementing congestion control.
   o  Timeout (parameter: s seconds): data could not be delivered for s
      seconds.
   o  Drain: the send buffer has either drained below the configured
      buffer limit or it has become completely empty.
      This is a generic
      notification that tries to enable uniform access to
      "TCP_NOTSENT_LOWAT" as well as the "SENDER DRY" notification (as
      discussed in Appendix A.3.4 -- SCTP's "SENDER DRY" is a special
      case where the threshold (for unsent data) is 0 and there is also
      no more unacknowledged data in the send buffer).

3.2.2.  Individual connections

   Configure priority or weight for a scheduler, as described in
   [RFC8260].

   Configure checksum usage: this can be done with the following
   parameters, but there is no guarantee that any checksum limitations
   will indeed be enforced (the default behavior is "full coverage,
   checksum enabled"):

   o  A boolean to enable / disable usage of a checksum when sending
   o  The desired coverage (in bytes) of the checksum used when sending
   o  A boolean to enable / disable requiring a checksum when receiving
   o  The required minimum coverage (in bytes) of the checksum when
      receiving

3.3.  DATA Transfer

3.3.1.  Sending Data

   When sending a message, no guarantees are given about the
   preservation of message boundaries to the peer; if message boundaries
   are needed, the receiving application at the peer must know about
   them beforehand (or the transport system cannot use TCP). Note that
   an application should already be able to hand over data before the
   transport system establishes a connection with a chosen transport
   protocol. Regarding the message that is being handed over, the
   following parameters can be used:

   o  Reliability: this parameter is used to convey a choice of: fully
      reliable (!UDP), unreliable without congestion control, unreliable
      (!UDP), partially reliable (see [RFC3758] and [RFC7496] for
      details on how to specify partial reliability) (!UDP). The latter
      two choices are optional for a transport system to offer and may
      result in full reliability.
      Note that applications sending
      unreliable data without congestion control should themselves
      perform congestion control in accordance with [RFC2914].
   o  (!UDP) Ordered: this boolean parameter lets an application choose
      between ordered message delivery (true) and possibly unordered,
      potentially faster message delivery (false).
   o  Bundle: a boolean that expresses a preference for allowing to
      bundle messages (true) or not (false). No guarantees are given.
   o  DelAck: a boolean that, if false, lets an application request that
      the peer not delay the acknowledgement for this message.
   o  Fragment: a boolean that expresses a preference for allowing to
      fragment messages (true) or not (false), at the IP level. No
      guarantees are given.
   o  (!UDP) Idempotent: a boolean that expresses whether a message is
      idempotent (true) or not (false). Idempotent messages may arrive
      multiple times at the receiver (but they will arrive at least
      once). When data is idempotent it can be used by the receiver
      immediately on a connection establishment attempt. Thus, if data
      is handed over before the transport system establishes a
      connection with a chosen transport protocol, stating that a
      message is idempotent facilitates transmitting it to the peer
      application particularly early.

   An application can be notified of a failure to send a specific
   message. There is no guarantee of such notifications, i.e. send
   failures can also silently occur.

3.3.2.  Receiving Data

   A receiving application obtains an "Application-Framed Bytestream"
   (AFra-Bytestream; this concept is further described in
   Appendix A.3.1). In line with TCP's receiver semantics, an
   AFra-Bytestream is just a stream of bytes to the receiver.
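
   Because the receiver only sees a stream of bytes, an application
   that needs message boundaries has to encode them in the data itself.
   The following non-normative Python sketch shows one way to do this.
   It assumes a 4-byte big-endian length prefix per message; this
   framing, the function name frame and the class name Deframer are
   illustrative choices and are not specified by this document.

```python
import struct
from typing import List

# Assumed framing: each message is preceded by its length as a
# 4-byte unsigned integer in network byte order.
_HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so the peer can delimit it."""
    return _HEADER.pack(len(message)) + message


class Deframer:
    """Reassembles framed messages from arbitrarily sized data blocks."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, block: bytes) -> List[bytes]:
        """Add one received data block; return all complete messages."""
        self._buf.extend(block)
        messages = []
        while len(self._buf) >= _HEADER.size:
            (length,) = _HEADER.unpack_from(self._buf)
            end = _HEADER.size + length
            if len(self._buf) < end:
                break  # message continues in the next data block
            messages.append(bytes(self._buf[_HEADER.size:end]))
            del self._buf[:end]
        return messages
```

   The sender would hand frame(message) to the transport system, and
   the receiver would pass every arriving data block to feed(). An
   incomplete message at the end of a block is kept until the rest
   arrives.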
   If message
   boundaries were specified by the sender, a receiver-side transport
   system implementing only the minimum set of transport services
   defined here will still not inform the receiving application about
   them (this limitation is only needed for transport systems that are
   implemented to directly use TCP).

   Different from TCP's semantics, if the sending application has
   allowed that messages are not fully reliably transferred, or
   delivered out of order, then such re-ordering or unreliability may be
   reflected per message in the arriving data. Messages will always
   stay intact - i.e. if an incomplete message is contained at the end
   of the arriving data block, this message is guaranteed to continue in
   the next arriving data block.

4.  Conclusion

   By decoupling applications from transport protocols, a TAPS transport
   system provides a different abstraction level than the Berkeley
   sockets interface. As with high- vs. low-level programming
   languages, a higher abstraction level allows more freedom for
   automation below the interface, yet it takes some control away from
   the application programmer. This is the design trade-off that a
   transport system developer is facing, and this document provides
   guidance on the design of this abstraction level. Some transport
   features are currently rarely offered by APIs, yet they must be
   offered or they can never be used ("functional" transport features).
   Other transport features are offered by the APIs of the protocols
   covered here, but not exposing them in a TAPS API would allow for
   more freedom to automate protocol usage in a transport system. The
   minimal set presented in this document is an effort to find a middle
   ground that can be recommended for transport systems to implement, on
   the basis of the transport features discussed in [RFC8303].

5.  Acknowledgements

   The authors would like to thank all the participants of the TAPS
   Working Group and the NEAT and MAMI research projects for valuable
   input to this document. We especially thank Michael Tuexen for help
   with connection establishment/teardown and Gorry Fairhurst
   for his suggestions regarding fragmentation and packet sizes. This
   work has received funding from the European Union's Horizon 2020
   research and innovation programme under grant agreement No. 644334
   (NEAT).

6.  IANA Considerations

   XX RFC ED - PLEASE REMOVE THIS SECTION XXX

   This memo includes no request to IANA.

7.  Security Considerations

   Authentication, confidentiality protection, and integrity protection
   are identified as transport features by [RFC8095]. As currently
   deployed in the Internet, these features are generally provided by a
   protocol or layer on top of the transport protocol; no current
   full-featured standards-track transport protocol provides all of
   these transport features on its own. Therefore, these transport
   features are not considered in this document, with the exception of
   native authentication capabilities of TCP and SCTP for which the
   security considerations in [RFC5925] and [RFC4895] apply. The minimum
   security requirements for a TAPS system are discussed in a separate
   security document [I-D.pauly-taps-transport-security].

8.  References

8.1.  Normative References

   [RFC8303]  Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of
              Transport Features Provided by IETF Transport Protocols",
              RFC 8303, DOI 10.17487/RFC8303, February 2018.

8.2.  Informative References

   [COBS]     Cheshire, S. and M. Baker, "Consistent Overhead Byte
              Stuffing", September 1997.

   [I-D.ietf-tsvwg-rtcweb-qos]
              Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP
              Packet Markings for WebRTC QoS",
              draft-ietf-tsvwg-rtcweb-qos-18 (work in progress),
              August 2016.

   [I-D.pauly-taps-transport-security]
              Pauly, T., Perkins, C., Rose, K., and C. Wood, "A Survey
              of Transport Security Protocols",
              draft-pauly-taps-transport-security-02 (work in
              progress), March 2018.

   [LBE-draft]
              Bless, R., "A Lower Effort Per-Hop Behavior (LE PHB)",
              Internet-draft draft-tsvwg-le-phb-03, February 2018.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, DOI 10.17487/RFC2914, September 2000.

   [RFC3758]  Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P.
              Conrad, "Stream Control Transmission Protocol (SCTP)
              Partial Reliability Extension", RFC 3758,
              DOI 10.17487/RFC3758, May 2004.

   [RFC4895]  Tuexen, M., Stewart, R., Lei, P., and E. Rescorla,
              "Authenticated Chunks for the Stream Control Transmission
              Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895, August
              2007.

   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
              Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
              June 2010.

   [RFC7305]  Lear, E., Ed., "Report from the IAB Workshop on Internet
              Technology Adoption and Transition (ITAT)", RFC 7305,
              DOI 10.17487/RFC7305, July 2014.

   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014.

   [RFC7496]  Tuexen, M., Seggelmann, R., Stewart, R., and S. Loreto,
              "Additional Policies for the Partially Reliable Stream
              Control Transmission Protocol Extension", RFC 7496,
              DOI 10.17487/RFC7496, April 2015.

   [RFC8095]  Fairhurst, G., Ed., Trammell, B., Ed., and M. Kuehlewind,
              Ed., "Services Provided by IETF Transport Protocols and
              Congestion Control Mechanisms", RFC 8095,
              DOI 10.17487/RFC8095, March 2017.

   [RFC8260]  Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann,
              "Stream Schedulers and User Message Interleaving for the
              Stream Control Transmission Protocol", RFC 8260,
              DOI 10.17487/RFC8260, November 2017.

   [RFC8304]  Fairhurst, G. and T. Jones, "Transport Features of the
              User Datagram Protocol (UDP) and Lightweight UDP
              (UDP-Lite)", RFC 8304, DOI 10.17487/RFC8304, February
              2018.

   [WWDC2015]
              Lakhera, P. and S. Cheshire, "Your App and Next Generation
              Networks", Apple Worldwide Developers Conference 2015, San
              Francisco, USA, June 2015.

Appendix A.  Deriving the minimal set

   We approach the construction of a minimal set of transport features
   in the following way:

   1.  Categorization: the superset of transport features from [RFC8303]
       is presented, and transport features are categorized for later
       reduction.
   2.  Reduction: a shorter list of transport features is derived from
       the categorization in the first step. This removes all transport
       features that do not require application-specific knowledge or
       cannot be implemented with TCP or UDP.
   3.  Discussion: the resulting list shows a number of peculiarities
       that are discussed, to provide a basis for constructing the
       minimal set.
   4.  Construction: Based on the reduced set and the discussion of the
       transport features therein, a minimal set is constructed.

   The first three steps as well as the underlying rationale for
   constructing the minimal set are described in this appendix. The
   minimal set itself is described in Section 3.

A.1.  Step 1: Categorization -- The Superset of Transport Features

   Following [RFC8303], we divide the transport features into two main
   groups as follows:

   1.
CONNECTION related transport features 689 - ESTABLISHMENT 690 - AVAILABILITY 691 - MAINTENANCE 692 - TERMINATION 694 2. DATA Transfer related transport features 695 - Sending Data 696 - Receiving Data 697 - Errors 699 We assume that applications have no specific requirements that need 700 knowledge about the network, e.g. regarding the choice of network 701 interface or the end-to-end path. Even with these assumptions, there 702 are certain requirements that are strictly kept by transport 703 protocols today, and these must also be kept by a transport system. 704 Some of these requirements relate to transport features that we call 705 "Functional". 707 Functional transport features provide functionality that cannot be 708 used without the application knowing about them, or else they violate 709 assumptions that might cause the application to fail. For example, 710 ordered message delivery is a functional transport feature: it cannot 711 be configured without the application knowing about it because the 712 application's assumption could be that messages always arrive in 713 order. Failure includes any change of the application behavior that 714 is not performance oriented, e.g. security. 716 "Change DSCP" and "Disable Nagle algorithm" are examples of transport 717 features that we call "Optimizing": if a transport system 718 autonomously decides to enable or disable them, an application will 719 not fail, but a transport system may be able to communicate more 720 efficiently if the application is in control of this optimizing 721 transport feature. These transport features require application- 722 specific knowledge (e.g., about delay/bandwidth requirements or the 723 length of future data blocks that are to be transmitted). 725 The transport features of IETF transport protocols that do not 726 require application-specific knowledge and could therefore be 727 transparently utilized by a transport system are called 728 "Automatable". 
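The three categories defined above can be sketched as a small data model. This is only an illustration: the enum, the lookup table, and the function names are hypothetical and do not come from [RFC8303]. The table entries reuse features that this appendix itself categorizes. The filter models only the first criterion of the reduction step (dropping features that need no application-specific knowledge) and does not model whether a feature can be implemented with TCP or UDP.

```python
from enum import Enum


class Category(Enum):
    FUNCTIONAL = "functional"    # cannot be used without the application knowing
    OPTIMIZING = "optimizing"    # application knowledge improves efficiency
    AUTOMATABLE = "automatable"  # needs no application-specific knowledge


# Hypothetical lookup table; entries mirror examples from this appendix.
EXAMPLES = {
    "Ordered message delivery": Category.FUNCTIONAL,
    "Change DSCP": Category.OPTIMIZING,
    "Disable Nagle algorithm": Category.OPTIMIZING,
    "Listen, all local interfaces": Category.AUTOMATABLE,
}


def needs_application_knowledge(category):
    """Functional and optimizing features depend on the application."""
    return category is not Category.AUTOMATABLE


def reduced(features):
    """Return the names of features that are not automatable, sorted."""
    return sorted(
        name for name, cat in features.items()
        if needs_application_knowledge(cat)
    )
```

Under these assumptions, calling reduced(EXAMPLES) keeps the functional and optimizing entries and drops the automatable one.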
730 Finally, some transport features are aggregated and/or slightly 731 changed in the description below. These transport features are 732 marked as "ADDED". The corresponding transport features are 733 automatable, and they are listed immediately below the "ADDED" 734 transport feature. 736 In this description, transport services are presented following the 737 nomenclature "CATEGORY.[SUBCATEGORY].SERVICENAME.PROTOCOL", 738 equivalent to "pass 2" in [RFC8303]. We also sketch how some of the 739 TAPS transport features can be implemented by a transport system. 740 For all transport features that are categorized as "functional" or 741 "optimizing", and for which no matching TCP and/or UDP primitive 742 exists in "pass 2" of [RFC8303], a brief discussion on how to 743 implement them over TCP and/or UDP is included. 745 We designate some transport features as "automatable" on the basis of 746 a broader decision that affects multiple transport features: 748 o Most transport features that are related to multi-streaming were 749 designated as "automatable". This was done because the decision 750 on whether to use multi-streaming or not does not depend on 751 application-specific knowledge. This means that a connection that 752 is exhibited to an application could be implemented by using a 753 single stream of an SCTP association instead of mapping it to a 754 complete SCTP association or TCP connection. This could be 755 achieved by using more than one stream when an SCTP association is 756 first established (CONNECT.SCTP parameter "outbound stream 757 count"), maintaining an internal stream number, and using this 758 stream number when sending data (SEND.SCTP parameter "stream 759 number"). Closing or aborting a connection could then simply free 760 the stream number for future use. This is discussed further in 761 Appendix A.3.2. 
762 o All transport features that are related to using multiple paths or 763 the choice of the network interface were designated as 764 "automatable". Choosing a path or an interface does not depend on 765 application-specific knowledge. For example, "Listen" could 766 always listen on all available interfaces and "Connect" could use 767 the default interface for the destination IP address. 769 A.1.1. CONNECTION Related Transport Features 771 ESTABLISHMENT: 773 o Connect 774 Protocols: TCP, SCTP, UDP(-Lite) 775 Functional because the notion of a connection is often reflected 776 in applications as an expectation to be able to communicate after 777 a "Connect" succeeded, with a communication sequence relating to 778 this transport feature that is defined by the application 779 protocol. 780 Implementation: via CONNECT.TCP, CONNECT.SCTP or CONNECT.UDP(- 781 Lite). 783 o Specify which IP Options must always be used 784 Protocols: TCP, UDP(-Lite) 785 Automatable because IP Options relate to knowledge about the 786 network, not the application. 788 o Request multiple streams 789 Protocols: SCTP 790 Automatable because using multi-streaming does not require 791 application-specific knowledge. 792 Implementation: see Appendix A.3.2. 794 o Limit the number of inbound streams 795 Protocols: SCTP 796 Automatable because using multi-streaming does not require 797 application-specific knowledge. 799 Implementation: see Appendix A.3.2. 801 o Specify number of attempts and/or timeout for the first 802 establishment message 803 Protocols: TCP, SCTP 804 Functional because this is closely related to potentially assumed 805 reliable data delivery for data that is sent before or during 806 connection establishment. 807 Implementation: Using a parameter of CONNECT.TCP and CONNECT.SCTP. 808 Implementation over UDP: Do nothing (this is irrelevant in case of 809 UDP because there, reliable data delivery is not assumed). 
811 o Obtain multiple sockets 812 Protocols: SCTP 813 Automatable because the usage of multiple paths to communicate to 814 the same end host relates to knowledge about the network, not the 815 application. 817 o Disable MPTCP 818 Protocols: MPTCP 819 Automatable because the usage of multiple paths to communicate to 820 the same end host relates to knowledge about the network, not the 821 application. 822 Implementation: via a boolean parameter in CONNECT.MPTCP. 824 o Configure authentication 825 Protocols: TCP, SCTP 826 Functional because this has a direct influence on security. 827 Implementation: via parameters in CONNECT.TCP and CONNECT.SCTP. 828 Implementation over TCP: With TCP, this allows to configure Master 829 Key Tuples (MKTs) to authenticate complete segments (including the 830 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 831 allows to specify which chunk types must always be authenticated. 832 Authenticating only certain chunk types creates a reduced level of 833 security that is not supported by TCP; to be compatible, this 834 should therefore only allow to authenticate all chunk types. Key 835 material must be provided in a way that is compatible with both 836 [RFC4895] and [RFC5925]. 837 Implementation over UDP: Not possible. 839 o Indicate (and/or obtain upon completion) an Adaptation Layer via 840 an adaptation code point 841 Protocols: SCTP 842 Functional because it allows to send extra data for the sake of 843 identifying an adaptation layer, which by itself is application- 844 specific. 845 Implementation: via a parameter in CONNECT.SCTP. 846 Implementation over TCP: not possible. 847 Implementation over UDP: not possible. 849 o Request to negotiate interleaving of user messages 850 Protocols: SCTP 851 Automatable because it requires using multiple streams, but 852 requesting multiple streams in the CONNECTION.ESTABLISHMENT 853 category is automatable. 854 Implementation: via a parameter in CONNECT.SCTP. 
856 o Hand over a message to reliably transfer (possibly multiple times) 857 before connection establishment 858 Protocols: TCP 859 Functional because this is closely tied to properties of the data 860 that an application sends or expects to receive. 861 Implementation: via a parameter in CONNECT.TCP. 862 Implementation over UDP: not possible. 864 o Hand over a message to reliably transfer during connection 865 establishment 866 Protocols: SCTP 867 Functional because this can only work if the message is limited in 868 size, making it closely tied to properties of the data that an 869 application sends or expects to receive. 870 Implementation: via a parameter in CONNECT.SCTP. 871 Implementation over UDP: not possible. 873 o Enable UDP encapsulation with a specified remote UDP port number 874 Protocols: SCTP 875 Automatable because UDP encapsulation relates to knowledge about 876 the network, not the application. 878 AVAILABILITY: 880 o Listen 881 Protocols: TCP, SCTP, UDP(-Lite) 882 Functional because the notion of accepting connection requests is 883 often reflected in applications as an expectation to be able to 884 communicate after a "Listen" succeeded, with a communication 885 sequence relating to this transport feature that is defined by the 886 application protocol. 887 ADDED. This differs from the 3 automatable transport features 888 below in that it leaves the choice of interfaces for listening 889 open. 890 Implementation: by listening on all interfaces via LISTEN.TCP (not 891 providing a local IP address) or LISTEN.SCTP (providing SCTP port 892 number / address pairs for all local IP addresses). LISTEN.UDP(- 893 Lite) supports both methods. 895 o Listen, 1 specified local interface 896 Protocols: TCP, SCTP, UDP(-Lite) 897 Automatable because decisions about local interfaces relate to 898 knowledge about the network and the Operating System, not the 899 application. 
901 o Listen, N specified local interfaces 902 Protocols: SCTP 903 Automatable because decisions about local interfaces relate to 904 knowledge about the network and the Operating System, not the 905 application. 907 o Listen, all local interfaces 908 Protocols: TCP, SCTP, UDP(-Lite) 909 Automatable because decisions about local interfaces relate to 910 knowledge about the network and the Operating System, not the 911 application. 913 o Specify which IP Options must always be used 914 Protocols: TCP, UDP(-Lite) 915 Automatable because IP Options relate to knowledge about the 916 network, not the application. 918 o Disable MPTCP 919 Protocols: MPTCP 920 Automatable because the usage of multiple paths to communicate to 921 the same end host relates to knowledge about the network, not the 922 application. 924 o Configure authentication 925 Protocols: TCP, SCTP 926 Functional because this has a direct influence on security. 927 Implementation: via parameters in LISTEN.TCP and LISTEN.SCTP. 928 Implementation over TCP: With TCP, this allows to configure Master 929 Key Tuples (MKTs) to authenticate complete segments (including the 930 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 931 allows to specify which chunk types must always be authenticated. 932 Authenticating only certain chunk types creates a reduced level of 933 security that is not supported by TCP; to be compatible, this 934 should therefore only allow to authenticate all chunk types. Key 935 material must be provided in a way that is compatible with both 936 [RFC4895] and [RFC5925]. 937 Implementation over UDP: not possible. 939 o Obtain requested number of streams 940 Protocols: SCTP 941 Automatable because using multi-streaming does not require 942 application-specific knowledge. 943 Implementation: see Appendix A.3.2. 945 o Limit the number of inbound streams 946 Protocols: SCTP 947 Automatable because using multi-streaming does not require 948 application-specific knowledge. 
949 Implementation: see Appendix A.3.2. 951 o Indicate (and/or obtain upon completion) an Adaptation Layer via 952 an adaptation code point 953 Protocols: SCTP 954 Functional because it allows to send extra data for the sake of 955 identifying an adaptation layer, which by itself is application- 956 specific. 957 Implementation: via a parameter in LISTEN.SCTP. 958 Implementation over TCP: not possible. 959 Implementation over UDP: not possible. 961 o Request to negotiate interleaving of user messages 962 Protocols: SCTP 963 Automatable because it requires using multiple streams, but 964 requesting multiple streams in the CONNECTION.ESTABLISHMENT 965 category is automatable. 966 Implementation: via a parameter in LISTEN.SCTP. 968 MAINTENANCE: 970 o Change timeout for aborting connection (using retransmit limit or 971 time value) 972 Protocols: TCP, SCTP 973 Functional because this is closely related to potentially assumed 974 reliable data delivery. 975 Implementation: via CHANGE_TIMEOUT.TCP or CHANGE_TIMEOUT.SCTP. 976 Implementation over UDP: not possible (UDP is unreliable and there 977 is no connection timeout). 979 o Suggest timeout to the peer 980 Protocols: TCP 981 Functional because this is closely related to potentially assumed 982 reliable data delivery. 983 Implementation: via CHANGE_TIMEOUT.TCP. 984 Implementation over UDP: not possible (UDP is unreliable and there 985 is no connection timeout). 987 o Disable Nagle algorithm 988 Protocols: TCP, SCTP 989 Optimizing because this decision depends on knowledge about the 990 size of future data blocks and the delay between them. 991 Implementation: via DISABLE_NAGLE.TCP and DISABLE_NAGLE.SCTP. 992 Implementation over UDP: do nothing (UDP does not implement the 993 Nagle algorithm). 995 o Request an immediate heartbeat, returning success/failure 996 Protocols: SCTP 997 Automatable because this informs about network-specific knowledge. 
999 o Notification of Excessive Retransmissions (early warning below 1000 abortion threshold) 1001 Protocols: TCP 1002 Optimizing because it is an early warning to the application, 1003 informing it of an impending functional event. 1004 Implementation: via ERROR.TCP. 1005 Implementation over UDP: do nothing (there is no abortion 1006 threshold). 1008 o Add path 1009 Protocols: MPTCP, SCTP 1010 MPTCP Parameters: source-IP; source-Port; destination-IP; 1011 destination-Port 1012 SCTP Parameters: local IP address 1013 Automatable because the usage of multiple paths to communicate to 1014 the same end host relates to knowledge about the network, not the 1015 application. 1017 o Remove path 1018 Protocols: MPTCP, SCTP 1019 MPTCP Parameters: source-IP; source-Port; destination-IP; 1020 destination-Port 1021 SCTP Parameters: local IP address 1022 Automatable because the usage of multiple paths to communicate to 1023 the same end host relates to knowledge about the network, not the 1024 application. 1026 o Set primary path 1027 Protocols: SCTP 1028 Automatable because the usage of multiple paths to communicate to 1029 the same end host relates to knowledge about the network, not the 1030 application. 1032 o Suggest primary path to the peer 1033 Protocols: SCTP 1034 Automatable because the usage of multiple paths to communicate to 1035 the same end host relates to knowledge about the network, not the 1036 application. 1038 o Configure Path Switchover 1039 Protocols: SCTP 1040 Automatable because the usage of multiple paths to communicate to 1041 the same end host relates to knowledge about the network, not the 1042 application. 
1044 o Obtain status (query or notification) 1045 Protocols: SCTP, MPTCP 1046 SCTP parameters: association connection state; destination 1047 transport address list; destination transport address reachability 1048 states; current local and peer receiver window size; current local 1049 congestion window sizes; number of unacknowledged DATA chunks; 1050 number of DATA chunks pending receipt; primary path; most recent 1051 SRTT on primary path; RTO on primary path; SRTT and RTO on other 1052 destination addresses; MTU per path; interleaving supported yes/no 1053 MPTCP parameters: subflow-list (identified by source-IP; source- 1054 Port; destination-IP; destination-Port) 1055 Automatable because these parameters relate to knowledge about the 1056 network, not the application. 1058 o Specify DSCP field 1059 Protocols: TCP, SCTP, UDP(-Lite) 1060 Optimizing because choosing a suitable DSCP value requires 1061 application-specific knowledge. 1062 Implementation: via SET_DSCP.TCP / SET_DSCP.SCTP / SET_DSCP.UDP(- 1063 Lite) 1065 o Notification of ICMP error message arrival 1066 Protocols: TCP, UDP(-Lite) 1067 Optimizing because these messages can inform about success or 1068 failure of functional transport features (e.g., host unreachable 1069 relates to "Connect") 1070 Implementation: via ERROR.TCP or ERROR.UDP(-Lite). 1072 o Obtain information about interleaving support 1073 Protocols: SCTP 1074 Automatable because it requires using multiple streams, but 1075 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1076 category is automatable. 1077 Implementation: via STATUS.SCTP. 1079 o Change authentication parameters 1080 Protocols: TCP, SCTP 1081 Functional because this has a direct influence on security. 1082 Implementation: via SET_AUTH.TCP and SET_AUTH.SCTP. 1083 Implementation over TCP: With SCTP, this allows to adjust key_id, 1084 key, and hmac_id. 
With TCP, this allows to change the preferred 1085 outgoing MKT (current_key) and the preferred incoming MKT 1086 (rnext_key), respectively, for a segment that is sent on the 1087 connection. Key material must be provided in a way that is 1088 compatible with both [RFC4895] and [RFC5925]. 1089 Implementation over UDP: not possible. 1091 o Obtain authentication information 1092 Protocols: SCTP 1093 Functional because authentication decisions may have been made by 1094 the peer, and this has an influence on the necessary application- 1095 level measures to provide a certain level of security. 1096 Implementation: via GET_AUTH.SCTP. 1097 Implementation over TCP: With SCTP, this allows to obtain key_id 1098 and a chunk list. With TCP, this allows to obtain current_key and 1099 rnext_key from a previously received segment. Key material must 1100 be provided in a way that is compatible with both [RFC4895] and 1101 [RFC5925]. 1102 Implementation over UDP: not possible. 1104 o Reset Stream 1105 Protocols: SCTP 1106 Automatable because using multi-streaming does not require 1107 application-specific knowledge. 1108 Implementation: see Appendix A.3.2. 1110 o Notification of Stream Reset 1111 Protocols: SCTP 1112 Automatable because using multi-streaming does not require 1113 application-specific knowledge. 1114 Implementation: see Appendix A.3.2. 1116 o Reset Association 1117 Protocols: SCTP 1118 Automatable because deciding to reset an association does not 1119 require application-specific knowledge. 1120 Implementation: via RESET_ASSOC.SCTP. 1122 o Notification of Association Reset 1123 Protocols: SCTP 1124 Automatable because this notification does not relate to 1125 application-specific knowledge. 1127 o Add Streams 1128 Protocols: SCTP 1129 Automatable because using multi-streaming does not require 1130 application-specific knowledge. 1131 Implementation: see Appendix A.3.2.
1133 o Notification of Added Stream 1134 Protocols: SCTP 1135 Automatable because using multi-streaming does not require 1136 application-specific knowledge. 1137 Implementation: see Appendix A.3.2. 1139 o Choose a scheduler to operate between streams of an association 1140 Protocols: SCTP 1141 Optimizing because the scheduling decision requires application- 1142 specific knowledge. However, if a transport system did not use 1143 this, or wrongly configured it on its own, this would only affect 1144 the performance of data transfers; the outcome would still be 1145 correct within the "best effort" service model. 1146 Implementation: using SET_STREAM_SCHEDULER.SCTP. 1147 Implementation over TCP: do nothing. 1148 Implementation over UDP: do nothing. 1150 o Configure priority or weight for a scheduler 1151 Protocols: SCTP 1152 Optimizing because the priority or weight requires application- 1153 specific knowledge. However, if a transport system did not use 1154 this, or wrongly configured it on its own, this would only affect 1155 the performance of data transfers; the outcome would still be 1156 correct within the "best effort" service model. 1157 Implementation: using CONFIGURE_STREAM_SCHEDULER.SCTP. 1158 Implementation over TCP: do nothing. 1159 Implementation over UDP: do nothing. 1161 o Configure send buffer size 1162 Protocols: SCTP 1163 Automatable because this decision relates to knowledge about the 1164 network and the Operating System, not the application (see also 1165 the discussion in Appendix A.3.4). 1167 o Configure receive buffer (and rwnd) size 1168 Protocols: SCTP 1169 Automatable because this decision relates to knowledge about the 1170 network and the Operating System, not the application. 1172 o Configure message fragmentation 1173 Protocols: SCTP 1174 Automatable because fragmentation relates to knowledge about the 1175 network and the Operating System, not the application.
1176 Implementation: by always enabling it with 1177 CONFIG_FRAGMENTATION.SCTP and auto-setting the fragmentation size 1178 based on network or Operating System conditions. 1180 o Configure PMTUD 1181 Protocols: SCTP 1182 Automatable because Path MTU Discovery relates to knowledge about 1183 the network, not the application. 1185 o Configure delayed SACK timer 1186 Protocols: SCTP 1187 Automatable because the receiver-side decision to delay sending 1188 SACKs relates to knowledge about the network, not the application 1189 (it can be relevant for a sending application to request not to 1190 delay the SACK of a message, but this is a different transport 1191 feature). 1193 o Set Cookie life value 1194 Protocols: SCTP 1195 Functional because it relates to security (possibly weakened by 1196 keeping a cookie very long) versus the time between connection 1197 establishment attempts. Knowledge about both issues can be 1198 application-specific. 1200 Implementation over TCP: the closest specified TCP functionality 1201 is the cookie in TCP Fast Open; for this, [RFC7413] states that 1202 the server "can expire the cookie at any time to enhance security" 1203 and section 4.1.2 describes an example implementation where 1204 updating the key on the server side causes the cookie to expire. 1205 Alternatively, for implementations that do not support TCP Fast 1206 Open, this transport feature could also affect the validity of SYN 1207 cookies (see Section 3.6 of [RFC4987]). 1208 Implementation over UDP: do nothing. 1210 o Set maximum burst 1211 Protocols: SCTP 1212 Automatable because it relates to knowledge about the network, not 1213 the application. 1215 o Configure size where messages are broken up for partial delivery 1216 Protocols: SCTP 1217 Functional because this is closely tied to properties of the data 1218 that an application sends or expects to receive. 1219 Implementation over TCP: not possible. 1220 Implementation over UDP: not possible. 
1222 o Disable checksum when sending 1223 Protocols: UDP 1224 Functional because application-specific knowledge is necessary to 1225 decide whether it can be acceptable to lose data integrity. 1226 Implementation: via SET_CHECKSUM_ENABLED.UDP. 1227 Implementation over TCP: do nothing. 1229 o Disable checksum requirement when receiving 1230 Protocols: UDP 1231 Functional because application-specific knowledge is necessary to 1232 decide whether it can be acceptable to lose data integrity. 1233 Implementation: via SET_CHECKSUM_REQUIRED.UDP. 1234 Implementation over TCP: do nothing. 1236 o Specify checksum coverage used by the sender 1237 Protocols: UDP-Lite 1238 Functional because application-specific knowledge is necessary to 1239 decide for which parts of the data it can be acceptable to lose 1240 data integrity. 1241 Implementation: via SET_CHECKSUM_COVERAGE.UDP-Lite. 1242 Implementation over TCP: do nothing. 1244 o Specify minimum checksum coverage required by receiver 1245 Protocols: UDP-Lite 1246 Functional because application-specific knowledge is necessary to 1247 decide for which parts of the data it can be acceptable to lose 1248 data integrity. 1249 Implementation: via SET_MIN_CHECKSUM_COVERAGE.UDP-Lite. 1250 Implementation over TCP: do nothing. 1252 o Specify DF field 1253 Protocols: UDP(-Lite) 1254 Optimizing because the DF field can be used to carry out Path MTU 1255 Discovery, which can lead an application to choose message sizes 1256 that can be transmitted more efficiently. 1257 Implementation: via MAINTENANCE.SET_DF.UDP(-Lite) and 1258 SEND_FAILURE.UDP(-Lite). 1259 Implementation over TCP: do nothing. With TCP the sender is not 1260 in control of transport message sizes, making this functionality 1261 irrelevant. 1263 o Get max. 
transport-message size that may be sent using a non- 1264 fragmented IP packet from the configured interface 1265 Protocols: UDP(-Lite) 1266 Optimizing because this can lead an application to choose message 1267 sizes that can be transmitted more efficiently. 1268 Implementation over TCP: do nothing: this information is not 1269 available with TCP. 1271 o Get max. transport-message size that may be received from the 1272 configured interface 1273 Protocols: UDP(-Lite) 1274 Optimizing because this can, for example, influence an 1275 application's memory management. 1276 Implementation over TCP: do nothing: this information is not 1277 available with TCP. 1279 o Specify TTL/Hop count field 1280 Protocols: UDP(-Lite) 1281 Automatable because a transport system can use a large enough 1282 system default to avoid communication failures. Allowing an 1283 application to configure it differently can produce notifications 1284 of ICMP error message arrivals that yield information which only 1285 relates to knowledge about the network, not the application. 1287 o Obtain TTL/Hop count field 1288 Protocols: UDP(-Lite) 1289 Automatable because the TTL/Hop count field relates to knowledge 1290 about the network, not the application. 1292 o Specify ECN field 1293 Protocols: UDP(-Lite) 1294 Automatable because the ECN field relates to knowledge about the 1295 network, not the application. 1297 o Obtain ECN field 1298 Protocols: UDP(-Lite) 1299 Optimizing because this information can be used by an application 1300 to better carry out congestion control (this is relevant when 1301 choosing a data transmission transport service that does not 1302 already do congestion control). 1303 Implementation over TCP: do nothing: this information is not 1304 available with TCP. 1306 o Specify IP Options 1307 Protocols: UDP(-Lite) 1308 Automatable because IP Options relate to knowledge about the 1309 network, not the application. 
1311 o Obtain IP Options 1312 Protocols: UDP(-Lite) 1313 Automatable because IP Options relate to knowledge about the 1314 network, not the application. 1316 o Enable and configure a "Low Extra Delay Background Transfer" 1317 Protocols: A protocol implementing the LEDBAT congestion control 1318 mechanism 1319 Optimizing because whether this service is appropriate or not 1320 depends on application-specific knowledge. However, wrongly using 1321 this will only affect the speed of data transfers (albeit 1322 including other transfers that may compete with the transport 1323 system's transfer in the network), so it is still correct within 1324 the "best effort" service model. 1325 Implementation: via CONFIGURE.LEDBAT and/or SET_DSCP.TCP / 1326 SET_DSCP.SCTP / SET_DSCP.UDP(-Lite) [LBE-draft]. 1327 Implementation over TCP: do nothing. 1328 Implementation over UDP: do nothing. 1330 TERMINATION: 1332 o Close after reliably delivering all remaining data, causing an 1333 event informing the application on the other side 1334 Protocols: TCP, SCTP 1335 Functional because the notion of a connection is often reflected 1336 in applications as an expectation to have all outstanding data 1337 delivered and no longer be able to communicate after a "Close" 1338 succeeded, with a communication sequence relating to this 1339 transport feature that is defined by the application protocol. 1340 Implementation: via CLOSE.TCP and CLOSE.SCTP. 1341 Implementation over UDP: not possible. 1343 o Abort without delivering remaining data, causing an event 1344 informing the application on the other side 1345 Protocols: TCP, SCTP 1346 Functional because the notion of a connection is often reflected 1347 in applications as an expectation to potentially not have all 1348 outstanding data delivered and no longer be able to communicate 1349 after an "Abort" succeeded. 
      On both sides of a connection, an application protocol may define
      a communication sequence relating to this transport feature.
      Implementation: via ABORT.TCP and ABORT.SCTP.
      Implementation over UDP: not possible.

   o  Abort without delivering remaining data, not causing an event
      informing the application on the other side
      Protocols: UDP(-Lite)
      Functional because the notion of a connection is often reflected
      in applications as an expectation to potentially not have all
      outstanding data delivered and no longer be able to communicate
      after an "Abort" succeeded. On both sides of a connection, an
      application protocol may define a communication sequence relating
      to this transport feature.
      Implementation: via ABORT.UDP(-Lite).
      Implementation over TCP: stop using the connection, wait for a
      timeout.

   o  Timeout event when data could not be delivered for too long
      Protocols: TCP, SCTP
      Functional because this notifies that potentially assumed reliable
      data delivery is no longer provided.
      Implementation: via TIMEOUT.TCP and TIMEOUT.SCTP.
      Implementation over UDP: do nothing: this event will not occur
      with UDP.

A.1.2. DATA Transfer Related Transport Features

A.1.2.1. Sending Data

   o  Reliably transfer data, with congestion control
      Protocols: TCP, SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.TCP and SEND.SCTP.
      Implementation over UDP: not possible.

   o  Reliably transfer a message, with congestion control
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.SCTP.
      Implementation over TCP: via SEND.TCP. With SEND.TCP, messages
      will not be identifiable by the receiver.
      Implementation over UDP: not possible.

   o  Unreliably transfer a message
      Protocols: SCTP, UDP(-Lite)
      Optimizing because only applications know about the time
      criticality of their communication, and reliably transferring a
      message is never incorrect for the receiver of a potentially
      unreliable data transfer, it is just slower.
      ADDED. This differs from the 2 automatable transport features
      below in that it leaves the choice of congestion control open.
      Implementation: via SEND.SCTP or SEND.UDP(-Lite).
      Implementation over TCP: use SEND.TCP. With SEND.TCP, messages
      will be sent reliably, and they will not be identifiable by the
      receiver.

   o  Unreliably transfer a message, with congestion control
      Protocols: SCTP
      Automatable because congestion control relates to knowledge about
      the network, not the application.

   o  Unreliably transfer a message, without congestion control
      Protocols: UDP(-Lite)
      Automatable because congestion control relates to knowledge about
      the network, not the application.

   o  Configurable Message Reliability
      Protocols: SCTP
      Optimizing because only applications know about the time
      criticality of their communication, and reliably transferring a
      message is never incorrect for the receiver of a potentially
      unreliable data transfer, it is just slower.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP and ignoring this
      configuration: based on the assumption of the best-effort service
      model, unnecessarily delivering data does not violate application
      expectations. Moreover, it is not possible to associate the
      requested reliability to a "message" in TCP anyway.
      Implementation over UDP: not possible.

   o  Choice of stream
      Protocols: SCTP
      Automatable because it requires using multiple streams, but
      requesting multiple streams in the CONNECTION.ESTABLISHMENT
      category is automatable.
      Implementation: see Appendix A.3.2.

   o  Choice of path (destination address)
      Protocols: SCTP
      Automatable because it requires using multiple sockets, but
      obtaining multiple sockets in the CONNECTION.ESTABLISHMENT
      category is automatable.

   o  Ordered message delivery (potentially slower than unordered)
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP. With SEND.TCP,
      messages will not be identifiable by the receiver.
      Implementation over UDP: not possible.

   o  Unordered message delivery (potentially faster than ordered)
      Protocols: SCTP, UDP(-Lite)
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP and always sending data
      ordered: based on the assumption of the best-effort service model,
      ordered delivery may just be slower and does not violate
      application expectations. Moreover, it is not possible to
      associate the requested delivery order to a "message" in TCP
      anyway.

   o  Request not to bundle messages
      Protocols: SCTP
      Optimizing because this decision depends on knowledge about the
      size of future data blocks and the delay between them.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP and DISABLE_NAGLE.TCP
      to disable the Nagle algorithm when the request is made and enable
      it again when the request is no longer made.
      Note that this is not fully equivalent because it relates to the
      time of issuing the request rather than a specific message.
      Implementation over UDP: do nothing (UDP never bundles messages).

   o  Specifying a "payload protocol-id" (handed over as such by the
      receiver)
      Protocols: SCTP
      Functional because it allows sending extra application data with
      every message, for the sake of identification of data, which by
      itself is application-specific.
      Implementation: SEND.SCTP.
      Implementation over TCP: not possible.
      Implementation over UDP: not possible.

   o  Specifying a key id to be used to authenticate a message
      Protocols: SCTP
      Functional because this has a direct influence on security.
      Implementation: via a parameter in SEND.SCTP.
      Implementation over TCP: This could be emulated by using
      SET_AUTH.TCP before and after the message is sent. Note that this
      is not fully equivalent because it relates to the time of issuing
      the request rather than a specific message.
      Implementation over UDP: not possible.

   o  Request not to delay the acknowledgement (SACK) of a message
      Protocols: SCTP
      Optimizing because only an application knows for which message it
      wants to quickly be informed about success / failure of its
      delivery.
      Implementation over TCP: do nothing.
      Implementation over UDP: do nothing.

A.1.2.2. Receiving Data

   o  Receive data (with no message delimiting)
      Protocols: TCP
      Functional because a transport system must be able to send and
      receive data.
      Implementation: via RECEIVE.TCP.
      Implementation over UDP: do nothing (hand over a message, let the
      application ignore message boundaries).

   o  Receive a message
      Protocols: SCTP, UDP(-Lite)
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via RECEIVE.SCTP and RECEIVE.UDP(-Lite).
      Implementation over TCP: not possible.

   o  Choice of stream to receive from
      Protocols: SCTP
      Automatable because it requires using multiple streams, but
      requesting multiple streams in the CONNECTION.ESTABLISHMENT
      category is automatable.
      Implementation: see Appendix A.3.2.

   o  Information about partial message arrival
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via RECEIVE.SCTP.
      Implementation over TCP: do nothing: this information is not
      available with TCP.
      Implementation over UDP: do nothing: this information is not
      available with UDP.

A.1.2.3. Errors

   This section describes sending failures that are associated with a
   specific call in the "Sending Data" category (Appendix A.1.2.1).

   o  Notification of send failures
      Protocols: SCTP, UDP(-Lite)
      Functional because this notifies that potentially assumed reliable
      data delivery is no longer provided.
      ADDED. This differs from the 2 automatable transport features
      below in that it does not distinguish between unsent and
      unacknowledged messages.
      Implementation: via SENDFAILURE-EVENT.SCTP and
      SEND_FAILURE.UDP(-Lite).
      Implementation over TCP: do nothing: this notification is not
      available and will therefore not occur with TCP.

   o  Notification of an unsent (part of a) message
      Protocols: SCTP, UDP(-Lite)
      Automatable because the distinction between unsent and
      unacknowledged is network-specific.

   o  Notification of an unacknowledged (part of a) message
      Protocols: SCTP
      Automatable because the distinction between unsent and
      unacknowledged is network-specific.

   o  Notification that the stack has no more user data to send
      Protocols: SCTP
      Optimizing because reacting to this notification requires the
      application to be involved, and ensuring that the stack does not
      run dry of data (for too long) can improve performance.
      Implementation over TCP: do nothing. See also the discussion in
      Appendix A.3.4.
      Implementation over UDP: do nothing. This notification is not
      available and will therefore not occur with UDP.

   o  Notification to a receiver that a partial message delivery has
      been aborted
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation over TCP: do nothing. This notification is not
      available and will therefore not occur with TCP.
      Implementation over UDP: do nothing. This notification is not
      available and will therefore not occur with UDP.

A.2. Step 2: Reduction -- The Reduced Set of Transport Features

   By hiding automatable transport features from the application, a
   transport system can gain opportunities to automate the usage of
   network-related functionality. This can make the transport system
   easier to use for the application programmer, and it allows for
   optimizations that may not be possible for an application. For
   instance, system-wide configurations regarding the usage of multiple
   interfaces can better be exploited if the choice of the interface is
   not entirely up to the application. Therefore, since they are not
   strictly necessary to expose in a transport system, we do not include
   automatable transport features in the reduced set of transport
   features. This leaves us with only the transport features that are
   either optimizing or functional.

   A transport system should be able to communicate via TCP or UDP if
   alternative transport protocols are found not to work.
   For many transport features, this is possible -- often by simply not
   doing anything when a specific request is made. For some transport
   features, however, it was identified that direct usage of neither TCP
   nor UDP is possible: in these cases, even not doing anything would
   incur semantically incorrect behavior. Whenever an application would
   make use of one of these transport features, this would eliminate the
   possibility to use TCP or UDP. Thus, we only keep the functional and
   optimizing transport features for which an implementation over either
   TCP or UDP is possible in our reduced set.

   In the following list, we precede a transport feature with "T:" if an
   implementation over TCP is possible, "U:" if an implementation over
   UDP is possible, and "T,U:" if an implementation over either TCP or
   UDP is possible.

A.2.1. CONNECTION Related Transport Features

   ESTABLISHMENT:

   o  T,U: Connect
   o  T,U: Specify number of attempts and/or timeout for the first
      establishment message
   o  T: Configure authentication
   o  T: Hand over a message to reliably transfer (possibly multiple
      times) before connection establishment
   o  T: Hand over a message to reliably transfer during connection
      establishment

   AVAILABILITY:

   o  T,U: Listen
   o  T: Configure authentication

   MAINTENANCE:

   o  T: Change timeout for aborting connection (using retransmit limit
      or time value)
   o  T: Suggest timeout to the peer
   o  T,U: Disable Nagle algorithm
   o  T,U: Notification of Excessive Retransmissions (early warning
      below abortion threshold)
   o  T,U: Specify DSCP field
   o  T,U: Notification of ICMP error message arrival
   o  T: Change authentication parameters
   o  T: Obtain authentication information
   o  T,U: Set Cookie life value
   o  T,U: Choose a scheduler to operate between streams of an
      association
   o  T,U: Configure priority or weight for a scheduler
   o  T,U: Disable checksum when sending
   o  T,U: Disable checksum requirement when receiving
   o  T,U: Specify checksum coverage used by the sender
   o  T,U: Specify minimum checksum coverage required by receiver
   o  T,U: Specify DF field
   o  T,U: Get max. transport-message size that may be sent using a
      non-fragmented IP packet from the configured interface
   o  T,U: Get max. transport-message size that may be received from the
      configured interface
   o  T,U: Obtain ECN field
   o  T,U: Enable and configure a "Low Extra Delay Background Transfer"

   TERMINATION:

   o  T: Close after reliably delivering all remaining data, causing an
      event informing the application on the other side
   o  T: Abort without delivering remaining data, causing an event
      informing the application on the other side
   o  T,U: Abort without delivering remaining data, not causing an event
      informing the application on the other side
   o  T,U: Timeout event when data could not be delivered for too long

A.2.2. DATA Transfer Related Transport Features

A.2.2.1. Sending Data

   o  T: Reliably transfer data, with congestion control
   o  T: Reliably transfer a message, with congestion control
   o  T,U: Unreliably transfer a message
   o  T: Configurable Message Reliability
   o  T: Ordered message delivery (potentially slower than unordered)
   o  T,U: Unordered message delivery (potentially faster than ordered)
   o  T,U: Request not to bundle messages
   o  T: Specifying a key id to be used to authenticate a message
   o  T,U: Request not to delay the acknowledgement (SACK) of a message

A.2.2.2. Receiving Data

   o  T,U: Receive data (with no message delimiting)
   o  U: Receive a message
   o  T,U: Information about partial message arrival

A.2.2.3. Errors

   This section describes sending failures that are associated with a
   specific call in the "Sending Data" category (Appendix A.1.2.1).

   o  T,U: Notification of send failures
   o  T,U: Notification that the stack has no more user data to send
   o  T,U: Notification to a receiver that a partial message delivery
      has been aborted

A.3. Step 3: Discussion

   The reduced set in the previous section exhibits a number of
   peculiarities, which we will discuss in the following. This section
   focuses on TCP because, with the exception of one particular
   transport feature ("Receive a message" -- we will discuss this in
   Appendix A.3.1), the list shows that UDP is strictly a subset of TCP.
   We can first try to understand how to build a transport system that
   can run over TCP, and then narrow down the result further so that
   the system can always run over either TCP or UDP (which effectively
   means removing everything related to reliability, ordering,
   authentication and closing/aborting with a notification to the
   peer).

   Note that, because the functional transport features of UDP are --
   with the exception of "Receive a message" -- a subset of TCP, TCP can
   be used as a replacement for UDP whenever an application does not
   need message delimiting (e.g., because the application-layer protocol
   already does it). This has been recognized by many applications that
   already do this in practice, by trying to communicate with UDP at
   first, and falling back to TCP in case of a connection failure.

A.3.1. Sending Messages, Receiving Bytes

   For implementing a transport system over TCP, there are several
   transport features related to sending, but only a single transport
   feature related to receiving: "Receive data (with no message
   delimiting)" (and, strangely, "information about partial message
   arrival"). Notably, the transport feature "Receive a message" is
   also the only non-automatable transport feature of UDP(-Lite) for
   which no implementation over TCP is possible.
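
   Because TCP offers the receiver only an undelimited byte stream, an
   application that needs messages has to frame them itself. The
   following is a minimal sketch of one common approach,
   length-prefixed framing. The 4-byte big-endian length header and the
   names `frame` and `Deframer` are assumptions made for illustration;
   they are not defined by this document or by any TAPS interface.

```python
import struct
from typing import List

# Hypothetical wire format: 4-byte big-endian length, then the payload.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so boundaries survive a byte stream."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Reassembles messages from arbitrarily sized chunks of a byte stream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes and return every message that is now complete."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the rest of this message has not arrived yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages
```

   A receiver would pass each chunk returned by an ordinary TCP receive
   call to `Deframer.feed`, which returns zero or more complete
   messages regardless of how the byte stream was segmented.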

   To support these TCP receiver semantics, we define an
   "Application-Framed Bytestream" (AFra-Bytestream). AFra-Bytestreams
   allow senders to operate on messages while minimizing changes to the
   TCP socket API. In particular, nothing changes on the receiver
   side - data can be accepted via a normal TCP socket.

   In an AFra-Bytestream, the sending application can optionally inform
   the transport about message boundaries and required properties per
   message (configurable order and reliability, or embedding a request
   not to delay the acknowledgement of a message). Whenever the sending
   application specifies per-message properties that relax the notion of
   reliable in-order delivery of bytes, it must assume that the
   receiving application is 1) able to determine message boundaries,
   provided that messages are always kept intact, and 2) able to accept
   these relaxed per-message properties. Any signaling of such
   information to the peer is up to an application-layer protocol and
   considered out of scope of this document.

   For example, if an application requests to transfer fixed-size
   messages of 100 bytes with partial reliability, this requires the
   receiving application to be prepared to accept data in chunks of 100
   bytes. If, then, some of these 100-byte messages are missing (e.g.,
   if SCTP with Configurable Reliability is used), this is the expected
   application behavior. With TCP, no messages would be missing, but
   this is also correct for the application, and the possible
   retransmission delay is acceptable within the best effort service
   model [RFC7305]. Still, the receiving application would separate the
   byte stream into 100-byte chunks.

   Note that this usage of messages does not require all messages to be
   equal in size. Many application protocols use some form of
   Type-Length-Value (TLV) encoding, e.g.
   by defining a header including length fields; another alternative is
   the use of byte stuffing methods such as COBS [COBS]. If an
   application needs message numbers, e.g. to restore the correct
   sequence of messages, these must also be encoded by the application
   itself, as the sequence number related transport features of SCTP
   are not provided by the "minimum set" (in the interest of enabling
   usage of TCP).

A.3.2. Stream Schedulers Without Streams

   We have already stated that multi-streaming does not require
   application-specific knowledge. Potential benefits or disadvantages
   of, e.g., using two streams of an SCTP association versus using two
   separate SCTP associations or TCP connections are related to
   knowledge about the network and the particular transport protocol in
   use, not the application. However, the transport features "Choose a
   scheduler to operate between streams of an association" and
   "Configure priority or weight for a scheduler" operate on streams.
   Here, streams identify communication channels between which a
   scheduler operates, and they can be assigned a priority. Moreover,
   the transport features in the MAINTENANCE category all operate on
   associations in case of SCTP, i.e. they apply to all streams in that
   association.

   With only these semantics necessary to represent, the interface to a
   transport system becomes easier if we assume that connections may be
   a transport protocol's connection or association, but could also be a
   stream of an existing SCTP association, for example. We only need to
   allow for a way to define a possible grouping of connections. Then,
   all MAINTENANCE transport features can be said to operate on
   connection groups, not connections, and a scheduler operates on the
   connections within a group.
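
   The grouping described above can be sketched as a small data
   structure. This is only an illustration under stated assumptions:
   the class names, the `priority` field (lower value is served first)
   and the strict-priority policy are hypothetical and not part of this
   document. A real transport system would map a group onto, e.g., the
   streams of one SCTP association or a set of TCP connections.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Connection:
    """One member of a group (hypothetical: a stream or a full connection)."""
    name: str
    priority: int = 0  # assumption: lower value is scheduled first
    queue: List[bytes] = field(default_factory=list)


@dataclass
class ConnectionGroup:
    """MAINTENANCE-style settings are stored once and shared by all members."""
    properties: Dict[str, object] = field(default_factory=dict)
    connections: List[Connection] = field(default_factory=list)

    def add(self, name: str, priority: int = 0) -> Connection:
        conn = Connection(name, priority)
        self.connections.append(conn)
        return conn

    def set_property(self, key: str, value: object) -> None:
        # Applies to the whole group, never to a single connection.
        self.properties[key] = value

    def next_message(self) -> Optional[Tuple[str, bytes]]:
        """Strict-priority scheduler over the connections in this group."""
        ready = [c for c in self.connections if c.queue]
        if not ready:
            return None
        conn = min(ready, key=lambda c: c.priority)
        return conn.name, conn.queue.pop(0)
```

   Strict priority is used here only because it is the shortest
   scheduler to write down; a weighted policy would replace
   `next_message` without changing the grouping itself.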

   To be compatible with multiple transport protocols and uniformly
   allow access to both transport connections and streams of a
   multi-streaming protocol, the semantics of opening and closing need
   to be the most restrictive subset of all of the underlying options.
   For example, TCP's support of half-closed connections can be seen as
   a feature on top of the more restrictive "ABORT"; this feature cannot
   be supported because not all protocols used by a transport system
   (including streams of an association) support half-closed
   connections.

A.3.3. Early Data Transmission

   There are two transport features related to transferring a message
   early: "Hand over a message to reliably transfer (possibly multiple
   times) before connection establishment", which relates to TCP Fast
   Open [RFC7413], and "Hand over a message to reliably transfer during
   connection establishment", which relates to SCTP's ability to
   transfer data together with the COOKIE-Echo chunk. Even without TCP
   Fast Open, TCP can transfer data during the handshake, together with
   the SYN packet -- however, the receiver of this data may not hand it
   over to the application until the handshake has completed. Also,
   unlike with TCP Fast Open, this data is not delimited as a message
   by TCP (thus, not visible as a "message"). This functionality is
   commonly available in TCP and supported in several implementations,
   even though the TCP specification does not explain how to provide it
   to applications.

   A transport system could differentiate between the cases of
   transmitting data "before" (possibly multiple times) or "during" the
   handshake.
   Alternatively, it could also assume that data that are handed over
   early will be transmitted as early as possible, and "before" the
   handshake would only be used for messages that are explicitly marked
   as "idempotent" (i.e., it would be acceptable to transfer them
   multiple times).

   The amount of data that can successfully be transmitted before or
   during the handshake depends on various factors: the transport
   protocol, the use of header options, the choice of IPv4 or IPv6, and
   the Path MTU. A transport system should therefore allow a sending
   application to query the maximum amount of data it can possibly
   transmit before (or, if exposed, during) connection establishment.

A.3.4. Sender Running Dry

   The transport feature "Notification that the stack has no more user
   data to send" relates to SCTP's "SENDER DRY" notification. Such
   notifications can, in principle, be used to avoid having an
   unnecessarily large send buffer, yet ensure that the transport sender
   always has data available when it has an opportunity to transmit it.
   This has been found to be very beneficial for some applications
   [WWDC2015]. However, "SENDER DRY" truly means that the entire send
   buffer (including both unsent and unacknowledged data) has emptied --
   i.e., when it notifies the sender, it is already too late, the
   transport protocol already missed an opportunity to send data. Some
   modern TCP implementations now include the unspecified
   "TCP_NOTSENT_LOWAT" socket option that was proposed in [WWDC2015],
   which limits the amount of unsent data that TCP can keep in the
   socket buffer; this allows specifying at which buffer filling level
   the socket becomes writable, rather than waiting for the buffer to
   run empty.
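
   Where a platform exposes this option, it can be set through the
   ordinary socket API. The following is a minimal sketch, assuming a
   Python build that provides the `socket.TCP_NOTSENT_LOWAT` constant;
   the helper name `set_notsent_lowat` and the idea of returning a
   boolean are illustrative choices, not something this document
   defines. Because the option is unspecified, the sketch treats both
   a missing constant and a kernel refusal as "not applied".

```python
import socket


def set_notsent_lowat(sock: socket.socket, limit_bytes: int) -> bool:
    """Try to cap the unsent data kept in the socket buffer.

    Returns True if the option was applied, False if the platform or
    Python build does not offer it or the kernel rejected it.
    """
    option = getattr(socket, "TCP_NOTSENT_LOWAT", None)
    if option is None:
        return False  # constant not exposed by this Python build
    try:
        sock.setsockopt(socket.IPPROTO_TCP, option, limit_bytes)
    except OSError:
        return False  # option not supported for this socket
    return True
```

   An application would call this once on a TCP socket, for example
   with a threshold of a few tens of kilobytes, and then write only
   when the socket is reported as writable.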

   SCTP also allows configuring the sender-side buffer: the automatable
   Transport Feature "Configure send buffer size" provides this
   functionality, but only for the complete buffer, which includes both
   unsent and unacknowledged data. SCTP does not allow controlling these
   two sizes separately. It therefore makes sense for a transport
   system to allow for uniform access to "TCP_NOTSENT_LOWAT" as well as
   the "SENDER DRY" notification.

A.3.5. Capacity Profile

   The transport features:

   o  Disable Nagle algorithm
   o  Enable and configure a "Low Extra Delay Background Transfer"
   o  Specify DSCP field

   all relate to a QoS-like application need such as "low latency" or
   "scavenger". In the interest of flexibility of a transport system,
   they could therefore be offered in a uniform, more abstract way,
   where a transport system could e.g. decide by itself how to use
   combinations of LEDBAT-like congestion control and certain DSCP
   values, and an application would only specify a general "capacity
   profile" (a description of how it wants to use the available
   capacity). A need for "lowest possible latency at the expense of
   overhead" could then translate into automatically disabling the Nagle
   algorithm.

   In some cases, the Nagle algorithm is best controlled directly by the
   application because it is not only related to a general profile but
   also to knowledge about the size of future messages. For fine-grain
   control over Nagle-like functionality, the "Request not to bundle
   messages" is available.

A.3.6. Security

   Both TCP and SCTP offer authentication. TCP authenticates complete
   segments. SCTP allows configuring which of SCTP's chunk types must
   always be authenticated -- if this is exposed as such, it creates an
   undesirable dependency on the transport protocol.
   For compatibility with TCP, a transport system should only allow
   authentication to be configured for complete transport layer
   packets, including headers, IP pseudo-header (if any) and payload.

   Security is discussed in a separate TAPS document
   [I-D.pauly-taps-transport-security]. The minimal set presented in
   the present document therefore excludes all security related
   transport features: "Configure authentication", "Change
   authentication parameters", "Obtain authentication information" and
   "Set Cookie life value" as well as "Specifying a key id to be
   used to authenticate a message".

A.3.7. Packet Size

   UDP(-Lite) has a transport feature called "Specify DF field". This
   yields an error message in case of sending a message that exceeds the
   Path MTU, which is necessary for a UDP-based application to be able
   to implement Path MTU Discovery (a function that UDP-based
   applications must perform by themselves). The "Get max.
   transport-message size that may be sent using a non-fragmented IP
   packet from the configured interface" transport feature yields an
   upper limit for the Path MTU (minus headers) and can therefore help
   to implement Path MTU Discovery more efficiently.

Appendix B. Revision information

   XXX RFC-Ed please remove this section prior to publication.

   -02: implementation suggestions added, discussion section added,
   terminology extended, DELETED category removed, various other fixes;
   list of Transport Features adjusted to -01 version of [RFC8303]
   except that MPTCP is not included.

   -03: updated to be consistent with -02 version of [RFC8303].

   -04: updated to be consistent with -03 version of [RFC8303].
   Reorganized document, rewrote intro and conclusion, and made a first
   stab at creating a real "minimal set".

   -05: updated to be consistent with -05 version of [RFC8303] (minor
   changes). Fixed a mistake regarding Cookie Life value.
   Exclusion of security related transport features (to be covered in a
   separate document). Reorganized the document (now begins with the
   minset, derivation is in the appendix). First stab at an abstract
   API for the minset.

   draft-ietf-taps-minset-00: updated to be consistent with -08 version
   of [RFC8303] ("obtain message delivery number" was removed, as this
   has also been removed in [RFC8303] because it was a mistake in
   RFC4960. This led to the removal of two more transport features that
   were only designated as functional because they affected "obtain
   message delivery number"). Fall-back to UDP incorporated (this was
   requested at IETF-99); this also affected the transport feature
   "Choice between unordered (potentially faster) or ordered delivery of
   messages" because this is a boolean which is always true for one
   fall-back protocol, and always false for the other one. This was
   therefore now divided into two features, one for ordered, one for
   unordered delivery. The word "reliably" was added to the transport
   features "Hand over a message to reliably transfer (possibly multiple
   times) before connection establishment" and "Hand over a message to
   reliably transfer during connection establishment" to make it clearer
   why this is not supported by UDP. Clarified that the "minset
   abstract interface" is not proposing a specific API for all TAPS
   systems to implement, but it is just a way to describe the minimum
   set. Author order changed.

   WG -01: "fall-back to" (TCP or UDP) replaced (mostly with
   "implementation over"). References to post-sockets removed (these
   were statements that assumed that post-sockets requires two-sided
   implementation). Replaced "flow" with "TAPS Connection" and "frame"
   with "message" to avoid introducing new terminology.
   Made sections 3 and 4 in line with the categorization that is
   already used in the appendix and [RFC8303], and changed style of
   section 4 to be even shorter and less interface-like. Updated
   reference draft-ietf-tsvwg-sctp-ndata to RFC8260.

   WG -02: rephrased "the TAPS system" and "TAPS connection" etc. to
   more generally talk about transport after the intro (mostly replacing
   "TAPS system" with "transport system" and "TAPS connection" with
   "connection"). Merged sections 3 and 4 to form a new section 3.

   WG -03: updated sentence referencing
   [I-D.pauly-taps-transport-security] to say that "the minimum security
   requirements for a taps system are discussed in a separate security
   document", wrote "example" in the paragraph introducing the decision
   tree. Removed reference draft-grinnemo-taps-he-03 and the sentence
   that referred to it.

Authors' Addresses

   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo N-0316
   Norway

   Phone: +47 22 85 24 20
   Email: michawe@ifi.uio.no

   Stein Gjessing
   University of Oslo
   PO Box 1080 Blindern
   Oslo N-0316
   Norway

   Phone: +47 22 85 24 44
   Email: steing@ifi.uio.no