TAPS                                                            M. Welzl
Internet-Draft                                                        S.
Gjessing
Intended status: Informational                        University of Oslo
Expires: August 10, 2018                                February 6, 2018

          A Minimal Set of Transport Services for TAPS Systems
                       draft-ietf-taps-minset-01

Abstract

   This draft recommends a minimal set of IETF Transport Services
   offered by end systems supporting TAPS, and gives guidance on
   choosing among the available mechanisms and protocols. It is based
   on the set of transport features given in the TAPS document
   draft-ietf-taps-transports-usage-09.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 10, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Minimal Set of Transport Features
     3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION
     3.2.  MAINTENANCE
     3.3.  DATA Transfer
       3.3.1.  Sending Data
       3.3.2.  Receiving Data
   4.  Summary
     4.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION
     4.2.  MAINTENANCE
       4.2.1.  Connection groups
       4.2.2.  Individual connections
     4.3.  DATA Transfer
   5.  Conclusion
   6.  Acknowledgements
   7.  IANA Considerations
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Deriving the minimal set
     A.1.  Step 1: Categorization -- The Superset of Transport Features
       A.1.1.  CONNECTION Related Transport Features
       A.1.2.  DATA Transfer Related Transport Features
     A.2.  Step 2: Reduction -- The Reduced Set of Transport Features
       A.2.1.  CONNECTION Related Transport Features
       A.2.2.  DATA Transfer Related Transport Features
     A.3.  Step 3: Discussion
       A.3.1.  Sending Messages, Receiving Bytes
       A.3.2.  Stream Schedulers Without Streams
       A.3.3.  Early Data Transmission
       A.3.4.  Sender Running Dry
       A.3.5.  Capacity Profile
       A.3.6.  Security
       A.3.7.  Packet Size
   Appendix B.  Revision information
   Authors' Addresses

1.  Introduction

   The task of any system that implements TAPS is to offer transport
   services to its applications, i.e. the applications running on top of
   TAPS, without binding them to a particular transport protocol.

   Currently, the set of transport services that most applications use
   is based on TCP and UDP (and protocols running on top of them); this
   limits the ability for the network stack to make use of features of
   other protocols. For example, if a protocol supports out-of-order
   message delivery but applications always assume that the network
   provides an ordered bytestream, then the network stack can never
   utilize out-of-order message delivery: doing so would break a
   fundamental assumption of the application.

   By exposing the transport services of multiple transport protocols, a
   TAPS system can make it possible to use these services without having
   to statically bind an application to a specific transport protocol.
   The first step towards the design of such a system was taken by
   [RFC8095], which surveys a large number of transports, and [TAPS2] as
   well as [TAPS2UDP], which identify the specific transport features
   that are exposed to applications by the protocols TCP, MPTCP,
   UDP(-Lite) and SCTP as well as the LEDBAT congestion control mechanism.
   The present draft is based on these documents and follows the same
   terminology (also listed below). Because the considered transport
   protocols together cover a wide range of transport features, there is
   reason to hope that the resulting set (and the reasoning that led to
   it) will also apply to many aspects of other transport protocols such
   as QUIC.

   The number of transport features of current IETF transports is large,
   and exposing all of them has a number of disadvantages: generally,
   the more functionality is exposed, the less freedom a TAPS system has
   to automate usage of the various functions of its available set of
   transport protocols. Some functions only exist in one particular
   protocol, and if an application would use them, this would statically
   tie the application to this protocol, counteracting the purpose of a
   TAPS system. Also, if the number of exposed features is exceedingly
   large, a TAPS system might become very hard to use for an application
   programmer. Taking [TAPS2] as a basis, this document therefore
   develops a minimal set of transport features, removing the ones that
   could be harmful to the purpose of a TAPS system but keeping the ones
   that must be retained for applications to benefit from useful
   transport functionality.

   Applications use a wide variety of APIs today. The transport
   features in the minimal set in this document must be reflected in
   *all* network APIs in order for the underlying functionality to
   become usable everywhere. For example, it does not help an
   application that talks to a middleware if only the Berkeley Sockets
   API is extended to offer "unordered message delivery", but the
   middleware only offers an ordered bytestream.
   Both the Berkeley Sockets API and the middleware would have to expose
   the "unordered message delivery" transport feature (alternatively,
   there may be ways for certain types of middleware to use this
   transport feature without exposing it, based on knowledge about the
   applications -- but this is not the general case). In most
   situations, in the interest of being as flexible and efficient as
   possible, the best choice will be for a middleware or library to
   expose at least all of the transport features that are recommended as
   a "minimal set" here.

   This "minimal set" can be implemented one-sided over TCP (or UDP, if
   certain limitations are put in place). This means that a sender-side
   TAPS system implementing it can talk to a non-TAPS TCP (or UDP)
   receiver, and a receiver-side TAPS system implementing it can talk to
   a non-TAPS TCP (or UDP) sender.

2.  Terminology

   The following terms are used throughout this document, and in
   subsequent documents produced by TAPS that describe the composition
   and decomposition of transport services.

   Transport Feature: a specific end-to-end feature that the transport
      layer provides to an application. Examples include
      confidentiality, reliable delivery, ordered delivery,
      message-versus-stream orientation, etc.
   Transport Service: a set of Transport Features, without an
      association to any given framing protocol, which provides a
      complete service to an application.
   Transport Protocol: an implementation that provides one or more
      different transport services using a specific framing and header
      format on the wire.
   Transport Service Instance: an arrangement of transport protocols
      with a selected set of features and configuration parameters that
      implements a single transport service, e.g., a protocol stack (RTP
      over UDP).
   Application: an entity that uses the transport layer for end-to-end
      delivery of data across the network (this may also be an upper
      layer protocol or tunnel encapsulation).
   Application-specific knowledge: knowledge that only applications
      have.
   Endpoint: an entity that communicates with one or more other
      endpoints using a transport protocol.
   Connection: shared state of two or more endpoints that persists
      across messages that are transmitted between these endpoints.
   Socket: the combination of a destination IP address and a
      destination port number.

   Moreover, throughout the document, the protocol name "UDP(-Lite)" is
   used when discussing transport features that are equivalent for UDP
   and UDP-Lite; similarly, the protocol name "TCP" refers to both TCP
   and MPTCP.

3.  The Minimal Set of Transport Features

   Based on the categorization, reduction and discussion in Appendix A,
   this section describes the minimal set of transport features that is
   offered by end systems supporting TAPS. This TAPS system can be
   implemented over TCP; elements of the system that may prohibit
   implementation over UDP are marked with "!UDP". To implement a TAPS
   system that can also work over UDP, these marked transport features
   should be excluded.

   As in Appendix A, Appendix A.2 and [TAPS2], we categorize the minimal
   set of transport features as 1) CONNECTION related (ESTABLISHMENT,
   AVAILABILITY, MAINTENANCE, TERMINATION) and 2) DATA Transfer related
   (Sending Data, Receiving Data, Errors). Here, the focus is on "TAPS
   Connections": connections that the TAPS system offers, as opposed to
   connections of transport protocols that the TAPS system uses.

3.1.  ESTABLISHMENT, AVAILABILITY and TERMINATION

   A TAPS connection must first be "created" to allow for some initial
   configuration to be carried out before the TAPS system can actively
   or passively establish a transport connection.
   All configuration parameters in Section 3.2 can be used initially,
   although some of them may only take effect when a transport
   connection has been established. Configuring a connection early helps
   a TAPS system make the right decisions. In particular, grouping
   information can influence the TAPS system to implement a TAPS
   connection as a stream of a multi-streaming protocol's existing
   association or not.

   For ungrouped TAPS connections, early configuration is necessary
   because it allows the TAPS system to know which protocols it should
   try to use (to steer a mechanism such as "Happy Eyeballs"
   [I-D.grinnemo-taps-he]). In particular, a TAPS system that only
   makes a one-time choice for a particular protocol must know early
   about strict requirements that must be kept, or it can end up in a
   deadlock situation (e.g., having chosen UDP and later being asked to
   support reliable transfer). As a possibility to correctly handle
   these cases, we provide the following decision tree (this is derived
   from Appendix A.2.1 excluding authentication, as explained in
   Section 8):

   - Will it ever be necessary to offer any of the following?
     * Reliably transfer data
     * Notify the peer of closing/aborting
     * Preserve data ordering

     Yes: SCTP or TCP can be used.
     - Is any of the following useful to the application?
       * Choosing a scheduler to operate between TAPS connections
         in a group, with the possibility to configure a priority
         or weight per connection
       * Configurable message reliability
       * Unordered message delivery
       * Request not to delay the acknowledgement (SACK) of a message

       Yes: SCTP is preferred.
       No:
       - Is any of the following useful to the application?
         * Hand over a message to reliably transfer (possibly
           multiple times) before connection establishment
         * Suggest timeout to the peer
         * Notification of Excessive Retransmissions (early
           warning below abortion threshold)
         * Notification of ICMP error message arrival

         Yes: TCP is preferred.
         No: SCTP and TCP are equally preferable.

     No: all protocols can be used.
     - Is any of the following useful to the application?
       * Specify checksum coverage used by the sender
       * Specify minimum checksum coverage required by receiver

       Yes: UDP-Lite is preferred.
       No: UDP is preferred.

   Note that this decision tree is not optimal for all cases. For
   example, if an application wants to use "Specify checksum coverage
   used by the sender", which is only offered by UDP-Lite, and
   "Configure priority or weight for a scheduler", which is only offered
   by SCTP, the above decision tree will always choose UDP-Lite, making
   it impossible to use SCTP's schedulers with priorities between
   grouped TAPS connections. The TAPS system must know which choice is
   more important for the application in order to make the best
   decision. We caution implementers to be aware of the full set of
   trade-offs, for which we recommend consulting the list in
   Appendix A.2.1 when deciding how to initialize a TAPS connection.

   Once a TAPS connection is created, it can be queried for the maximum
   amount of data that an application can possibly expect to have
   reliably transmitted before or during transport connection
   establishment (with zero being a possible answer). An application
   can also give the TAPS connection a message for reliable transmission
   before or during connection establishment (!UDP); the TAPS system
   will then try to transmit it as early as possible.
   An application can facilitate sending the message particularly early
   by marking it as "idempotent"; in this case, the receiving
   application must be prepared to potentially receive multiple copies
   of the message (because idempotent messages are reliably transferred,
   asking for idempotence is not necessary for systems that support
   UDP).

   After creation, a TAPS system can actively establish communication
   with a peer, or it can passively listen for incoming connection
   requests. Note that "Establish" may or may not trigger a
   notification on the listening side. It is possible that the first
   notification on the listening side is the arrival of the first data
   that the active side sends (a receiver-side TAPS system could handle
   this by continuing to block a "Listen" call, immediately followed by
   issuing "Receive", for example; callback-based implementations could
   simply skip the equivalent of "Listen"). This also means that the
   active opening side is assumed to be the first side sending data.

   A TAPS system can actively close a connection, i.e. terminate it
   after reliably delivering all remaining data to the peer, or it can
   abort it, i.e. terminate it without delivering remaining data.
   Unless all data transfers only used unreliable message transmission
   without congestion control (i.e., UDP-style transfer), closing a
   connection is guaranteed to cause an event to notify the peer
   application that the connection has been closed (!UDP). Similarly,
   for anything but (UDP-style) unreliable non-congestion-controlled
   data transfer, aborting a connection will cause an event to notify
   the peer application that the connection has been aborted (!UDP). A
   timeout can be configured to abort a TAPS connection when data could
   not be delivered for too long (!UDP); however, timeout-based abortion
   does not notify the peer application that the connection has been
   aborted.
   Because half-closed connections are not supported, when a
   TAPS host receives a notification that the peer is closing or
   aborting the connection (!UDP), its peer may not be able to read
   outstanding data. This means that unacknowledged data residing in
   the TAPS system's send buffer may have to be dropped from that buffer
   upon arrival of a "close" or "abort" notification from the peer.

3.2.  MAINTENANCE

   A TAPS connection group can be configured with a number of transport
   features, and there are some notifications to applications about a
   connection group. The following transport features and notifications
   from Appendix A.2 automatically apply to grouped TAPS connections
   (e.g., when a TAPS connection is mapped to a stream of a
   multi-streaming protocol):

   Timeout, error notifications:

   o  Change timeout for aborting connection (using retransmit limit or
      time value) (!UDP)
   o  Suggest timeout to the peer (!UDP)
   o  Notification of Excessive Retransmissions (early warning below
      abortion threshold)
   o  Notification of ICMP error message arrival

   Others:

   o  Choose a scheduler to operate between connections of a group
   o  Obtain ECN field

   The following transport features are new or changed, based on the
   discussion in Appendix A.3:

   o  Capacity profile
      This describes how an application wants to use its available
      capacity. Choices can be "lowest possible latency at the expense
      of overhead" (which would disable any Nagle-like algorithm),
      "scavenger", and values that help determine the DSCP value for a
      connection (e.g. similar to table 1 in
      [I-D.ietf-tsvwg-rtcweb-qos]).
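
   As a non-normative illustration, the capacity profile could be
   translated into socket options roughly as sketched below. The names
   CapacityProfile, socket_options and apply_profile are hypothetical,
   and the use of DSCP 1 (Lower Effort) for "scavenger" is an assumption
   of this sketch rather than something this document specifies. The
   sketch also assumes a platform where Python exposes socket.IP_TOS.

```python
import socket
from enum import Enum


class CapacityProfile(Enum):
    """Hypothetical names for the capacity profile choices."""
    DEFAULT = 0
    LOW_LATENCY = 1   # "lowest possible latency at the expense of overhead"
    SCAVENGER = 2     # background traffic that yields to other traffic


# Assumption of this sketch: scavenger traffic is marked with the
# Lower Effort (LE) code point, DSCP value 1.
DSCP_LE = 1


def socket_options(profile):
    """Return (level, option, value) tuples a TCP-based system might set."""
    if profile is CapacityProfile.LOW_LATENCY:
        # Disabling the Nagle algorithm trades extra packets for lower delay.
        return [(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)]
    if profile is CapacityProfile.SCAVENGER:
        # The DSCP occupies the upper six bits of the former TOS byte.
        return [(socket.IPPROTO_IP, socket.IP_TOS, DSCP_LE << 2)]
    return []


def apply_profile(sock, profile):
    """Apply the options of a capacity profile to an existing socket."""
    for level, option, value in socket_options(profile):
        sock.setsockopt(level, option, value)
```

   Keeping the mapping in a pure function separates the policy (which
   options a profile implies) from the act of configuring a particular
   transport connection, which a TAPS system may only create later.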

   The following transport features and notifications from Appendix A.2
   only apply to a single TAPS connection:

   Configure priority or weight for a scheduler

   Checksums:

   o  Disable checksum when sending
   o  Disable checksum requirement when receiving
   o  Specify checksum coverage used by the sender
   o  Specify minimum checksum coverage required by receiver

   A TAPS system must offer means to group connections; at the same
   time, it cannot guarantee that they are truly grouped underneath
   (e.g., it cannot be guaranteed that TAPS connections become
   multiplexed as streams on a single SCTP association when SCTP may not
   be available). The TAPS system must therefore ensure that group
   versus non-group configurations listed above are handled correctly in
   some way (e.g., by applying the configuration to all grouped
   connections even when they are not multiplexed, or informing the
   application about grouping success or failure).

3.3.  DATA Transfer

3.3.1.  Sending Data

   This section discusses how to send data after connection
   establishment. Section 3.1 discusses the possibility of handing over
   a message for reliable transmission before or during establishment.

   Here we list per-message properties that a sender can optionally
   configure if it hands over a delimited message for sending with
   congestion control (!UDP), taken from Appendix A.2:

   o  Configurable Message Reliability
   o  Ordered message delivery (potentially slower than unordered)
   o  Unordered message delivery (potentially faster than ordered)
   o  Request not to bundle messages
   o  Request not to delay the acknowledgement (SACK) of a message

   Additionally, an application can hand over delimited messages for
   unreliable transmission without congestion control (note that such
   applications should perform congestion control in accordance with
   [RFC2914]).
   Then, none of the per-message properties listed above
   have any effect, but it is possible to use the transport feature
   "Specify DF field" to allow/disallow fragmentation.

   Following Appendix A.3.7, there are three transport features (two
   old, one new):

   o  Get max. transport message size that may be sent without
      fragmentation from the configured interface
      This is optional for a TAPS system to offer, and may return an
      error ("not available"). It can aid applications implementing
      Path MTU Discovery.

   o  Get max. transport message size that may be received from the
      configured interface
      This is optional for a TAPS system to offer, and may return an
      error ("not available").

   o  Get maximum transport message size
      Irrespective of fragmentation, there is a size limit for the
      messages that can be handed over to SCTP or UDP(-Lite); because a
      TAPS system is independent of the transport, it must allow a TAPS
      application to query this value -- the maximum size of a message
      in an Application-Framed-Bytestream (see Appendix A.3.1). This
      may also return an error when data is not delimited ("not
      available").

   There are two more sender-side notifications. These are unreliable,
   i.e. a TAPS system cannot be assumed to implement them, but they may
   occur:

   o  Notification of send failures
      A TAPS system may inform a sender application of a failure to send
      a specific message.

   o  Notification of draining below a low water mark
      A TAPS system can notify a sender application when the TAPS
      system's filling level of the buffer of unsent data is below a
      configurable threshold in bytes. Even for TAPS systems that do
      implement this notification, supporting thresholds other than 0 is
      optional.
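
   The low water mark notification can be pictured with the following
   minimal sketch. The class SendBuffer and its method names are
   hypothetical. The sketch assumes an edge-triggered notification (it
   fires only when the buffer crosses into the drained state) and treats
   a completely empty buffer as drained, so that a threshold of 0 is
   meaningful.

```python
class SendBuffer:
    """Tracks unsent bytes and reports when they drain to the low water mark."""

    def __init__(self, low_watermark=0, on_drain=None):
        self.low_watermark = low_watermark
        # Callback receiving the number of bytes still unsent.
        self.on_drain = on_drain or (lambda unsent: None)
        self.unsent = 0
        self._drained = True  # an empty buffer starts out drained

    def _is_drained(self):
        # Drained means: empty, or below the configured threshold.
        return self.unsent == 0 or self.unsent < self.low_watermark

    def enqueue(self, nbytes):
        """The application hands over nbytes of data for sending."""
        self.unsent += nbytes
        self._drained = self._is_drained()

    def transmitted(self, nbytes):
        """The transport has taken nbytes out of the unsent buffer."""
        self.unsent = max(0, self.unsent - nbytes)
        drained = self._is_drained()
        if drained and not self._drained:
            # Notify only on the transition into the drained state.
            self.on_drain(self.unsent)
        self._drained = drained
```

   An application could use the callback to produce its next message
   just in time, which keeps the amount of stale data queued inside the
   TAPS system small.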

   "Notification of draining below a low water mark" is a generic
   notification that tries to enable uniform access to
   "TCP_NOTSENT_LOWAT" as well as the "SENDER DRY" notification (as
   discussed in Appendix A.3.4 -- SCTP's "SENDER DRY" is a special case
   where the threshold (for unsent data) is 0 and there is also no more
   unacknowledged data in the send buffer). Note that this threshold
   and its notification should operate across the buffers of the whole
   TAPS system, i.e. also any potential buffers that the TAPS system
   itself may use on top of the transport's send buffer.

3.3.2.  Receiving Data

   A receiving application obtains an "Application-Framed Bytestream"
   (AFra-Bytestream; this concept is further described in
   Appendix A.3.1). In line with TCP's receiver semantics, an
   AFra-Bytestream is just a stream of bytes to the receiver. If message
   boundaries were specified by the sender, a receiver-side TAPS system
   implementing only the minimum set of transport services defined here
   will still not inform the receiving application about them. Within
   the bytestream, messages themselves will always stay intact (partial
   messages are not supported). Different from TCP's semantics, there
   is no guarantee that all messages in the bytestream are transmitted
   from the sender to the receiver, and that all of them are in the same
   sequence in which they were handed over by the sender. If an
   application is aware of message delimiters in the bytestream, and if
   the sender-side application has informed the TAPS system about these
   boundaries and about potentially relaxed requirements regarding the
   sequence of messages or per-message reliability, messages within the
   receiver-side bytestream may be out-of-order or missing.

4.  Summary

   Here we summarize the minimum set of transport features in a more
   compact form.

4.1.
ESTABLISHMENT, AVAILABILITY and TERMINATION

   A TAPS connection is created and associated with an existing or new
   TAPS connection group. Grouping can influence the TAPS system to
   multiplex TAPS connections on a single transport connection or not,
   and the other parameters serve as input to the decision tree
   described in Section 3.1. The TAPS system gives no guarantees about
   honoring any of the requests at this stage; these parameters are just
   meant to help it choose and configure a suitable protocol. Note that
   the parameters below affect all grouped TAPS connections.

   A TAPS connection can actively connect to a peer; this may or may not
   trigger a notification on the listening side. If the application
   sends data (see Section 4.3) before the TAPS system establishes a
   transport connection, then such data may be transmitted early, upon
   connecting. When a TAPS system listens for incoming connections, the
   first arriving message may already be the first block of data.

   Creation / connection / configuration parameters:

   reliability: a boolean that should be set to true when any of the
      following will be useful to the application: reliably transfer
      data; notify the peer of closing/aborting; preserve data ordering.
   checksum_coverage: a boolean to specify whether it will be useful to
      the application to specify checksum coverage when sending or
      receiving.
   config_msg_prio: a boolean that should be set to true when any of
      the following per-message configuration or prioritization
      mechanisms will be useful to the application: choosing a scheduler
      to operate between grouped connections, with the possibility to
      configure a priority or weight per connection; configurable
      message reliability; unordered message delivery; requesting not to
      delay the acknowledgement (SACK) of a message.
   earlymsg_timeout_notifications: a boolean that should be set to true
      when any of the following will be useful to the application: hand
      over a message to reliably transfer (possibly multiple times)
      before connection establishment; suggest timeout to the peer;
      notification of excessive retransmissions (early warning below
      abortion threshold); notification of ICMP error message arrival.

   A TAPS connection can be closed after all outstanding data is
   reliably delivered to the peer (if reliable data delivery was
   requested earlier (!UDP)), in which case the peer is notified that
   the connection is closed. Alternatively, a TAPS connection can be
   aborted without delivering outstanding data to the peer. In case
   reliable or partially reliable data delivery was requested earlier
   (!UDP), the peer is notified that the connection is aborted.

4.2.  MAINTENANCE

   As a general rule, any configuration described below should be
   carried out as early as possible to aid the TAPS system's decision
   making.

4.2.1.  Connection groups

   The transport features below apply to all TAPS connections in the
   same group:

   (!UDP) Configure a timeout: this can be done with the following
   parameters:

   o  A timeout value for aborting connections, in seconds
   o  A timeout value to be suggested to the peer (if possible), in
      seconds
   o  The number of retransmissions after which the application should
      be notified of "Excessive Retransmissions"

   Configure urgency: this can be done with the following parameters:

   o  A number to identify the type of scheduler that should be used to
      operate between connections in the group (no guarantees given).
      Schedulers are defined in [RFC8260].
   o  A "capacity profile" number to identify how an application wants
      to use its available capacity.
      Choices can be "lowest possible
      latency at the expense of overhead" (which would disable any
      Nagle-like algorithm), "scavenger", or values that help determine
      the DSCP value for a connection (e.g. similar to table 1 in
      [I-D.ietf-tsvwg-rtcweb-qos]).
   o  A buffer limit, low_watermark (in bytes); when the sender has
      fewer than low_watermark bytes in the buffer, the application may
      be notified. Notifications are not guaranteed, and supporting
      watermark values greater than 0 is not guaranteed.

   The following properties can be queried:

   o  The maximum message size that may be sent without fragmentation,
      in bytes (or "not available")
   o  The maximum transport message size that can be sent, in bytes (or
      "not available")
   o  The maximum transport message size that can be received, in bytes
      (or "not available")
   o  The maximum amount of data that can possibly be sent before or
      during connection establishment, in bytes (or "not available")

   In addition to the already mentioned closing / aborting notifications
   and possible send errors, the following notifications can occur:

   o  Excessive Retransmissions: the configured (or a default) number of
      retransmissions has been reached, yielding this early warning
      below an abortion threshold.
   o  ICMP Arrival (parameter: ICMP message): an ICMP packet carrying
      the conveyed ICMP message has arrived.
   o  ECN Arrival (parameter: ECN value): a packet carrying the conveyed
      ECN value has arrived. This can be useful for applications
      implementing congestion control.
   o  Timeout (parameter: s seconds): data could not be delivered for s
      seconds.
   o  Drain: the send buffer has either drained below the configured low
      water mark or it has become completely empty.

4.2.2.  Individual connections

   The transport features below apply to individual TAPS connections:

   Configure priority or weight for a scheduler, as described in
   [RFC8260].
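
   To illustrate how a per-connection priority interacts with a
   scheduler that operates between the connections of a group, the
   following sketch implements a simple strict-priority scheduler. The
   class GroupScheduler and its methods are hypothetical, and the
   convention that a lower number is served first is an assumption of
   this sketch; it is not taken from [RFC8260].

```python
from collections import deque


class GroupScheduler:
    """Strict-priority scheduling between the connections of one group."""

    def __init__(self):
        self._queues = {}    # connection id -> deque of pending messages
        self._priority = {}  # connection id -> priority (lower is first)

    def add_connection(self, conn_id, priority=0):
        """Register a connection of the group with its priority."""
        self._queues[conn_id] = deque()
        self._priority[conn_id] = priority

    def send(self, conn_id, message):
        """Queue a message on one connection of the group."""
        self._queues[conn_id].append(message)

    def next_message(self):
        """Return (conn_id, message) for the next transmission, or None."""
        ready = [c for c, q in self._queues.items() if q]
        if not ready:
            return None
        # Serve the connection with the smallest priority value first.
        conn_id = min(ready, key=lambda c: self._priority[c])
        return conn_id, self._queues[conn_id].popleft()
```

   A weighted scheduler would follow the same structure, replacing the
   selection in next_message with a choice proportional to each
   connection's weight.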
600 Configure checksum usage: this can be done with the following 601 parameters, but there is no guarantee that any checksum limitations 602 will indeed be enforced (the default behavior is "full coverage, 603 checksum enabled"): 605 o A boolean to enable / disable usage of a checksum when sending 606 o The desired coverage (in bytes) of the checksum used when sending 607 o A boolean to enable / disable requiring a checksum when receiving 608 o The required minimum coverage (in bytes) of the checksum when 609 receiving 611 4.3. DATA Transfer 613 When sending a message, no guarantees are given about the 614 preservation of message boundaries to the peer; if message boundaries 615 are needed, the receiving application at the peer must know about 616 them beforehand (or the TAPS system cannot use TCP). Note that an 617 application should already be able to hand over data before the TAPS 618 system establishes a transport connection. Regarding the message 619 that is being handed over, the following parameters can be used: 621 o (!UDP) Reliability: this parameter is used to convey a choice of: 622 fully reliable, unreliable without congestion control (which is 623 guaranteed), unreliable, partially reliable (see [RFC3758] and 624 [RFC7496] for details on how to specify partial reliability). The 625 latter two choices are not guaranteed and may result in full 626 reliability. 627 o (!UDP) Ordered: this boolean parameter lets an application choose 628 between ordered message delivery (true) and possibly unordered, 629 potentially faster message delivery (false). 630 o Bundle: a boolean that expresses a preference for allowing to 631 bundle messages (true) or not (false). No guarantees are given. 632 o DelAck: a boolean that, if false, lets an application request that 633 the peer would not delay the acknowledgement for this message. 634 o Fragment: a boolean that expresses a preference for allowing to 635 fragment messages (true) or not (false), at the IP level. 
No 636 guarantees are given. 637 o (!UDP) Idempotent: a boolean that expresses whether a message is 638 idempotent (true) or not (false). Idempotent messages may arrive 639 multiple times at the receiver (but they will arrive at least 640 once). When data is idempotent it can be used by the receiver 641 immediately on a connection establishment attempt. Thus, if data 642 is handed over before the TAPS system establishes a transport 643 connection, stating that a message is idempotent facilitates 644 transmitting it to the peer application particularly early. 646 An application can be notified of a failure to send a specific 647 message. There is no guarantee of such notifications, i.e. send 648 failures can also silently occur. 650 When receiving data blocks, these blocks may or may not correspond to 651 a sender-side message, i.e. the receiving application is not informed 652 about message boundaries (this limitation is only needed for TAPS 653 systems that are implemented to directly use TCP). However, if the 654 sending application has allowed that messages are not fully reliably 655 transferred, or delivered out of order, then such re-ordering or 656 unreliability may be reflected per message in the arriving data. 657 Messages will always stay intact - i.e. if an incomplete message is 658 contained at the end of the arriving data block, this message is 659 guaranteed to continue in the next arriving data block. 661 5. Conclusion 663 By decoupling applications from transport protocols, a TAPS system 664 provides a different abstraction level than the Berkeley sockets 665 interface. As with high- vs. low-level programming languages, a 666 higher abstraction level allows more freedom for automation below the 667 interface, yet it takes some control away from the application 668 programmer. This is the design trade-off that a TAPS system 669 developer is facing, and this document provides guidance on the 670 design of this abstraction level. 
Some transport features are 671 currently rarely offered by APIs, yet they must be offered or they 672 can never be used ("functional" transport features). Other transport 673 features are offered by the APIs of the protocols covered here, but 674 not exposing them in a TAPS API would allow for more freedom to 675 automate protocol usage in a TAPS system. The minimal set presented 676 in this document is an effort to find a middle ground that can be 677 recommended for TAPS systems to implement, on the basis of the 678 transport features discussed in [TAPS2]. 680 6. Acknowledgements 682 The authors would like to thank all the participants of the TAPS 683 Working Group and the NEAT and MAMI research projects for valuable 684 input to this document. We especially thank Michael Tuexen for help 685 with TAPS connection establishment/teardown and Gorry 686 Fairhurst for his suggestions regarding fragmentation and packet 687 sizes. This work has received funding from the European Union's 688 Horizon 2020 research and innovation programme under grant agreement 689 No. 644334 (NEAT). 691 7. IANA Considerations 693 XX RFC ED - PLEASE REMOVE THIS SECTION XXX 695 This memo includes no request to IANA. 697 8. Security Considerations 699 Authentication, confidentiality protection, and integrity protection 700 are identified as transport features by [RFC8095]. As currently 701 deployed in the Internet, these features are generally provided by a 702 protocol or layer on top of the transport protocol; no current full- 703 featured standards-track transport protocol provides all of these 704 transport features on its own. Therefore, these transport features 705 are not considered in this document, with the exception of native 706 authentication capabilities of TCP and SCTP for which the security 707 considerations in [RFC5925] and [RFC4895] apply. 709 9. References 711 9.1. Normative References 713 [RFC8095] Fairhurst, G., Ed., Trammell, B., Ed., and M.
Kuehlewind, 714 Ed., "Services Provided by IETF Transport Protocols and 715 Congestion Control Mechanisms", RFC 8095, 716 DOI 10.17487/RFC8095, March 2017, 717 . 719 [TAPS2] Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of 720 Transport Features Provided by IETF Transport Protocols", 721 Internet-draft draft-ietf-taps-transports-usage-08, August 722 2017. 724 [TAPS2UDP] 725 Fairhurst, G. and T. Jones, "Features of the User Datagram 726 Protocol (UDP) and Lightweight UDP (UDP-Lite) Transport 727 Protocols", Internet-draft draft-ietf-taps-transports- 728 usage-udp-07, September 2017. 730 9.2. Informative References 732 [COBS] Cheshire, S. and M. Baker, "Consistent Overhead Byte 733 Stuffing", September 1997, 734 . 736 [I-D.grinnemo-taps-he] 737 Grinnemo, K., Brunstrom, A., Hurtig, P., Khademi, N., and 738 Z. Bozakov, "Happy Eyeballs for Transport Selection", 739 draft-grinnemo-taps-he-03 (work in progress), July 2017. 741 [I-D.ietf-tsvwg-rtcweb-qos] 742 Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP 743 Packet Markings for WebRTC QoS", draft-ietf-tsvwg-rtcweb- 744 qos-18 (work in progress), August 2016. 746 [I-D.pauly-taps-transport-security] 747 Pauly, T., Rose, K., and C. Wood, "A Survey of Transport 748 Security Protocols", draft-pauly-taps-transport- 749 security-01 (work in progress), January 2018. 751 [LBE-draft] 752 Bless, R., "A Lower Effort Per-Hop Behavior (LE PHB)", 753 Internet-draft draft-tsvwg-le-phb-03, February 2018. 755 [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, 756 RFC 2914, DOI 10.17487/RFC2914, September 2000, 757 . 759 [RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P. 760 Conrad, "Stream Control Transmission Protocol (SCTP) 761 Partial Reliability Extension", RFC 3758, 762 DOI 10.17487/RFC3758, May 2004, 763 . 765 [RFC4895] Tuexen, M., Stewart, R., Lei, P., and E. 
Rescorla, 766 "Authenticated Chunks for the Stream Control Transmission 767 Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895, August 768 2007, . 770 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 771 Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, 772 . 774 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 775 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 776 June 2010, . 778 [RFC6458] Stewart, R., Tuexen, M., Poon, K., Lei, P., and V. 779 Yasevich, "Sockets API Extensions for the Stream Control 780 Transmission Protocol (SCTP)", RFC 6458, 781 DOI 10.17487/RFC6458, December 2011, 782 . 784 [RFC6525] Stewart, R., Tuexen, M., and P. Lei, "Stream Control 785 Transmission Protocol (SCTP) Stream Reconfiguration", 786 RFC 6525, DOI 10.17487/RFC6525, February 2012, 787 . 789 [RFC7305] Lear, E., Ed., "Report from the IAB Workshop on Internet 790 Technology Adoption and Transition (ITAT)", RFC 7305, 791 DOI 10.17487/RFC7305, July 2014, 792 . 794 [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP 795 Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, 796 . 798 [RFC7496] Tuexen, M., Seggelmann, R., Stewart, R., and S. Loreto, 799 "Additional Policies for the Partially Reliable Stream 800 Control Transmission Protocol Extension", RFC 7496, 801 DOI 10.17487/RFC7496, April 2015, 802 . 804 [RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann, 805 "Stream Schedulers and User Message Interleaving for the 806 Stream Control Transmission Protocol", RFC 8260, 807 DOI 10.17487/RFC8260, November 2017, 808 . 810 [WWDC2015] 811 Lakhera, P. and S. Cheshire, "Your App and Next Generation 812 Networks", Apple Worldwide Developers Conference 2015, San 813 Francisco, USA, June 2015, 814 . 816 Appendix A. Deriving the minimal set 818 We approach the construction of a minimal set of transport features 819 in the following way: 821 1. 
Categorization: the superset of transport features from [TAPS2] 822 is presented, and transport features are categorized for later 823 reduction. 824 2. Reduction: a shorter list of transport features is derived from 825 the categorization in the first step. This removes all transport 826 features that do not require application-specific knowledge or 827 cannot be implemented with TCP. !!!TODO discuss UDP 828 3. Discussion: the resulting list shows a number of peculiarities 829 that are discussed, to provide a basis for constructing the 830 minimal set. 831 4. Construction: Based on the reduced set and the discussion of the 832 transport features therein, a minimal set is constructed. 834 The first three steps as well as the underlying rationale for 835 constructing the minimal set are described in this appendix. The 836 minimal set itself is described in Section 3. 838 A.1. Step 1: Categorization -- The Superset of Transport Features 840 Following [TAPS2], we divide the transport features into two main 841 groups as follows: 843 1. CONNECTION related transport features 844 - ESTABLISHMENT 845 - AVAILABILITY 846 - MAINTENANCE 847 - TERMINATION 849 2. DATA Transfer related transport features 850 - Sending Data 851 - Receiving Data 852 - Errors 854 We assume that TAPS applications have no specific requirements that 855 need knowledge about the network, e.g. regarding the choice of 856 network interface or the end-to-end path. Even with these 857 assumptions, there are certain requirements that are strictly kept by 858 transport protocols today, and these must also be kept by a TAPS 859 system. Some of these requirements relate to transport features that 860 we call "Functional". 862 Functional transport features provide functionality that cannot be 863 used without the application knowing about them, or else they violate 864 assumptions that might cause the application to fail. 
For example, 865 ordered message delivery is a functional transport feature: it cannot 866 be configured without the application knowing about it because the 867 application's assumption could be that messages always arrive in 868 order. Failure includes any change of the application behavior that 869 is not performance oriented, e.g. security. 871 "Change DSCP" and "Disable Nagle algorithm" are examples of transport 872 features that we call "Optimizing": if a TAPS system autonomously 873 decides to enable or disable them, an application will not fail, but 874 a TAPS system may be able to communicate more efficiently if the 875 application is in control of this optimizing transport feature. 876 These transport features require application-specific knowledge 877 (e.g., about delay/bandwidth requirements or the length of future 878 data blocks that are to be transmitted). 880 The transport features of IETF transport protocols that do not 881 require application-specific knowledge and could therefore be 882 transparently utilized by a TAPS system are called "Automatable". 884 Finally, some transport features are aggregated and/or slightly 885 changed in the description below. These transport features are 886 marked as "ADDED". The corresponding transport features are 887 automatable, and they are listed immediately below the "ADDED" 888 transport feature. 890 In this description, transport services are presented following the 891 nomenclature "CATEGORY.[SUBCATEGORY].SERVICENAME.PROTOCOL", 892 equivalent to "pass 2" in [TAPS2]. We also sketch how some of the 893 TAPS transport features can be implemented by a TAPS system. For all 894 transport features that are categorized as "functional" or 895 "optimizing", and for which no matching TCP and/or UDP primitive 896 exists in "pass 2" of [TAPS2], a brief discussion on how to implement 897 them over TCP and/or UDP is included. 
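The three categories defined above can be read as a small decision procedure. The following Python sketch is one possible reading, not a normative algorithm: the enum Category, the function categorize, and its two boolean inputs are hypothetical names introduced for this illustration. The example features are the ones named in the text above.

```python
from enum import Enum


class Category(Enum):
    FUNCTIONAL = "functional"
    OPTIMIZING = "optimizing"
    AUTOMATABLE = "automatable"


def categorize(app_may_fail_if_hidden: bool, needs_app_knowledge: bool) -> Category:
    """Classify a transport feature (hypothetical helper for illustration).

    app_may_fail_if_hidden: the application could fail, in a way that is not
        purely performance related, if the TAPS system decided on its own.
    needs_app_knowledge: a good setting depends on application-specific
        knowledge such as delay/bandwidth needs or future data block sizes.
    """
    if app_may_fail_if_hidden:
        return Category.FUNCTIONAL
    if needs_app_knowledge:
        return Category.OPTIMIZING
    return Category.AUTOMATABLE


# Features taken from the text; the boolean inputs are our reading of it.
examples = {
    "Ordered message delivery": categorize(True, True),
    "Change DSCP": categorize(False, True),
    "Disable Nagle algorithm": categorize(False, True),
    "Choice of network interface": categorize(False, False),
}

for name, category in examples.items():
    print(f"{name}: {category.value}")
```

The order of the two checks reflects the text: a feature whose silent use could break an application is functional regardless of any performance benefit.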
899 We designate some transport features as "automatable" on the basis of 900 a broader decision that affects multiple transport features: 902 o Most transport features that are related to multi-streaming were 903 designated as "automatable". This was done because the decision 904 on whether to use multi-streaming or not does not depend on 905 application-specific knowledge. This means that a connection that 906 is exhibited to an application could be implemented by using a 907 single stream of an SCTP association instead of mapping it to a 908 complete SCTP association or TCP connection. This could be 909 achieved by using more than one stream when an SCTP association is 910 first established (CONNECT.SCTP parameter "outbound stream 911 count"), maintaining an internal stream number, and using this 912 stream number when sending data (SEND.SCTP parameter "stream 913 number"). Closing or aborting a connection could then simply free 914 the stream number for future use. This is discussed further in 915 Appendix A.3.2. 916 o All transport features that are related to using multiple paths or 917 the choice of the network interface were designated as 918 "automatable". Choosing a path or an interface does not depend on 919 application-specific knowledge. For example, "Listen" could 920 always listen on all available interfaces and "Connect" could use 921 the default interface for the destination IP address. 923 A.1.1. CONNECTION Related Transport Features 925 ESTABLISHMENT: 927 o Connect 928 Protocols: TCP, SCTP, UDP(-Lite) 929 Functional because the notion of a connection is often reflected 930 in applications as an expectation to be able to communicate after 931 a "Connect" succeeded, with a communication sequence relating to 932 this transport feature that is defined by the application 933 protocol. 934 Implementation: via CONNECT.TCP, CONNECT.SCTP or CONNECT.UDP(- 935 Lite). 
937 o Specify which IP Options must always be used 938 Protocols: TCP, UDP(-Lite) 939 Automatable because IP Options relate to knowledge about the 940 network, not the application. 942 o Request multiple streams 943 Protocols: SCTP 944 Automatable because using multi-streaming does not require 945 application-specific knowledge. 946 Implementation: see Appendix A.3.2. 948 o Limit the number of inbound streams 949 Protocols: SCTP 950 Automatable because using multi-streaming does not require 951 application-specific knowledge. 952 Implementation: see Appendix A.3.2. 954 o Specify number of attempts and/or timeout for the first 955 establishment message 956 Protocols: TCP, SCTP 957 Functional because this is closely related to potentially assumed 958 reliable data delivery for data that is sent before or during 959 connection establishment. 960 Implementation: Using a parameter of CONNECT.TCP and CONNECT.SCTP. 961 Implementation over UDP: Do nothing (this is irrelevant in case of 962 UDP because there, reliable data delivery is not assumed). 964 o Obtain multiple sockets 965 Protocols: SCTP 966 Automatable because the usage of multiple paths to communicate to 967 the same end host relates to knowledge about the network, not the 968 application. 970 o Disable MPTCP 971 Protocols: MPTCP 972 Automatable because the usage of multiple paths to communicate to 973 the same end host relates to knowledge about the network, not the 974 application. 975 Implementation: via a boolean parameter in CONNECT.MPTCP. 977 o Configure authentication 978 Protocols: TCP, SCTP 979 Functional because this has a direct influence on security. 980 Implementation: via parameters in CONNECT.TCP and CONNECT.SCTP. 981 Implementation over TCP: With TCP, this allows to configure Master 982 Key Tuples (MKTs) to authenticate complete segments (including the 983 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 984 allows to specify which chunk types must always be authenticated. 
985 Authenticating only certain chunk types creates a reduced level of 986 security that is not supported by TCP; to be compatible, this 987 should therefore only allow to authenticate all chunk types. Key 988 material must be provided in a way that is compatible with both 989 [RFC4895] and [RFC5925]. 990 Implementation over UDP: Not possible. 992 o Indicate (and/or obtain upon completion) an Adaptation Layer via 993 an adaptation code point 994 Protocols: SCTP 995 Functional because it allows to send extra data for the sake of 996 identifying an adaptation layer, which by itself is application- 997 specific. 998 Implementation: via a parameter in CONNECT.SCTP. 999 Implementation over TCP: not possible. 1000 Implementation over UDP: not possible. 1002 o Request to negotiate interleaving of user messages 1003 Protocols: SCTP 1004 Automatable because it requires using multiple streams, but 1005 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1006 category is automatable. 1007 Implementation: via a parameter in CONNECT.SCTP. 1009 o Hand over a message to reliably transfer (possibly multiple times) 1010 before connection establishment 1011 Protocols: TCP 1012 Functional because this is closely tied to properties of the data 1013 that an application sends or expects to receive. 1014 Implementation: via a parameter in CONNECT.TCP. 1015 Implementation over UDP: not possible. 1017 o Hand over a message to reliably transfer during connection 1018 establishment 1019 Protocols: SCTP 1020 Functional because this can only work if the message is limited in 1021 size, making it closely tied to properties of the data that an 1022 application sends or expects to receive. 1023 Implementation: via a parameter in CONNECT.SCTP. 1024 Implementation over UDP: not possible. 1026 o Enable UDP encapsulation with a specified remote UDP port number 1027 Protocols: SCTP 1028 Automatable because UDP encapsulation relates to knowledge about 1029 the network, not the application. 
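The Reduction step outlined at the start of this appendix (drop every transport feature that does not require application-specific knowledge or cannot be implemented with TCP) can be applied mechanically to a categorized list. The Python sketch below does this for a few of the ESTABLISHMENT entries above. The Feature record, the over_tcp flag, and reduce_features are hypothetical names for this illustration, and the output is not the reduced list of this document, which is derived in a later step.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Feature:
    name: str
    category: str   # "functional", "optimizing" or "automatable"
    over_tcp: bool  # True if the list above names TCP or a TCP implementation


# A subset of the ESTABLISHMENT list, transcribed from the entries above.
ESTABLISHMENT = [
    Feature("Connect", "functional", True),
    Feature("Specify which IP Options must always be used", "automatable", True),
    Feature("Request multiple streams", "automatable", False),
    Feature("Configure authentication", "functional", True),
    Feature("Indicate an Adaptation Layer via an adaptation code point",
            "functional", False),
]


def reduce_features(features: List[Feature]) -> List[Feature]:
    """Keep features that need application knowledge and can run over TCP."""
    return [f for f in features if f.category != "automatable" and f.over_tcp]


reduced = reduce_features(ESTABLISHMENT)
for feature in reduced:
    print(feature.name)
```

In this subset, only "Connect" and "Configure authentication" survive: the automatable entries are left to the TAPS system, and the adaptation code point is dropped because the list states that it cannot be implemented over TCP.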
1031 AVAILABILITY: 1033 o Listen 1034 Protocols: TCP, SCTP, UDP(-Lite) 1035 Functional because the notion of accepting connection requests is 1036 often reflected in applications as an expectation to be able to 1037 communicate after a "Listen" succeeded, with a communication 1038 sequence relating to this transport feature that is defined by the 1039 application protocol. 1040 ADDED. This differs from the 3 automatable transport features 1041 below in that it leaves the choice of interfaces for listening 1042 open. 1043 Implementation: by listening on all interfaces via LISTEN.TCP (not 1044 providing a local IP address) or LISTEN.SCTP (providing SCTP port 1045 number / address pairs for all local IP addresses). LISTEN.UDP(- 1046 Lite) supports both methods. 1048 o Listen, 1 specified local interface 1049 Protocols: TCP, SCTP, UDP(-Lite) 1050 Automatable because decisions about local interfaces relate to 1051 knowledge about the network and the Operating System, not the 1052 application. 1054 o Listen, N specified local interfaces 1055 Protocols: SCTP 1056 Automatable because decisions about local interfaces relate to 1057 knowledge about the network and the Operating System, not the 1058 application. 1060 o Listen, all local interfaces 1061 Protocols: TCP, SCTP, UDP(-Lite) 1062 Automatable because decisions about local interfaces relate to 1063 knowledge about the network and the Operating System, not the 1064 application. 1066 o Specify which IP Options must always be used 1067 Protocols: TCP, UDP(-Lite) 1068 Automatable because IP Options relate to knowledge about the 1069 network, not the application. 1071 o Disable MPTCP 1072 Protocols: MPTCP 1073 Automatable because the usage of multiple paths to communicate to 1074 the same end host relates to knowledge about the network, not the 1075 application. 1077 o Configure authentication 1078 Protocols: TCP, SCTP 1079 Functional because this has a direct influence on security. 
1080 Implementation: via parameters in LISTEN.TCP and LISTEN.SCTP. 1081 Implementation over TCP: With TCP, this allows to configure Master 1082 Key Tuples (MKTs) to authenticate complete segments (including the 1083 TCP IPv4 pseudoheader, TCP header, and TCP data). With SCTP, this 1084 allows to specify which chunk types must always be authenticated. 1085 Authenticating only certain chunk types creates a reduced level of 1086 security that is not supported by TCP; to be compatible, this 1087 should therefore only allow to authenticate all chunk types. Key 1088 material must be provided in a way that is compatible with both 1089 [RFC4895] and [RFC5925]. 1090 Implementation over UDP: not possible. 1092 o Obtain requested number of streams 1093 Protocols: SCTP 1094 Automatable because using multi-streaming does not require 1095 application-specific knowledge. 1096 Implementation: see Appendix A.3.2. 1098 o Limit the number of inbound streams 1099 Protocols: SCTP 1100 Automatable because using multi-streaming does not require 1101 application-specific knowledge. 1102 Implementation: see Appendix A.3.2. 1104 o Indicate (and/or obtain upon completion) an Adaptation Layer via 1105 an adaptation code point 1106 Protocols: SCTP 1107 Functional because it allows to send extra data for the sake of 1108 identifying an adaptation layer, which by itself is application- 1109 specific. 1110 Implementation: via a parameter in LISTEN.SCTP. 1111 Implementation over TCP: not possible. 1112 Implementation over UDP: not possible. 1114 o Request to negotiate interleaving of user messages 1115 Protocols: SCTP 1116 Automatable because it requires using multiple streams, but 1117 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1118 category is automatable. 1119 Implementation: via a parameter in LISTEN.SCTP. 
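The "Listen" entry above is implemented over TCP by not providing a local IP address. With the Berkeley sockets interface this corresponds to binding to the wildcard address. The Python sketch below shows this for IPv4 only, using the standard socket module; the function name listen_all_interfaces and its default arguments are assumptions made for this example, and a TAPS system would also have to cover IPv6 and the other protocols.

```python
import socket


def listen_all_interfaces(port: int = 0, backlog: int = 16) -> socket.socket:
    """Open a TCP listener on all local IPv4 interfaces (illustrative helper).

    port=0 lets the operating system choose a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # An empty host string stands for INADDR_ANY, i.e. no specific local
    # IP address is provided and all interfaces are used.
    sock.bind(("", port))
    sock.listen(backlog)
    return sock


listener = listen_all_interfaces()
local_address, local_port = listener.getsockname()
print(f"listening on {local_address}:{local_port}")
listener.close()
```

Because the interface choice is left open here, the three automatable "Listen" variants (1, N, or all specified local interfaces) do not need to be exposed to the application.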
1121 MAINTENANCE: 1123 o Change timeout for aborting connection (using retransmit limit or 1124 time value) 1125 Protocols: TCP, SCTP 1126 Functional because this is closely related to potentially assumed 1127 reliable data delivery. 1128 Implementation: via CHANGE-TIMEOUT.TCP or CHANGE-TIMEOUT.SCTP. 1129 Implementation over UDP: not possible (UDP is unreliable and there 1130 is no connection timeout). 1132 o Suggest timeout to the peer 1133 Protocols: TCP 1134 Functional because this is closely related to potentially assumed 1135 reliable data delivery. 1136 Implementation: via CHANGE-TIMEOUT.TCP. 1137 Implementation over UDP: not possible (UDP is unreliable and there 1138 is no connection timeout). 1140 o Disable Nagle algorithm 1141 Protocols: TCP, SCTP 1142 Optimizing because this decision depends on knowledge about the 1143 size of future data blocks and the delay between them. 1144 Implementation: via DISABLE-NAGLE.TCP and DISABLE-NAGLE.SCTP. 1145 Implementation over UDP: do nothing (UDP does not implement the 1146 Nagle algorithm). 1148 o Request an immediate heartbeat, returning success/failure 1149 Protocols: SCTP 1150 Automatable because this informs about network-specific knowledge. 1152 o Notification of Excessive Retransmissions (early warning below 1153 abortion threshold) 1154 Protocols: TCP 1155 Optimizing because it is an early warning to the application, 1156 informing it of an impending functional event. 1157 Implementation: via ERROR.TCP. 1158 Implementation over UDP: do nothing (there is no abortion 1159 threshold). 1161 o Add path 1162 Protocols: MPTCP, SCTP 1163 MPTCP Parameters: source-IP; source-Port; destination-IP; 1164 destination-Port 1165 SCTP Parameters: local IP address 1166 Automatable because the usage of multiple paths to communicate to 1167 the same end host relates to knowledge about the network, not the 1168 application. 
1170 o Remove path 1171 Protocols: MPTCP, SCTP 1172 MPTCP Parameters: source-IP; source-Port; destination-IP; 1173 destination-Port 1174 SCTP Parameters: local IP address 1175 Automatable because the usage of multiple paths to communicate to 1176 the same end host relates to knowledge about the network, not the 1177 application. 1179 o Set primary path 1180 Protocols: SCTP 1181 Automatable because the usage of multiple paths to communicate to 1182 the same end host relates to knowledge about the network, not the 1183 application. 1185 o Suggest primary path to the peer 1186 Protocols: SCTP 1187 Automatable because the usage of multiple paths to communicate to 1188 the same end host relates to knowledge about the network, not the 1189 application. 1191 o Configure Path Switchover 1192 Protocols: SCTP 1193 Automatable because the usage of multiple paths to communicate to 1194 the same end host relates to knowledge about the network, not the 1195 application. 1197 o Obtain status (query or notification) 1198 Protocols: SCTP, MPTCP 1199 SCTP parameters: association connection state; destination 1200 transport address list; destination transport address reachability 1201 states; current local and peer receiver window size; current local 1202 congestion window sizes; number of unacknowledged DATA chunks; 1203 number of DATA chunks pending receipt; primary path; most recent 1204 SRTT on primary path; RTO on primary path; SRTT and RTO on other 1205 destination addresses; MTU per path; interleaving supported yes/no 1206 MPTCP parameters: subflow-list (identified by source-IP; source- 1207 Port; destination-IP; destination-Port) 1208 Automatable because these parameters relate to knowledge about the 1209 network, not the application. 1211 o Specify DSCP field 1212 Protocols: TCP, SCTP, UDP(-Lite) 1213 Optimizing because choosing a suitable DSCP value requires 1214 application-specific knowledge. 
1215 Implementation: via SET_DSCP.TCP / SET_DSCP.SCTP / SET_DSCP.UDP(- 1216 Lite) 1218 o Notification of ICMP error message arrival 1219 Protocols: TCP, UDP(-Lite) 1220 Optimizing because these messages can inform about success or 1221 failure of functional transport features (e.g., host unreachable 1222 relates to "Connect") 1223 Implementation: via ERROR.TCP or ERROR.UDP(-Lite). 1225 o Obtain information about interleaving support 1226 Protocols: SCTP 1227 Automatable because it requires using multiple streams, but 1228 requesting multiple streams in the CONNECTION.ESTABLISHMENT 1229 category is automatable. 1230 Implementation: via a parameter in GETINTERL.SCTP. 1232 o Change authentication parameters 1233 Protocols: TCP, SCTP 1234 Functional because this has a direct influence on security. 1235 Implementation: via SET_AUTH.TCP and SET_AUTH.SCTP. 1236 Implementation over TCP: With SCTP, this allows to adjust key_id, 1237 key, and hmac_id. With TCP, this allows to change the preferred 1238 outgoing MKT (current_key) and the preferred incoming MKT 1239 (rnext_key), respectively, for a segment that is sent on the 1240 connection. Key material must be provided in a way that is 1241 compatible with both [RFC4895] and [RFC5925]. 1242 Implementation over UDP: not possible. 1244 o Obtain authentication information 1245 Protocols: SCTP 1246 Functional because authentication decisions may have been made by 1247 the peer, and this has an influence on the necessary application- 1248 level measures to provide a certain level of security. 1249 Implementation: via GETAUTH.SCTP. 1251 Implementation over TCP: With SCTP, this allows to obtain key_id 1252 and a chunk list. With TCP, this allows to obtain current_key and 1253 rnext_key from a previously received segment. Key material must 1254 be provided in a way that is compatible with both [RFC4895] and 1255 [RFC5925]. 1256 Implementation over UDP: not possible. 
1258 o Reset Stream 1259 Protocols: SCTP 1260 Automatable because using multi-streaming does not require 1261 application-specific knowledge. 1262 Implementation: see Appendix A.3.2. 1264 o Notification of Stream Reset 1265 Protocols: SCTP 1266 Automatable because using multi-streaming does not require 1267 application-specific knowledge. 1268 Implementation: see Appendix A.3.2. 1270 o Reset Association 1271 Protocols: SCTP 1272 Automatable because deciding to reset an association does not 1273 require application-specific knowledge. 1274 Implementation: via RESETASSOC.SCTP. 1276 o Notification of Association Reset 1277 Protocols: SCTP 1278 Automatable because this notification does not relate to 1279 application-specific knowledge. 1281 o Add Streams 1282 Protocols: SCTP 1283 Automatable because using multi-streaming does not require 1284 application-specific knowledge. 1285 Implementation: see Appendix A.3.2. 1287 o Notification of Added Stream 1288 Protocols: SCTP 1289 Automatable because using multi-streaming does not require 1290 application-specific knowledge. 1291 Implementation: see Appendix A.3.2. 1293 o Choose a scheduler to operate between streams of an association 1294 Protocols: SCTP 1295 Optimizing because the scheduling decision requires application- 1296 specific knowledge. However, if a TAPS system would not use this, 1297 or wrongly configure it on its own, this would only affect the 1298 performance of data transfers; the outcome would still be correct 1299 within the "best effort" service model. 1300 Implementation: using SETSTREAMSCHEDULER.SCTP. 1301 Implementation over TCP: do nothing. 1302 Implementation over UDP: do nothing. 1304 o Configure priority or weight for a scheduler 1305 Protocols: SCTP 1306 Optimizing because the priority or weight requires application- 1307 specific knowledge.
However, if a TAPS system would not use this, 1308 or wrongly configure it on its own, this would only affect the 1309 performance of data transfers; the outcome would still be correct 1310 within the "best effort" service model. 1311 Implementation: using CONFIGURESTREAMSCHEDULER.SCTP. 1312 Implementation over TCP: do nothing. 1313 Implementation over UDP: do nothing. 1315 o Configure send buffer size 1316 Protocols: SCTP 1317 Automatable because this decision relates to knowledge about the 1318 network and the Operating System, not the application (see also 1319 the discussion in Appendix A.3.4). 1321 o Configure receive buffer (and rwnd) size 1322 Protocols: SCTP 1323 Automatable because this decision relates to knowledge about the 1324 network and the Operating System, not the application. 1326 o Configure message fragmentation 1327 Protocols: SCTP 1328 Automatable because fragmentation relates to knowledge about the 1329 network and the Operating System, not the application. 1330 Implementation: by always enabling it with 1331 CONFIG_FRAGMENTATION.SCTP and auto-setting the fragmentation size 1332 based on network or Operating System conditions. 1334 o Configure PMTUD 1335 Protocols: SCTP 1336 Automatable because Path MTU Discovery relates to knowledge about 1337 the network, not the application. 1339 o Configure delayed SACK timer 1340 Protocols: SCTP 1341 Automatable because the receiver-side decision to delay sending 1342 SACKs relates to knowledge about the network, not the application 1343 (it can be relevant for a sending application to request not to 1344 delay the SACK of a message, but this is a different transport 1345 feature). 1347 o Set Cookie life value 1348 Protocols: SCTP 1349 Functional because it relates to security (possibly weakened by 1350 keeping a cookie very long) versus the time between connection 1351 establishment attempts. Knowledge about both issues can be 1352 application-specific. 
1353 Implementation over TCP: the closest specified TCP functionality 1354 is the cookie in TCP Fast Open; for this, [RFC7413] states that 1355 the server "can expire the cookie at any time to enhance security" 1356 and section 4.1.2 describes an example implementation where 1357 updating the key on the server side causes the cookie to expire. 1358 Alternatively, for implementations that do not support TCP Fast 1359 Open, this transport feature could also affect the validity of SYN 1360 cookies (see Section 3.6 of [RFC4987]). 1361 Implementation over UDP: do nothing. 1363 o Set maximum burst 1364 Protocols: SCTP 1365 Automatable because it relates to knowledge about the network, not 1366 the application. 1368 o Configure size where messages are broken up for partial delivery 1369 Protocols: SCTP 1370 Functional because this is closely tied to properties of the data 1371 that an application sends or expects to receive. 1372 Implementation over TCP: not possible. 1373 Implementation over UDP: not possible. 1375 o Disable checksum when sending 1376 Protocols: UDP 1377 Functional because application-specific knowledge is necessary to 1378 decide whether it can be acceptable to lose data integrity. 1379 Implementation: via SET_CHECKSUM_ENABLED.UDP. 1380 Implementation over TCP: do nothing. 1382 o Disable checksum requirement when receiving 1383 Protocols: UDP 1384 Functional because application-specific knowledge is necessary to 1385 decide whether it can be acceptable to lose data integrity. 1386 Implementation: via SET_CHECKSUM_REQUIRED.UDP. 1387 Implementation over TCP: do nothing. 1389 o Specify checksum coverage used by the sender 1390 Protocols: UDP-Lite 1391 Functional because application-specific knowledge is necessary to 1392 decide for which parts of the data it can be acceptable to lose 1393 data integrity. 1394 Implementation: via SET_CHECKSUM_COVERAGE.UDP-Lite. 1395 Implementation over TCP: do nothing. 
1397 o Specify minimum checksum coverage required by receiver 1398 Protocols: UDP-Lite 1399 Functional because application-specific knowledge is necessary to 1400 decide for which parts of the data it can be acceptable to lose 1401 data integrity. 1402 Implementation: via SET_MIN_CHECKSUM_COVERAGE.UDP-Lite. 1403 Implementation over TCP: do nothing. 1405 o Specify DF field 1406 Protocols: UDP(-Lite) 1407 Optimizing because the DF field can be used to carry out Path MTU 1408 Discovery, which can lead an application to choose message sizes 1409 that can be transmitted more efficiently. 1411 Implementation: via MAINTENANCE.SET_DF.UDP(-Lite) and 1412 SEND_FAILURE.UDP(-Lite). 1413 Implementation over TCP: do nothing. With TCP the sender is not 1414 in control of transport message sizes, making this functionality 1415 irrelevant. 1417 o Get max. transport-message size that may be sent using a non- 1418 fragmented IP packet from the configured interface 1419 Protocols: UDP(-Lite) 1420 Optimizing because this can lead an application to choose message 1421 sizes that can be transmitted more efficiently. 1422 Implementation over TCP: do nothing: this information is not 1423 available with TCP. 1425 o Get max. transport-message size that may be received from the 1426 configured interface 1427 Protocols: UDP(-Lite) 1428 Optimizing because this can, for example, influence an 1429 application's memory management. 1430 Implementation over TCP: do nothing: this information is not 1431 available with TCP. 1433 o Specify TTL/Hop count field 1434 Protocols: UDP(-Lite) 1435 Automatable because a TAPS system can use a large enough system 1436 default to avoid communication failures. Allowing an application 1437 to configure it differently can produce notifications of ICMP 1438 error message arrivals that yield information which only relates 1439 to knowledge about the network, not the application. 
1441 o Obtain TTL/Hop count field 1442 Protocols: UDP(-Lite) 1443 Automatable because the TTL/Hop count field relates to knowledge 1444 about the network, not the application. 1446 o Specify ECN field 1447 Protocols: UDP(-Lite) 1448 Automatable because the ECN field relates to knowledge about the 1449 network, not the application. 1451 o Obtain ECN field 1452 Protocols: UDP(-Lite) 1453 Optimizing because this information can be used by an application 1454 to better carry out congestion control (this is relevant when 1455 choosing a data transmission transport service that does not 1456 already do congestion control). 1457 Implementation over TCP: do nothing: this information is not 1458 available with TCP. 1460 o Specify IP Options 1461 Protocols: UDP(-Lite) 1462 Automatable because IP Options relate to knowledge about the 1463 network, not the application. 1465 o Obtain IP Options 1466 Protocols: UDP(-Lite) 1467 Automatable because IP Options relate to knowledge about the 1468 network, not the application. 1470 o Enable and configure a "Low Extra Delay Background Transfer" 1471 Protocols: A protocol implementing the LEDBAT congestion control 1472 mechanism 1473 Optimizing because whether this service is appropriate or not 1474 depends on application-specific knowledge. However, wrongly using 1475 this will only affect the speed of data transfers (albeit 1476 including other transfers that may compete with the TAPS transfer 1477 in the network), so it is still correct within the "best effort" 1478 service model. 1479 Implementation: via CONFIGURE.LEDBAT and/or SET_DSCP.TCP / 1480 SET_DSCP.SCTP / SET_DSCP.UDP(-Lite) [LBE-draft]. 1481 Implementation over TCP: do nothing. 1482 Implementation over UDP: do nothing. 
1484 TERMINATION: 1486 o Close after reliably delivering all remaining data, causing an 1487 event informing the application on the other side 1488 Protocols: TCP, SCTP 1489 Functional because the notion of a connection is often reflected 1490 in applications as an expectation to have all outstanding data 1491 delivered and no longer be able to communicate after a "Close" 1492 succeeded, with a communication sequence relating to this 1493 transport feature that is defined by the application protocol. 1494 Implementation: via CLOSE.TCP and CLOSE.SCTP. 1495 Implementation over UDP: not possible. 1497 o Abort without delivering remaining data, causing an event 1498 informing the application on the other side 1499 Protocols: TCP, SCTP 1500 Functional because the notion of a connection is often reflected 1501 in applications as an expectation to potentially not have all 1502 outstanding data delivered and no longer be able to communicate 1503 after an "Abort" succeeded. On both sides of a connection, an 1504 application protocol may define a communication sequence relating 1505 to this transport feature. 1506 Implementation: via ABORT.TCP and ABORT.SCTP. 1507 Implementation over UDP: not possible. 1509 o Abort without delivering remaining data, not causing an event 1510 informing the application on the other side 1511 Protocols: UDP(-Lite) 1512 Functional because the notion of a connection is often reflected 1513 in applications as an expectation to potentially not have all 1514 outstanding data delivered and no longer be able to communicate 1515 after an "Abort" succeeded. On both sides of a connection, an 1516 application protocol may define a communication sequence relating 1517 to this transport feature. 1518 Implementation: via ABORT.UDP(-Lite). 1519 Implementation over TCP: stop using the connection, wait for a 1520 timeout. 
   o  Timeout event when data could not be delivered for too long
      Protocols: TCP, SCTP
      Functional because this notifies that potentially assumed reliable
      data delivery is no longer provided.
      Implementation: via TIMEOUT.TCP and TIMEOUT.SCTP.
      Implementation over UDP: do nothing: this event will not occur
      with UDP.

A.1.2.  DATA Transfer Related Transport Features

A.1.2.1.  Sending Data

   o  Reliably transfer data, with congestion control
      Protocols: TCP, SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.TCP and SEND.SCTP.
      Implementation over UDP: not possible.

   o  Reliably transfer a message, with congestion control
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.SCTP.
      Implementation over TCP: via SEND.TCP.  With SEND.TCP, messages
      will not be identifiable by the receiver.
      Implementation over UDP: not possible.

   o  Unreliably transfer a message
      Protocols: SCTP, UDP(-Lite)
      Optimizing because only applications know about the time
      criticality of their communication, and reliably transferring a
      message is never incorrect for the receiver of a potentially
      unreliable data transfer; it is just slower.
      ADDED.  This differs from the 2 automatable transport features
      below in that it leaves the choice of congestion control open.
      Implementation: via SEND.SCTP or SEND.UDP(-Lite).
      Implementation over TCP: use SEND.TCP.  With SEND.TCP, messages
      will be sent reliably, and they will not be identifiable by the
      receiver.

   o  Unreliably transfer a message, with congestion control
      Protocols: SCTP
      Automatable because congestion control relates to knowledge about
      the network, not the application.

   o  Unreliably transfer a message, without congestion control
      Protocols: UDP(-Lite)
      Automatable because congestion control relates to knowledge about
      the network, not the application.

   o  Configurable Message Reliability
      Protocols: SCTP
      Optimizing because only applications know about the time
      criticality of their communication, and reliably transferring a
      message is never incorrect for the receiver of a potentially
      unreliable data transfer; it is just slower.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP and ignoring this
      configuration: based on the assumption of the best-effort service
      model, unnecessarily delivering data does not violate application
      expectations.  Moreover, it is not possible to associate the
      requested reliability to a "message" in TCP anyway.
      Implementation over UDP: not possible.

   o  Choice of stream
      Protocols: SCTP
      Automatable because it requires using multiple streams, but
      requesting multiple streams in the CONNECTION.ESTABLISHMENT
      category is automatable.
      Implementation: see Appendix A.3.2.

   o  Choice of path (destination address)
      Protocols: SCTP
      Automatable because it requires using multiple sockets, but
      obtaining multiple sockets in the CONNECTION.ESTABLISHMENT
      category is automatable.

   o  Ordered message delivery (potentially slower than unordered)
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via SEND.SCTP.
      Implementation over TCP: By using SEND.TCP.  With SEND.TCP,
      messages will not be identifiable by the receiver.
      Implementation over UDP: not possible.
1608 o Unordered message delivery (potentially faster than ordered) 1609 Protocols: SCTP, UDP(-Lite) 1610 Functional because this is closely tied to properties of the data 1611 that an application sends or expects to receive. 1612 Implementation: via SEND.SCTP. 1613 Implementation over TCP: By using SEND.TCP and always sending data 1614 ordered: based on the assumption of the best-effort service model, 1615 ordered delivery may just be slower and does not violate 1616 application expectations. Moreover, it is not possible to 1617 associate the requested delivery order to a "message" in TCP 1618 anyway. 1620 o Request not to bundle messages 1621 Protocols: SCTP 1622 Optimizing because this decision depends on knowledge about the 1623 size of future data blocks and the delay between them. 1624 Implementation: via SEND.SCTP. 1625 Implementation over TCP: By using SEND.TCP and DISABLE-NAGLE.TCP 1626 to disable the Nagle algorithm when the request is made and enable 1627 it again when the request is no longer made. Note that this is 1628 not fully equivalent because it relates to the time of issuing the 1629 request rather than a specific message. 1630 Implementation over UDP: do nothing (UDP never bundles messages). 1632 o Specifying a "payload protocol-id" (handed over as such by the 1633 receiver) 1634 Protocols: SCTP 1635 Functional because it allows to send extra application data with 1636 every message, for the sake of identification of data, which by 1637 itself is application-specific. 1638 Implementation: SEND.SCTP. 1639 Implementation over TCP: not possible. 1640 Implementation over UDP: not possible. 1642 o Specifying a key id to be used to authenticate a message 1643 Protocols: SCTP 1644 Functional because this has a direct influence on security. 1645 Implementation: via a parameter in SEND.SCTP. 1646 Implementation over TCP: This could be emulated by using 1647 SET_AUTH.TCP before and after the message is sent. 
      Note that this is not fully equivalent because it relates to the
      time of issuing the request rather than a specific message.
      Implementation over UDP: not possible.

   o  Request not to delay the acknowledgement (SACK) of a message
      Protocols: SCTP
      Optimizing because only an application knows for which message it
      wants to quickly be informed about success / failure of its
      delivery.
      Implementation over TCP: do nothing.
      Implementation over UDP: do nothing.

A.1.2.2.  Receiving Data

   o  Receive data (with no message delimiting)
      Protocols: TCP
      Functional because a TAPS system must be able to send and receive
      data.
      Implementation: via RECEIVE.TCP.
      Implementation over UDP: do nothing (hand over a message, let the
      application ignore message boundaries).

   o  Receive a message
      Protocols: SCTP, UDP(-Lite)
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via RECEIVE.SCTP and RECEIVE.UDP(-Lite).
      Implementation over TCP: not possible.

   o  Choice of stream to receive from
      Protocols: SCTP
      Automatable because it requires using multiple streams, but
      requesting multiple streams in the CONNECTION.ESTABLISHMENT
      category is automatable.
      Implementation: see Appendix A.3.2.

   o  Information about partial message arrival
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation: via RECEIVE.SCTP.
      Implementation over TCP: do nothing: this information is not
      available with TCP.
      Implementation over UDP: do nothing: this information is not
      available with UDP.

A.1.2.3.  Errors

   This section describes sending failures that are associated with a
   specific call in the "Sending Data" category (Appendix A.1.2.1).

   o  Notification of send failures
      Protocols: SCTP, UDP(-Lite)
      Functional because this notifies that potentially assumed reliable
      data delivery is no longer provided.
      ADDED.  This differs from the 2 automatable transport features
      below in that it does not distinguish between unsent and
      unacknowledged messages.
      Implementation: via SENDFAILURE-EVENT.SCTP and
      SEND_FAILURE.UDP(-Lite).
      Implementation over TCP: do nothing: this notification is not
      available and will therefore not occur with TCP.

   o  Notification of an unsent (part of a) message
      Protocols: SCTP, UDP(-Lite)
      Automatable because the distinction between unsent and
      unacknowledged is network-specific.

   o  Notification of an unacknowledged (part of a) message
      Protocols: SCTP
      Automatable because the distinction between unsent and
      unacknowledged is network-specific.

   o  Notification that the stack has no more user data to send
      Protocols: SCTP
      Optimizing because reacting to this notification requires the
      application to be involved, and ensuring that the stack does not
      run dry of data (for too long) can improve performance.
      Implementation over TCP: do nothing.  See also the discussion in
      Appendix A.3.4.
      Implementation over UDP: do nothing.  This notification is not
      available and will therefore not occur with UDP.

   o  Notification to a receiver that a partial message delivery has
      been aborted
      Protocols: SCTP
      Functional because this is closely tied to properties of the data
      that an application sends or expects to receive.
      Implementation over TCP: do nothing.  This notification is not
      available and will therefore not occur with TCP.
      Implementation over UDP: do nothing.  This notification is not
      available and will therefore not occur with UDP.

A.2.  Step 2: Reduction -- The Reduced Set of Transport Features

   By hiding automatable transport features from the application, a TAPS
   system can gain opportunities to automate the usage of
   network-related functionality.  This can make the TAPS system easier
   to use for the application programmer, and it allows for
   optimizations that may not be possible for an application.  For
   instance, system-wide configurations regarding the usage of multiple
   interfaces can better be exploited if the choice of the interface is
   not entirely up to the application.  Therefore, since they are not
   strictly necessary to expose in a TAPS system, we do not include
   automatable transport features in the reduced set of transport
   features.  This leaves us with only the transport features that are
   either optimizing or functional.

   A TAPS system should be able to communicate via TCP or UDP if
   alternative transport protocols are found not to work.  For many
   transport features, this is possible -- often by simply not doing
   anything when a specific request is made.  For some transport
   features, however, it was identified that direct usage of neither TCP
   nor UDP is possible: in these cases, even not doing anything would
   incur semantically incorrect behavior.  Whenever an application would
   make use of one of these transport features, this would eliminate the
   possibility to use TCP or UDP.  Thus, we only keep the functional and
   optimizing transport features for which an implementation over either
   TCP or UDP is possible in our reduced set.

   In the following list, we precede a transport feature with "T:" if an
   implementation over TCP is possible, "U:" if an implementation over
   UDP is possible, and "T,U:" if an implementation over both TCP and
   UDP is possible.

A.2.1.  CONNECTION Related Transport Features

   ESTABLISHMENT:

   o  T,U: Connect
   o  T,U: Specify number of attempts and/or timeout for the first
      establishment message
   o  T: Configure authentication
   o  T: Hand over a message to reliably transfer (possibly multiple
      times) before connection establishment
   o  T: Hand over a message to reliably transfer during connection
      establishment

   AVAILABILITY:

   o  T,U: Listen
   o  T: Configure authentication

   MAINTENANCE:

   o  T: Change timeout for aborting connection (using retransmit limit
      or time value)
   o  T: Suggest timeout to the peer
   o  T,U: Disable Nagle algorithm
   o  T,U: Notification of Excessive Retransmissions (early warning
      below abortion threshold)
   o  T,U: Specify DSCP field
   o  T,U: Notification of ICMP error message arrival
   o  T: Change authentication parameters
   o  T: Obtain authentication information
   o  T,U: Set Cookie life value
   o  T,U: Choose a scheduler to operate between streams of an
      association
   o  T,U: Configure priority or weight for a scheduler
   o  T,U: Disable checksum when sending
   o  T,U: Disable checksum requirement when receiving
   o  T,U: Specify checksum coverage used by the sender
   o  T,U: Specify minimum checksum coverage required by receiver
   o  T,U: Specify DF field
   o  T,U: Get max. transport-message size that may be sent using a
      non-fragmented IP packet from the configured interface
   o  T,U: Get max. transport-message size that may be received from the
      configured interface
   o  T,U: Obtain ECN field
   o  T,U: Enable and configure a "Low Extra Delay Background Transfer"

   TERMINATION:

   o  T: Close after reliably delivering all remaining data, causing an
      event informing the application on the other side
   o  T: Abort without delivering remaining data, causing an event
      informing the application on the other side
   o  T,U: Abort without delivering remaining data, not causing an event
      informing the application on the other side
   o  T,U: Timeout event when data could not be delivered for too long

A.2.2.  DATA Transfer Related Transport Features

A.2.2.1.  Sending Data

   o  T: Reliably transfer data, with congestion control
   o  T: Reliably transfer a message, with congestion control
   o  T,U: Unreliably transfer a message
   o  T: Configurable Message Reliability
   o  T: Ordered message delivery (potentially slower than unordered)
   o  T,U: Unordered message delivery (potentially faster than ordered)
   o  T,U: Request not to bundle messages
   o  T: Specifying a key id to be used to authenticate a message
   o  T,U: Request not to delay the acknowledgement (SACK) of a message

A.2.2.2.  Receiving Data

   o  T,U: Receive data (with no message delimiting)
   o  U: Receive a message
   o  T,U: Information about partial message arrival

A.2.2.3.  Errors

   This section describes sending failures that are associated with a
   specific call in the "Sending Data" category (Appendix A.1.2.1).

   o  T,U: Notification of send failures
   o  T,U: Notification that the stack has no more user data to send
   o  T,U: Notification to a receiver that a partial message delivery
      has been aborted

A.3.  Step 3: Discussion

   The reduced set in the previous section exhibits a number of
   peculiarities, which we will discuss in the following.
This section 1864 focuses on TCP because, with the exception of one particular 1865 transport feature ("Receive a message" -- we will discuss this in 1866 Appendix A.3.1), the list shows that UDP is strictly a subset of TCP. 1867 We can first try to understand how to build a TAPS system that can 1868 run over TCP, and then narrow down the result further to allow that 1869 the system can always run over either TCP or UDP (which effectively 1870 means removing everything related to reliability, ordering, 1871 authentication and closing/aborting with a notification to the peer). 1873 Note that, because the functional transport features of UDP are -- 1874 with the exception of "Receive a message" -- a subset of TCP, TCP can 1875 be used as a replacement for UDP whenever an application does not 1876 need message delimiting (e.g., because the application-layer protocol 1877 already does it). This has been recognized by many applications that 1878 already do this in practice, by trying to communicate with UDP at 1879 first, and falling back to TCP in case of a connection failure. 1881 A.3.1. Sending Messages, Receiving Bytes 1883 For implementing a TAPS system over TCP, there are several transport 1884 features related to sending, but only a single transport feature 1885 related to receiving: "Receive data (with no message delimiting)" 1886 (and, strangely, "information about partial message arrival"). 1887 Notably, the transport feature "Receive a message" is also the only 1888 non-automatable transport feature of UDP(-Lite) for which no 1889 implementation over TCP is possible. 1891 To support these TCP receiver semantics, we define an "Application- 1892 Framed Bytestream" (AFra-Bytestream). AFra-Bytestreams allow senders 1893 to operate on messages while minimizing changes to the TCP socket 1894 API. In particular, nothing changes on the receiver side - data can 1895 be accepted via a normal TCP socket. 
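As a rough, non-normative illustration of an AFra-Bytestream over TCP, the Python sketch below lets a sender hand over whole messages together with per-message properties, while the receiver simply reads bytes from a stream socket. The names AFraSender, frame and deframe, the 4-byte length prefix used for application-layer framing, and the use of socket.socketpair() as a local stand-in for a TCP connection are all assumptions made for this example; none of them are defined by this document.

```python
import socket
import struct

# Hypothetical application-layer framing: a 4-byte big-endian length prefix.
LENGTH_PREFIX = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Done by the sending application: make the message self-delimiting."""
    return LENGTH_PREFIX.pack(len(payload)) + payload


class AFraSender:
    """Hypothetical sender-side wrapper that operates on whole messages."""

    def __init__(self, sock: socket.socket) -> None:
        self._sock = sock

    def send_message(self, message: bytes, *, ordered: bool = True,
                     reliable: bool = True) -> None:
        # Over a TCP-like byte stream the per-message properties cannot be
        # honoured, so they are ignored: the message is sent reliably and in
        # order, and it is always kept intact.
        self._sock.sendall(message)


def deframe(stream: bytes) -> list[bytes]:
    """Done by the receiving application: split the byte stream into messages."""
    messages = []
    offset = 0
    while offset + LENGTH_PREFIX.size <= len(stream):
        (length,) = LENGTH_PREFIX.unpack_from(stream, offset)
        start = offset + LENGTH_PREFIX.size
        if start + length > len(stream):
            break  # incomplete trailing message; wait for more bytes
        messages.append(stream[start:start + length])
        offset = start + length
    return messages


def demo() -> list[bytes]:
    # socketpair() gives a connected pair of stream sockets (stand-in for TCP).
    tx, rx = socket.socketpair()
    with tx, rx:
        sender = AFraSender(tx)
        sender.send_message(frame(b"hello"), ordered=False)
        sender.send_message(frame(b"world!"), reliable=False)
        tx.shutdown(socket.SHUT_WR)
        received = bytearray()
        # Receiver side: an ordinary stream read, with no notion of messages.
        while chunk := rx.recv(4096):
            received += chunk
    return deframe(bytes(received))
```

In this sketch the per-message properties are accepted but ignored, which corresponds to the TCP fallback: every message arrives reliably and in order, and delimiting the messages is left entirely to the receiving application.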
1897 In an AFra-Bytestream, the sending application can optionally inform 1898 the transport about message boundaries and required properties per 1899 message (configurable order and reliability, or embedding a request 1900 not to delay the acknowledgement of a message). Whenever the sending 1901 application specifies per-message properties that relax the notion of 1902 reliable in-order delivery of bytes, it must assume that the 1903 receiving application is 1) able to determine message boundaries, 1904 provided that messages are always kept intact, and 2) able to accept 1905 these relaxed per-message properties. Any signaling of such 1906 information to the peer is up to an application-layer protocol and 1907 considered out of scope of this document. 1909 For example, if an application requests to transfer fixed-size 1910 messages of 100 bytes with partial reliability, this needs the 1911 receiving application to be prepared to accept data in chunks of 100 1912 bytes. If, then, some of these 100-byte messages are missing (e.g., 1913 if SCTP with Configurable Reliability is used), this is the expected 1914 application behavior. With TCP, no messages would be missing, but 1915 this is also correct for the application, and the possible 1916 retransmission delay is acceptable within the best effort service 1917 model [RFC7305]. Still, the receiving application would separate the 1918 byte stream into 100-byte chunks. 1920 Note that this usage of messages does not require all messages to be 1921 equal in size. Many application protocols use some form of Type- 1922 Length-Value (TLV) encoding, e.g. by defining a header including 1923 length fields; another alternative is the use of byte stuffing 1924 methods such as COBS [COBS]. If an application needs message 1925 numbers, e.g. 
to restore the correct sequence of messages, these must 1926 also be encoded by the application itself, as the sequence number 1927 related transport features of SCTP are not provided by the "minimum 1928 set" (in the interest of enabling usage of TCP). 1930 !!!NOTE: IMPLEMENTATION DETAILS BELOW WILL BE MOVED TO A SEPARATE 1931 DRAFT IN A FUTURE VERSION.!!! 1933 For the implementation of a TAPS system, this has the following 1934 consequences: 1936 o Because the receiver-side transport leaves it up to the 1937 application to delimit messages, messages must always remain 1938 intact as they are handed over by the transport receiver. Data 1939 can be handed over at any time as they arrive, but the byte stream 1940 must never "skip ahead" to the beginning of the next message. 1941 o With SCTP, a "partial flag" informs a receiving application that a 1942 message is incomplete. Then, the next receive calls will only 1943 deliver remaining parts of the same message (i.e., no messages or 1944 partial messages will arrive on other streams until the message is 1945 complete) (see Section 8.1.20 in [RFC6458]). This can facilitate 1946 the implementation of the receiver buffer in the receiving 1947 application, but then such an application does not support message 1948 interleaving (which is required by stream schedulers). However, 1949 receiving a byte stream from multiple SCTP streams requires a per- 1950 stream receiver buffer anyway, so this potential benefit is lost 1951 and the "partial flag" (the transport feature "Information about 1952 partial message arrival") becomes unnecessary for a TAPS system. 1953 With it, the transport feature "Notification to a receiver that a 1954 partial message delivery has been aborted" becomes unnecessary 1955 too. 1956 o From the above, a TAPS system should always support message 1957 interleaving because it enables the use of stream schedulers and 1958 comes at no additional implementation cost on the receiver side. 
      Stream schedulers operate on the sender side.  Hence, because a
      TAPS sender-side application may talk to an SCTP receiver that
      does not support interleaving, it cannot assume that stream
      schedulers will always work as expected.

A.3.2.  Stream Schedulers Without Streams

   We have already stated that multi-streaming does not require
   application-specific knowledge.  Potential benefits or disadvantages
   of, e.g., using two streams of an SCTP association versus using two
   separate SCTP associations or TCP connections are related to
   knowledge about the network and the particular transport protocol in
   use, not the application.  However, the transport features "Choose a
   scheduler to operate between streams of an association" and
   "Configure priority or weight for a scheduler" operate on streams.
   Here, streams identify communication channels between which a
   scheduler operates, and they can be assigned a priority.  Moreover,
   the transport features in the MAINTENANCE category all operate on
   associations in case of SCTP, i.e. they apply to all streams in that
   association.

   Because only these semantics need to be represented, the interface
   to a TAPS system becomes simpler if we assume that a TAPS connection
   may be a transport connection or association, but could also be a
   stream of an existing SCTP association, for example.  We only need to
   allow for a way to define a possible grouping of TAPS connections.
   Then, all MAINTENANCE transport features can be said to operate on
   TAPS connection groups, not TAPS connections, and a scheduler
   operates on the connections within a group.

   !!!NOTE: IMPLEMENTATION DETAILS BELOW WILL BE MOVED TO A SEPARATE
   DRAFT IN A FUTURE VERSION.!!!

   For the implementation of a TAPS system, this has the following
   consequences:

   o  Streams may be identified in different ways across different
      protocols.  The only multi-streaming protocol considered in this
      document, SCTP, uses a stream id.  The transport association below
      still uses a Transport Address (which includes one port number)
      for each communicating endpoint.  To implement a TAPS system
      without exposed streams, an application must be given an
      identifier for each TAPS connection (akin to a socket), and
      depending on whether streams are used or not, there will be a 1:1
      mapping between this identifier and local ports or not.
   o  In SCTP, a fixed number of streams exists from the beginning of an
      association; streams are not "established", there is no handshake
      or any other form of signaling to create them: they can just be
      used.  They are also not "gracefully shut down" -- at best, an
      "SSN Reset Request Parameter" in a "RE-CONFIG" chunk [RFC6525] can
      be used to inform the peer of a "Stream Reset", as a rough
      equivalent of an "Abort".  This has an impact on the semantics of
      connection establishment and teardown (see Section 3.1).
   o  To support stream schedulers, a receiver-side TAPS system should
      always support message interleaving because it comes at no
      additional implementation cost (because of the receiver-side
      stream reception discussed in Appendix A.3.1).  Note, however,
      that stream schedulers operate on the sender side.  Hence, because
      a TAPS sender-side application may talk to a native TCP-based
      receiver-side application, it cannot assume that stream schedulers
      will always work as expected.

   To be compatible with multiple transport protocols and uniformly
   allow access to both transport connections and streams of a
   multi-streaming protocol, the semantics of opening and closing need
   to be the most restrictive subset of all of the underlying options.
For 2026 example, TCP's support of half-closed connections can be seen as a 2027 feature on top of the more restrictive "ABORT"; this feature cannot 2028 be supported because not all protocols used by a TAPS system 2029 (including streams of an association) support half-closed 2030 connections. 2032 A.3.3. Early Data Transmission 2034 There are two transport features related to transferring a message 2035 early: "Hand over a message to reliably transfer (possibly multiple 2036 times) before connection establishment", which relates to TCP Fast 2037 Open [RFC7413], and "Hand over a message to reliably transfer during 2038 connection establishment", which relates to SCTP's ability to 2039 transfer data together with the COOKIE-Echo chunk. Also without TCP 2040 Fast Open, TCP can transfer data during the handshake, together with 2041 the SYN packet -- however, the receiver of this data may not hand it 2042 over to the application until the handshake has completed. Also, 2043 different from TCP Fast Open, this data is not delimited as a message 2044 by TCP (thus, not visible as a ``message''). This functionality is 2045 commonly available in TCP and supported in several implementations, 2046 even though the TCP specification does not explain how to provide it 2047 to applications. 2049 A TAPS system could differentiate between the cases of transmitting 2050 data "before" (possibly multiple times) or "during" the handshake. 2051 Alternatively, it could also assume that data that are handed over 2052 early will be transmitted as early as possible, and "before" the 2053 handshake would only be used for messages that are explicitly marked 2054 as "idempotent" (i.e., it would be acceptable to transfer them 2055 multiple times). 2057 The amount of data that can successfully be transmitted before or 2058 during the handshake depends on various factors: the transport 2059 protocol, the use of header options, the choice of IPv4 and IPv6 and 2060 the Path MTU. 
   A TAPS system should therefore allow a sending application to query
   the maximum amount of data it can possibly transmit before (or, if
   exposed, during) connection establishment.

A.3.4. Sender Running Dry

   The transport feature "Notification that the stack has no more user
   data to send" relates to SCTP's "SENDER DRY" notification. Such
   notifications can, in principle, be used to avoid having an
   unnecessarily large send buffer, yet ensure that the transport
   sender always has data available when it has an opportunity to
   transmit it. This has been found to be very beneficial for some
   applications [WWDC2015]. However, "SENDER DRY" truly means that the
   entire send buffer (including both unsent and unacknowledged data)
   has emptied -- i.e., when it notifies the sender, it is already too
   late: the transport protocol has already missed an opportunity to
   send data. Some modern TCP implementations now include the
   unspecified "TCP_NOTSENT_LOWAT" socket option that was proposed in
   [WWDC2015], which limits the amount of unsent data that TCP can keep
   in the socket buffer; this makes it possible to specify at which
   buffer filling level the socket becomes writable, rather than
   waiting for the buffer to run empty.

   SCTP also allows the sender-side buffer to be configured: the
   automatable Transport Feature "Configure send buffer size" provides
   this functionality, but only for the complete buffer, which includes
   both unsent and unacknowledged data. SCTP does not allow these two
   sizes to be controlled separately. It therefore makes sense for a
   TAPS system to allow for uniform access to "TCP_NOTSENT_LOWAT" as
   well as the "SENDER DRY" notification.

A.3.5. Capacity Profile

   The transport features:

   o  Disable Nagle algorithm
   o  Enable and configure a "Low Extra Delay Background Transfer"
   o  Specify DSCP field

   all relate to a QoS-like application need such as "low latency" or
   "scavenger". In the interest of flexibility of a TAPS system, they
   could therefore be offered in a uniform, more abstract way, where a
   TAPS system could e.g. decide by itself how to use combinations of
   LEDBAT-like congestion control and certain DSCP values, and an
   application would only specify a general "capacity profile" (a
   description of how it wants to use the available capacity). A need
   for "lowest possible latency at the expense of overhead" could then
   translate into automatically disabling the Nagle algorithm.

   In some cases, the Nagle algorithm is best controlled directly by
   the application because it is not only related to a general profile
   but also to knowledge about the size of future messages. For
   fine-grained control over Nagle-like functionality, the "Request not
   to bundle messages" transport feature is available.

A.3.6. Security

   Both TCP and SCTP offer authentication. TCP authenticates complete
   segments. SCTP allows configuring which of its chunk types must
   always be authenticated -- if this is exposed as such, it creates an
   undesirable dependency on the transport protocol. For compatibility
   with TCP, a TAPS system should only allow authentication to be
   configured for complete transport layer packets, including headers,
   IP pseudo-header (if any) and payload.

   Security is discussed in a separate TAPS document
   [I-D.pauly-taps-transport-security].
   The minimal set presented in the present document therefore excludes
   all security related transport features: "Configure authentication",
   "Change authentication parameters", "Obtain authentication
   information" and "Set Cookie life value" as well as "Specifying a
   key id to be used to authenticate a message".

A.3.7. Packet Size

   UDP(-Lite) has a transport feature called "Specify DF field". This
   yields an error message when a message is sent that exceeds the Path
   MTU, which is necessary for a UDP-based application to be able to
   implement Path MTU Discovery (a function that UDP-based applications
   must perform by themselves). The "Get max. transport-message size
   that may be sent using a non-fragmented IP packet from the
   configured interface" transport feature yields an upper limit for
   the Path MTU (minus headers) and can therefore help to implement
   Path MTU Discovery more efficiently.

Appendix B. Revision information

   XXX RFC-Ed please remove this section prior to publication.

   -02: implementation suggestions added, discussion section added,
   terminology extended, DELETED category removed, various other fixes;
   list of Transport Features adjusted to -01 version of [TAPS2] except
   that MPTCP is not included.

   -03: updated to be consistent with -02 version of [TAPS2].

   -04: updated to be consistent with -03 version of [TAPS2].
   Reorganized document, rewrote intro and conclusion, and made a first
   stab at creating a real "minimal set".

   -05: updated to be consistent with -05 version of [TAPS2] (minor
   changes). Fixed a mistake regarding Cookie Life value. Exclusion of
   security related transport features (to be covered in a separate
   document). Reorganized the document (now begins with the minset,
   derivation is in the appendix). First stab at an abstract API for
   the minset.

   draft-ietf-taps-minset-00: updated to be consistent with -08 version
   of [TAPS2] ("obtain message delivery number" was removed, as this
   has also been removed in [TAPS2] because it was a mistake in
   RFC4960. This led to the removal of two more transport features
   that were only designated as functional because they affected
   "obtain message delivery number"). Fall-back to UDP incorporated
   (this was requested at IETF-99); this also affected the transport
   feature "Choice between unordered (potentially faster) or ordered
   delivery of messages" because this is a boolean which is always true
   for one fall-back protocol, and always false for the other one.
   This was therefore now divided into two features, one for ordered,
   one for unordered delivery. The word "reliably" was added to the
   transport features "Hand over a message to reliably transfer
   (possibly multiple times) before connection establishment" and "Hand
   over a message to reliably transfer during connection establishment"
   to make it clearer why this is not supported by UDP. Clarified that
   the "minset abstract interface" is not proposing a specific API for
   all TAPS systems to implement, but it is just a way to describe the
   minimum set. Author order changed.

   draft-ietf-taps-minset-01: "fall-back to" (TCP or UDP) replaced
   (mostly with "implementation over"). References to post-sockets
   removed (these were statements that assumed that post-sockets
   requires two-sided implementation). Replaced "flow" with "TAPS
   Connection" and "frame" with "message" to avoid introducing new
   terminology. Made sections 3 and 4 in line with the categorization
   that is already used in the appendix and [TAPS2], and changed style
   of section 4 to be even shorter and less interface-like. Updated
   reference draft-ietf-tsvwg-sctp-ndata to RFC8260.

Authors' Addresses

   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo N-0316
   Norway

   Phone: +47 22 85 24 20
   Email: michawe@ifi.uio.no

   Stein Gjessing
   University of Oslo
   PO Box 1080 Blindern
   Oslo N-0316
   Norway

   Phone: +47 22 85 24 44
   Email: steing@ifi.uio.no