Transport Area Working Group                                    S. Baset
Internet-Draft                                            H. Schulzrinne
Intended status: Experimental                        Columbia University
Expires: December 9, 2009                                   June 7, 2009


                              TCP-over-UDP
                   draft-baset-tsvwg-tcp-over-udp-01

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.  This document may contain material
   from IETF Documents or IETF Contributions published or made publicly
   available before November 10, 2008.  The person(s) controlling the
   copyright in some of this material may not have granted the IETF
   Trust the right to allow modifications of such material outside the
   IETF Standards Process.
   Without obtaining an adequate license from
   the person(s) controlling the copyright in such materials, this
   document may not be modified outside the IETF Standards Process, and
   derivative works of it may not be created outside the IETF Standards
   Process, except to format it for publication as an RFC or to
   translate it into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 9, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   We present TCP-over-UDP (ToU), an instance of TCP on top of UDP.  It
   provides exactly the same congestion control, flow control,
   reliability, and extension mechanisms as offered by TCP.  It is
   intended for use in scenarios where applications running on two hosts
   may not be able to establish a direct TCP connection but are able to
   exchange UDP packets.

Table of Contents

   1.  Introduction
     1.1.  Conventions
     1.2.  Terminology
   2.  Model of Operation
     2.1.  Setup and tear down
     2.2.  Connection tracking
     2.3.  MTU discovery
   3.  Congestion Control, Flow Control, and Reliability
     3.1.  Explicit Congestion Notification (ECN)
   4.  Header Format
   5.  NAT related issues
     5.1.  Using ToU
     5.2.  NAT bindings
   6.  ToU, TLS, and DTLS
   7.  Implementation Guidelines
   8.  Design Alternatives
     8.1.  Changing IP protocol number
     8.2.  Simplified TCP
     8.3.  TCP-like mechanism within an application layer protocol
     8.4.  Tunneling
     8.5.  TFRC
     8.6.  SCTP
     8.7.  Criticism
   9.  Acknowledgements
   10. IANA Considerations
   11. Security Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Change Log
     A.1.  Changes since draft-baset-tsvwg-tcp-over-udp-00
   Authors' Addresses

1.  Introduction

   Network address translators (NATs) pose a challenge for establishing
   a direct TCP connection between hosts.  While TCP connectivity works
   when a TCP client is behind a NAT device and the server is not, it is
   problematic when both the TCP client and server are behind different
   NAT devices.  Thus, applications running on hosts behind different
   NAT devices may not be able to establish a direct TCP connection with
   each other.  Instead, these applications must establish a TCP
   connection with a reachable host, which relays the traffic of the
   application on the first host to the application on the second host
   and vice versa.  While this works, it is undesirable because it
   creates a dependency on a reachable host.  With certain NAT types,
   even though the applications cannot establish a direct TCP
   connection, they may be able to exchange UDP traffic by using
   techniques such as ICE-UDP [I-D.ietf-mmusic-ice].  Thus, using UDP is
   attractive for such applications as it removes the dependency on a
   reachable host.  However, these applications require that the
   underlying transport be reliable.  Further, these applications may
   run on machines with heterogeneous network connectivity, thereby
   requiring flow control.  UDP does not provide reliability, congestion
   control, or flow control semantics.  Therefore, these applications
   must either use TCP with a reachable host, or invent their own
   transport protocol that provides reliability, congestion control, and
   flow control in order to establish a direct connection.

   We present TCP-over-UDP (ToU), a transport protocol on top of UDP
   that provides reliability, congestion control, and flow control.
   The idea is that TCP
   is a well-designed transport protocol that provides reliability,
   congestion control, and flow control mechanisms, and these mechanisms
   must be reused as much as possible.  Further, a transport protocol
   that provides reliability and flow control mechanisms must not be
   tied to a specific application and must be designed to provide
   modular functionality.  To accomplish this, ToU uses almost the same
   header as TCP, which makes it easy to incorporate TCP's reliability
   and congestion control algorithms as defined in the TCP congestion
   control document [I-D.ietf-tcpm-rfc2581bis].  In essence, ToU is not
   a new protocol but merely an instance (or profile) of TCP over UDP
   minus the TCP checksum, urgent flag, and urgent data.

   We think that our approach is attractive for several reasons.  First,
   we are not proposing a new congestion control algorithm.  Designing
   new congestion control algorithms is complex and requires a large
   validation effort.  Second, our approach takes advantage of existing
   user-level TCP implementations (such as Daytona [Daytona] and MINET
   [MINET]) or TCP-over-UDP implementations (such as atou [atou]).
   Finally, since we are replicating TCP semantics over UDP, any TCP
   options such as window scaling [RFC1323], the selective
   acknowledgement option (SACK) [RFC2018], or proposed TCP options such
   as TCP-Auth [I-D.ietf-tcpm-tcp-auth-opt] can be easily incorporated
   in ToU without a new standardization effort.

1.1.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.2.  Terminology

   We use terms such as congestion window (cwnd), initial window (IW),
   restart window (RW), receiver window (rwnd), and sender maximum
   segment size (SMSS) as defined in the TCP congestion control document
   [I-D.ietf-tcpm-rfc2581bis].

2.  Model of Operation

   Like TCP, ToU has a client and a server.  A client connects to a ToU
   server to establish a ToU connection.  Below, we describe the key ToU
   operations.

2.1.  Setup and tear down

   Like TCP, ToU uses a three-way handshake to establish a connection.
   Similarly, it follows TCP's semantics in tearing down the connection.

2.2.  Connection tracking

   A key difference between TCP and UDP is that the former is
   connection-oriented whereas the latter is not.  This means that a ToU
   server must provide a way to keep track of existing connections.  It
   does so through the source port and IP address of the UDP packet.

2.3.  MTU discovery

   ToU uses packetization layer path MTU discovery [RFC4821] to discover
   the path MTU.

   Some NAT devices placed in front of PPPoE devices perform MSS
   clamping, i.e., they lower the MSS option in a TCP SYN packet so that
   segments fit a 1492-byte MTU instead of a 1500-byte MTU (for IPv4
   without IP or TCP options, an MSS of 1452 bytes rather than 1460
   bytes).  This operation is performed because PPPoE has an MTU of 1492
   bytes instead of Ethernet's 1500 bytes.  MSS clamping is considered a
   'faster' way of discovering the MTU in such scenarios.  MSS clamping
   does not work for ToU because NAT devices treat ToU packets as a
   stream of UDP packets.  It is an open question how a ToU stack should
   deal with the PPPoE MTU if faster MTU discovery is desired.  One
   option is to configure the ToU stack with a default MTU of 1492
   bytes.

3.  Congestion Control, Flow Control, and Reliability

   ToU follows the TCP congestion control algorithms described in the
   TCP congestion control document [I-D.ietf-tcpm-rfc2581bis].  Thus, a
   ToU sender goes through the slow-start and congestion-avoidance
   phases.
   A ToU sender starts with an initial window (IW) following the
   guidelines in RFC 3390 [RFC3390].  During slow start, a ToU sender
   increments the congestion window (cwnd) by at most SMSS bytes for
   each ACK received that cumulatively acknowledges new data.  It
   switches to congestion avoidance when cwnd exceeds the slow start
   threshold (ssthresh).  A ToU receiver generates an acknowledgement
   following the guidelines in Section 4.2 of the TCP congestion control
   document [I-D.ietf-tcpm-rfc2581bis].  It immediately generates an ACK
   when an out-of-order segment arrives.  The ToU sender uses the fast
   retransmit algorithm to detect and repair losses, and the fast
   recovery algorithm to govern the transmission of new data until a
   non-duplicate ACK arrives.  When a ToU sender has not received a
   segment for more than one retransmission timeout (RTO), cwnd is
   reduced to the value of the restart window (RW) before transmission
   begins.  The ToU sender may also use the selective acknowledgement
   option (SACK) [RFC2018] to improve loss recovery when multiple
   packets are lost from one window of data.  Like TCP, it uses the
   receiver window (rwnd) to achieve flow control.

3.1.  Explicit Congestion Notification (ECN)

   TCP-over-UDP operates above UDP.  To use ECN [RFC3168] with ToU, a
   UDP socket must allow ToU to set and retrieve the ECN bits in the IP
   header.  Currently, UDP sockets do not provide such a mechanism.
   However, ToU assumes that in the future UDP sockets will provide this
   mechanism so that ECN can be incorporated in the congestion control
   mechanism of ToU.

   ToU endpoints also need to determine whether they both support ECN.
   For this purpose, the ToU header includes the ECE and CWR flags that
   RFC 3168 [RFC3168] defines for TCP.

4.  Header Format

   A ToU header is like a TCP header [RFC0793] except that it does not
   include the source port, destination port, and checksum, as they are
   already included in the UDP header.  The ToU header also does not
   include the 1-bit Urgent flag; the bit corresponding to this flag is
   reserved in the ToU header.  Further, it does not include the 16-bit
   Urgent Pointer.  The reason for excluding the Urgent flag and Urgent
   Pointer is that they are only used in Telnet [RFC0854], which is not
   a widely used protocol.

   Between the sequence number and the acknowledgement number, the ToU
   header has a 32-bit magic cookie to demultiplex it from other UDP-
   based protocols such as STUN [RFC5389].  A ToU header also includes
   the ECE and CWR flags for negotiating ECN capabilities.  These flags
   are defined in RFC 3168 [RFC3168].  The rest of the fields in a ToU
   header have exactly the same meaning as those in a TCP header.  The
   size of the fixed ToU header is 16 bytes, whereas the size of the
   fixed TCP header is 20 bytes.  The fixed ToU header and the 8-byte
   UDP header have a cumulative size of 24 bytes, four more than a fixed
   TCP header.
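   As an illustration only, the fixed header described in this section
   could be packed and parsed roughly as follows.  This is a minimal
   sketch, not part of the specification: the function and constant
   names are hypothetical, the cookie value and the order of the control
   bits are taken from the field descriptions in this section, and a
   real stack would add TCP option parsing and connection state.

```python
import struct

# Fixed ToU header: sequence number, magic cookie, acknowledgment number,
# data offset + reserved, control bits, window (all names are illustrative).
TOU_HEADER = struct.Struct("!IIIBBH")   # network byte order, 16 bytes
TOU_MAGIC_COOKIE = 0x7194B32E
UDP_HEADER_LEN = 8
TCP_FIXED_HEADER_LEN = 20

# Control bits, left to right; 0x20 (TCP's Urgent position) stays reserved.
CWR, ECE, ACK, PSH, RST, SYN, FIN = 0x80, 0x40, 0x10, 0x08, 0x04, 0x02, 0x01


def pack_tou_header(seq, ack, flags, window, options=b""):
    """Build a ToU header; options are zero-padded to a 32-bit boundary."""
    if flags & 0x20:
        raise ValueError("the reserved (former Urgent) bit must be zero")
    padded = options + b"\x00" * (-len(options) % 4)
    data_offset = (TOU_HEADER.size + len(padded)) // 4   # in 32-bit words
    if data_offset > 0xF:
        raise ValueError("options too long for the 4-bit data offset")
    return TOU_HEADER.pack(seq, TOU_MAGIC_COOKIE, ack,
                           data_offset << 4, flags, window) + padded


def parse_tou_packet(udp_payload):
    """Return the parsed fields, or None if the payload is not ToU."""
    if len(udp_payload) < TOU_HEADER.size:
        return None
    seq, cookie, ack, off_res, flags, window = TOU_HEADER.unpack_from(udp_payload)
    if cookie != TOU_MAGIC_COOKIE:
        return None          # some other protocol sharing the UDP port
    header_len = (off_res >> 4) * 4
    if header_len < TOU_HEADER.size or header_len > len(udp_payload):
        return None
    return {
        "seq": seq, "ack": ack, "flags": flags, "window": window,
        "options": udp_payload[TOU_HEADER.size:header_len],
        "data": udp_payload[header_len:],
    }
```

   In this sketch, pack_tou_header(1, 0, SYN, 65535) yields a 16-byte
   header, and parse_tou_packet() returns None for any UDP payload whose
   second 32-bit word is not the ToU magic cookie, which is how a
   receiver sharing the port with another protocol could tell the two
   apart.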
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Magic Cookie                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Acknowledgment Number                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Data  |       |C|E| |A|P|R|S|F|                               |
   | Offset|Reserve|W|C|R|C|S|S|Y|I|            Window             |
   |       |       |R|E| |K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Header for TCP-over-UDP (ToU)

                                Figure 1

   Since ToU header fields are exactly the same as TCP's, we have
   borrowed their descriptions from the TCP RFC [RFC0793].

   Sequence Number (32 bits):  Same as a TCP sequence number.

   Magic Cookie (32 bits):  A fixed value of 0x7194B32E in network byte
      order to demultiplex ToU from other application layer protocols.

   Acknowledgement Number (32 bits):  Same as a TCP acknowledgement
      number.

   Data Offset (4 bits):  The number of 32-bit words in the ToU header.
      Like a TCP header, the ToU header is an integral number of 32-bit
      words long.

   Reserved (4 bits):  Reserved for future use.  Must be zero.

   Control Bits (8 bits):  Eight flags, listed from left to right.
      Unlike TCP, the Urgent bit is excluded.

      CWR:  Congestion window reduced flag as defined in RFC 3168
         [RFC3168].

      ECE:  ECN-Echo flag as defined in RFC 3168 [RFC3168].

      R:  Reserved in ToU.  In the TCP header, this position holds the
         Urgent bit.

      ACK:  Acknowledgment field significant.

      PSH:  Push function.

      RST:  Reset the connection.

      SYN:  Synchronize sequence numbers.

      FIN:  No more data from sender.

   Window (16 bits):  Same as the window in the TCP header.
      The number of
      data octets, beginning with the one indicated in the
      acknowledgment field, which the sender of this segment is willing
      to accept.

   Options:  Same as TCP options.

   Padding:  Like TCP, the ToU header padding is used to ensure that the
      ToU header ends and data begins on a 32-bit boundary.  The padding
      is composed of zeros.

5.  NAT related issues

   This section discusses how to determine if hosts should use ToU and
   the impact of UDP NAT bindings on ToU connection management.

5.1.  Using ToU

   Hosts should only use ToU when establishing a direct TCP connection
   fails.  It is outside the scope of this draft to specify a mechanism
   to determine if establishing a TCP connection fails between two hosts
   behind NATs.  Hosts may use ICE-TCP [I-D.ietf-mmusic-ice-tcp] and
   ICE-UDP [I-D.ietf-mmusic-ice] to determine if they can directly
   establish a TCP connection or directly exchange UDP packets,
   respectively.  If hosts fail to establish a direct TCP connection but
   are able to directly exchange UDP packets, they can establish a ToU
   connection.

5.2.  NAT bindings

   NAT devices maintain a binding for mapping an internal IP address and
   port number to an external IP address and port number.  The lifetime
   of bindings for UDP is much shorter than for TCP because UDP is a
   connectionless protocol.  If an application does not send packets
   over ToU for longer than the binding lifetime, the UDP binding may be
   lost, resulting in a broken ToU connection.

   ToU does not provide any mechanism to determine UDP binding lifetimes
   or to refresh these bindings.  Rather, an application establishing a
   ToU connection can use STUN [RFC5389] to discover binding lifetimes
   [I-D.ietf-behave-nat-behavior-discovery] and periodically refresh
   these bindings.  Running STUN in conjunction with ToU has the design
   implication that a ToU packet must be differentiated from a STUN
   packet.
   The magic cookie in a ToU packet
   serves this purpose.

6.  ToU, TLS, and DTLS

   The Transport Layer Security (TLS) [RFC5246] and Datagram Transport
   Layer Security (DTLS) [RFC4347] protocols provide privacy and data
   integrity between two communicating applications.  TLS is layered on
   top of some reliable transport protocol such as TCP, whereas DTLS
   only assumes a datagram service.  The question is what the layering
   relationship between the ToU protocol, TLS, and DTLS should be.
   Figure 2 shows three possible options.  Option-3 is not feasible
   since the ToU layer must be made aware of the size of the header
   which DTLS may add.  Option-2 layers DTLS on top of ToU.  Unlike TLS,
   DTLS carries a sequence number because it assumes a datagram service.
   However, this sequence number is redundant because ToU provides
   reliable and in-order delivery semantics.  Therefore, Option-1, in
   which TLS is layered on top of ToU, is the most feasible.

   +-+-+-+-+  +-+-+-+-+  +-+-+-+-+
   |  TLS  |  | DTLS  |  |  ToU  |
   +-+-+-+-+  +-+-+-+-+  +-+-+-+-+
   |  ToU  |  |  ToU  |  | DTLS  |
   +-+-+-+-+  +-+-+-+-+  +-+-+-+-+
   |  UDP  |  |  UDP  |  |  UDP  |
   +-+-+-+-+  +-+-+-+-+  +-+-+-+-+
   Option-1   Option-2   Option-3

                  Layering options for ToU, TLS, DTLS

                                Figure 2

7.  Implementation Guidelines

   From the implementer's perspective, the use of ToU should be as
   modular as possible.  One way to achieve this modularity is to
   implement ToU as a user-level library that provides socket-like
   function calls to the applications.  The library may have its own
   thread of execution and can be instantiated at the start of the
   program.  The library implements the reliable, in-order delivery,
   congestion control, and flow control semantics of TCP.  Applications
   can interact with the ToU library through socket-like function calls.

8.  Design Alternatives

   ToU is strictly meant for scenarios where end-points desire to
   establish a TCP connection but are unable to do so due to the
   presence of NATs and firewalls.  Below, we briefly discuss the design
   alternatives and address possible criticisms of ToU.

8.1.  Changing IP protocol number

   One solution is to change the IP protocol number of TCP packets to
   UDP before sending them on the wire.  Similarly, when the packets are
   received, the protocol number is changed back to TCP and the received
   packets are passed to the TCP stack.  The idea behind this approach
   is to reuse the TCP stack as much as possible.  This approach suffers
   from a number of problems.  First, it requires a change in the
   operating system kernel to rewrite the IP protocol number of TCP
   packets to UDP, and it is unrealistic to expect all OS kernels to
   implement this change.  Second, the TCP checksum has a different
   offset than the UDP checksum, and many NAT devices parsing the UDP
   packet will reject the packet because the UDP checksum is incorrect.
   Third, since applications can use the same port number for TCP and
   UDP ports, it is unclear how the kernel will correctly differentiate
   between TCP and UDP packets for the same port number.

8.2.  Simplified TCP

   It may be argued that TCP semantics are too complicated and it might
   be easier to define a protocol that adds retransmission of individual
   UDP packets, ACK mechanisms, and a sequencing layer.  However, unless
   one is content with stop-and-wait congestion control (and roughly
   modem data rates), it is necessary for a transport protocol to have
   AIMD or rate-based congestion control (TFRC).  As discussed in
   Section 8.5, rate-based congestion control is not suitable for mid-
   sized transfers and is not any simpler than AIMD.  Further, since
   hosts may have heterogeneous network connectivity, a transport
   protocol needs to provide flow control.
   Moreover, it may not be easy
   to validate a new transport protocol that provides only a selected
   subset of TCP semantics.

8.3.  TCP-like mechanism within an application layer protocol

   In this approach, key TCP mechanisms such as reliability, congestion
   control, and flow control are designed as part of the application
   layer protocol.  This approach has several disadvantages.  First,
   every application layer protocol that is unable to establish TCP
   connections in the presence of NATs and firewalls but may use UDP
   will need to invent its own transport protocol providing reliability,
   congestion control, and flow control.  Second, it is non-trivial to
   get the first implementations of a conceptually new protocol right.
   Third, any new transport protocol, even if it is specified within an
   application layer protocol, must undergo a large validation effort.
   Finally, most long-term successful protocols are those that provide
   modular functionality, not extremely narrowly tailored protocols.

8.4.  Tunneling

   Another design option is to provide a VPN-like tunnel for sending and
   receiving TCP packets over UDP.  The idea is to use tunneling
   solutions between hosts so that hosts can use the kernel TCP stack
   and unmodified socket function calls.

   This approach is not desirable for several reasons.  First, tunneling
   solutions typically require support from the kernel or require kernel
   upgrades to work.  Requiring kernel upgrades is not plausible for an
   application that is trying to get deployment traction.  Second,
   establishing a tunnel typically requires root access to the system,
   and it is unrealistic for user-space applications to require root
   access for proper functioning.  Third, peer-to-peer applications,
   which are expected to use ToU, establish a large number of
   connections with other hosts.
   Even if a tunneling solution does
   not require any kernel support, such a solution consumes significant
   bandwidth and CPU resources to maintain a large number of tunnels
   with other hosts.  Popular P2P applications such as Skype and
   BitTorrent do not take advantage of a layer-3 tunneling solution.

8.5.  TFRC

   TFRC [RFC5348] is a congestion control mechanism (not a protocol)
   that is designed for long-lived media streams.  Its main benefit is
   smoothing the sending rate of these media streams.  It does not
   provide any packet formats, reliability, or flow control.  Its
   congestion control mechanism is not suited for exchanging data
   objects that range from a few dozen to a few hundred packets.  The
   reason is that TFRC is based on estimating loss rates within 8 loss
   intervals.  With a loss rate of 1%, i.e., about one loss per 100
   packets, this translates, very roughly, into 800 packets, or roughly
   800 kB assuming packets of about 1 kB, before a reliable estimate of
   a better (higher) rate is computed.  Further, its main benefit,
   smoothing rates, is of no importance to applications desiring to
   replicate TCP functionality over UDP.

8.6.  SCTP

   SCTP [RFC4960] is significantly more complicated than TCP in its
   implementation and its performance is generally the same, except in
   circumstances involving head-of-line blocking.  Further, SCTP will
   have trouble getting traction in the consumer and enterprise Internet
   space unless it (also) runs over UDP, as there seem to be few NATs
   that know how to handle SCTP and thus it is effectively unusable by a
   fair fraction of the Internet user population.

8.7.  Criticism

   A criticism of the ToU approach is that it is deceptively simple to
   describe but difficult to implement and is likely to suffer from
   broken implementations.  We think that this assertion is not valid
   for three reasons.
   First, ToU does not define a new congestion
   control protocol and thus stays away from all the validation issues
   associated with a new congestion control protocol.  Second, a
   reasonable implementation approach is to first implement connection
   management and AIMD congestion control and test it against regular
   TCP to determine if the implemented congestion control mechanisms are
   broken.  This implementation can be followed by implementing TCP
   options such as window scaling and SACK.  Third, ToU, like other
   protocols such as SIP, will be implemented as a module or library and
   is likely to mature over time.

9.  Acknowledgements

   The draft incorporates comments from the discussion on the TSVWG and
   P2PSIP mailing lists.  We also acknowledge an earlier draft by R.
   Denis-Courmont on UDP transports.

10.  IANA Considerations

   TBD.

11.  Security Considerations

   ToU is subject to the same security considerations as TCP.

12.  References

12.1.  Normative References

   [I-D.ietf-tcpm-rfc2581bis]
              Paxson, V., Blanton, E., and M. Allman, "TCP Congestion
              Control", draft-ietf-tcpm-rfc2581bis-05 (work in
              progress), May 2009.

   [I-D.ietf-tcpm-tcp-auth-opt]
              Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", draft-ietf-tcpm-tcp-auth-opt-04
              (work in progress), March 2009.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [RFC0854]  Postel, J. and J. Reynolds, "Telnet Protocol
              Specification", STD 8, RFC 854, May 1983.

   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122, October 1989.

   [RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
              for High Performance", RFC 1323, May 1992.

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018, October 1996.
   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC3390]  Allman, M., Floyd, S., and C. Partridge, "Increasing TCP's
              Initial Window", RFC 3390, October 2002.

   [RFC4347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
              Security", RFC 4347, April 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, March 2007.

   [RFC4960]  Stewart, R., "Stream Control Transmission Protocol",
              RFC 4960, September 2007.

   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 5348, September 2008.

   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
              October 2008.

12.2.  Informative References

   [Daytona]  Pradhan, P., Kandula, S., Xu, W., Sheikh, A., and E.
              Nahum, "Daytona : A User-Level TCP Stack", 2004.

   [I-D.ietf-behave-nat-behavior-discovery]
              MacDonald, D. and B. Lowekamp, "NAT Behavior Discovery
              Using STUN", draft-ietf-behave-nat-behavior-discovery-06
              (work in progress), March 2009.

   [I-D.ietf-mmusic-ice]
              Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols",
              draft-ietf-mmusic-ice-19 (work in progress), October 2007.

   [I-D.ietf-mmusic-ice-tcp]
              Rosenberg, J., "TCP Candidates with Interactive
              Connectivity Establishment (ICE)",
              draft-ietf-mmusic-ice-tcp-07 (work in progress),
              July 2008.

   [MINET]    Dinda, P., "The Minet TCP/IP Stack", 2002.
   [atou]     Dunigan, T. and F. Fowler, "A TCP-over-UDP Test Harness",
              2002.

Appendix A.  Change Log

A.1.  Changes since draft-baset-tsvwg-tcp-over-udp-00

   o  Updated introduction to reflect that it is difficult for two hosts
      behind two different NATs to establish a TCP connection.

   o  Added PSH bit.

   o  Added MTU discovery to model of operation section.

   o  Added text on ECN to congestion control section.

   o  Added a section on NAT related issues.

   o  Updated text in design alternatives section.

Authors' Addresses

   Salman A. Baset
   Columbia University
   1214 Amsterdam Avenue
   New York, NY
   USA

   Email: salman@cs.columbia.edu


   Henning Schulzrinne
   Columbia University
   1214 Amsterdam Avenue
   New York, NY
   USA

   Email: hgs@cs.columbia.edu