Transport Area Working Group                                    S. Baset
Internet-Draft                                            H. Schulzrinne
Intended status: Experimental                        Columbia University
Expires: November 8, 2009                                   May 07, 2009


TCP-over-UDP
draft-baset-tsvwg-tcp-over-udp-00

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on November 8, 2009.

Copyright Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Abstract

We present TCP-over-UDP (ToU), an instance of TCP on top of UDP. It provides exactly the same congestion control, flow control, reliability, and extension mechanisms as offered by TCP. It is intended for use in scenarios where applications running on two hosts may not be able to establish a direct TCP connection but are able to exchange UDP packets.



Table of Contents

1.  Introduction
    1.1.  Conventions
    1.2.  Terminology
2.  Model of Operation
    2.1.  Setup and tear down
    2.2.  Connection tracking
3.  Congestion Control, Flow Control, and Reliability
4.  Header Format
5.  ToU, TLS, and DTLS
6.  Implementation Guidelines
7.  Design Alternatives
    7.1.  Simplified TCP
    7.2.  TCP-like mechanism within an application layer protocol
    7.3.  Tunneling
    7.4.  TFRC
    7.5.  SCTP
8.  Acknowledgements
9.  IANA Considerations
10.  Security Considerations
11.  References
    11.1.  Normative References
    11.2.  Informative References
§  Authors' Addresses





1.  Introduction

Applications running on hosts behind restrictive network address translators (NATs) may not be able to establish a direct TCP connection with each other. Instead, these applications must each establish a TCP connection with a reachable host, which relays traffic from the application on the first host to the application on the second host and vice versa. While this works, it is undesirable because it creates a dependency on a reachable host. With certain NAT types, even though the applications cannot establish a direct TCP connection, they may be able to exchange UDP traffic by using techniques such as ICE-UDP [I-D.ietf-mmusic-ice]. Thus, using UDP is attractive for such applications as it removes the dependency on a reachable host. However, these applications require that the underlying transport be reliable. Further, these applications may run on machines with heterogeneous network connectivity, thereby requiring flow control. UDP does not provide reliability, congestion control, or flow control semantics. Therefore, these applications must either use TCP with a reachable host, or invent their own transport protocol providing reliability, congestion control, and flow control in order to establish a direct connection.

We present TCP-over-UDP (ToU), a transport protocol on top of UDP that provides reliability, congestion control, and flow control. The idea is that TCP is a well-designed transport protocol that already provides these mechanisms, and they must be reused as much as possible. Further, a transport protocol that provides reliability and flow control mechanisms must not be tied to a specific application and must be designed to provide modular functionality. To accomplish this, ToU uses almost the same header as TCP, which makes it easy to incorporate TCP's reliability and congestion control algorithms as defined in the TCP congestion control document [I-D.ietf-tcpm-rfc2581bis]. In essence, ToU is not a new protocol but merely an instance (or profile) of TCP over UDP minus the TCP checksum, urgent data, and PSH flag.

We think that our approach is attractive for several reasons. First, we are not proposing a new congestion control algorithm. Designing new congestion control algorithms is complex and requires a large validation effort. Second, our approach takes advantage of existing user-level TCP implementations (such as Daytona [Daytona] and MINET [MINET]) or TCP-over-UDP implementations (such as atou [atou]). Finally, since we are replicating TCP semantics over UDP, including the TCP header, any TCP option such as the selective acknowledgement (SACK) option [RFC2018], or any proposed TCP option such as TCP-Auth [I-D.ietf-tcpm-tcp-auth-opt], can be easily incorporated in ToU without a new standardization effort.




1.1.  Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].




1.2.  Terminology

We use terms such as congestion window (cwnd), initial window (IW), restart window (RW), receiver window (rwnd), and sender maximum segment size (SMSS) as defined in the TCP congestion control document [I-D.ietf-tcpm-rfc2581bis].




2.  Model of Operation

Below, we describe the key ToU operations.




2.1.  Setup and tear down

Like TCP, ToU uses a three-way handshake to establish a connection. Similarly, it follows TCP's semantics in tearing down the connection.
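
As an illustration only, the following Python sketch lists the three segments of the handshake as they would appear in ToU. The names Segment and three_way_handshake are hypothetical and not part of this document; the flag values assume the same bit positions as in the TCP header.

```python
from typing import List, NamedTuple

SYN, ACK = 0x02, 0x10  # assumed to keep the TCP bit positions


class Segment(NamedTuple):
    sender: str
    flags: int
    seq: int
    ack: int


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments that open a connection, in order."""
    mod = 2 ** 32  # sequence numbers wrap at 32 bits
    return [
        # 1. The client sends SYN carrying its initial sequence number.
        Segment("client", SYN, client_isn, 0),
        # 2. The server answers with SYN+ACK, acknowledging the client's SYN.
        Segment("server", SYN | ACK, server_isn, (client_isn + 1) % mod),
        # 3. The client acknowledges the server's SYN.
        Segment("client", ACK, (client_isn + 1) % mod, (server_isn + 1) % mod),
    ]
```

Each SYN consumes one sequence number, which is why both acknowledgement numbers are the peer's initial sequence number plus one.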




2.2.  Connection tracking

A key difference between TCP and UDP is that the former is connection-oriented whereas the latter is not. This means that a ToU server must provide a way to keep track of existing connections. It does so through the source port and IP address of the UDP packet.
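
A minimal sketch of such a connection table is shown below, assuming a Python implementation. The class names Connection and ConnectionTable and their fields are hypothetical; only the use of the source IP address and port as the lookup key comes from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Peer = Tuple[str, int]  # (source IP address, source UDP port)


@dataclass
class Connection:
    """Per-peer ToU state; the fields here are illustrative only."""
    peer: Peer
    state: str = "LISTEN"
    received: list = field(default_factory=list)


class ConnectionTable:
    """Maps the source address of a UDP datagram to its ToU connection."""

    def __init__(self) -> None:
        self._by_peer: Dict[Peer, Connection] = {}

    def lookup_or_create(self, peer: Peer) -> Connection:
        # The UDP source (IP, port) pair is the only key.
        conn = self._by_peer.get(peer)
        if conn is None:
            conn = Connection(peer=peer)
            self._by_peer[peer] = conn
        return conn

    def remove(self, peer: Peer) -> None:
        # Called after tear down so that the entry does not linger.
        self._by_peer.pop(peer, None)

    def __len__(self) -> int:
        return len(self._by_peer)
```

With an IPv4 UDP socket, the address returned by recvfrom() is already an (IP, port) pair and could be passed to lookup_or_create() directly.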




3.  Congestion Control, Flow Control, and Reliability

ToU follows the TCP congestion control algorithms described in the TCP congestion control document [I-D.ietf-tcpm-rfc2581bis]. Thus, a ToU sender goes through the slow-start and congestion-avoidance phases. A ToU sender starts with an initial window (IW) following the guidelines in RFC 3390 [RFC3390]. During slow start, a ToU sender increments the congestion window (cwnd) by at most SMSS bytes for each ACK received that cumulatively acknowledges new data. It switches to congestion avoidance when cwnd exceeds the slow start threshold (ssthresh). A ToU receiver generates acknowledgements following the guidelines in Section 4.2 of the TCP congestion control document [I-D.ietf-tcpm-rfc2581bis]. It immediately generates an ACK when an out-of-order segment arrives. The ToU sender uses the fast retransmit algorithm to detect and repair losses, and the fast recovery algorithm to govern the transmission of new data until a non-duplicate ACK arrives. When a ToU sender has not received a segment for more than one retransmission timeout (RTO), cwnd is reduced to the value of the restart window (RW) before transmission begins. The ToU sender may also use the selective acknowledgement (SACK) option [RFC2018] to improve loss recovery when multiple packets are lost from one window of data. Like TCP, ToU uses the receiver window (rwnd) to achieve flow control.
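
The window arithmetic can be sketched as follows. This is an illustrative Python fragment, not a complete sender: the names initial_window and CongestionState are hypothetical, the initial window bound is the one given in RFC 3390, and the timeout handling (ssthresh set to half the flight size but at least two segments, cwnd set to one segment) is taken from the TCP congestion control document rather than from the text above. Fast retransmit, fast recovery, and SACK are omitted.

```python
from dataclasses import dataclass


def initial_window(smss: int) -> int:
    """Upper bound on the initial window from RFC 3390, in bytes."""
    return min(4 * smss, max(2 * smss, 4380))


@dataclass
class CongestionState:
    smss: int      # sender maximum segment size, bytes
    cwnd: int      # congestion window, bytes
    ssthresh: int  # slow start threshold, bytes

    def on_new_ack(self, bytes_acked: int) -> None:
        """Update cwnd for an ACK that cumulatively acknowledges new data."""
        if self.cwnd <= self.ssthresh:
            # Slow start: grow by at most SMSS bytes per ACK.
            self.cwnd += min(bytes_acked, self.smss)
        else:
            # Congestion avoidance: roughly one SMSS per round-trip time,
            # rounded up to one byte if the integer formula yields zero.
            self.cwnd += max(1, self.smss * self.smss // self.cwnd)

    def on_timeout(self, flight_size: int) -> None:
        """Retransmission timeout: halve ssthresh, restart from one segment."""
        self.ssthresh = max(flight_size // 2, 2 * self.smss)
        self.cwnd = self.smss
```

For example, a sender with an SMSS of 1460 bytes would start with cwnd = initial_window(1460), which is 4380 bytes, and an effectively unlimited ssthresh.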




4.  Header Format

The ToU header is like a TCP header [RFC0793] except that it does not include the source port, destination port, and checksum, as they are already included in the UDP header. The ToU header also does not include the 1-bit PSH flag and the 1-bit Urgent flag; the bits corresponding to these flags are reserved in the ToU header. Further, it does not include the 16-bit Urgent Pointer. Between the sequence number and the acknowledgement number, we have inserted a 32-bit magic cookie that allows ToU to be demultiplexed from other UDP-based protocols such as STUN [RFC5389]. The rest of the fields in a ToU header have exactly the same meaning as those in a TCP header. The size of the fixed ToU header is 16 bytes, whereas the size of the fixed TCP header is 20 bytes. The fixed ToU header and the UDP header have a cumulative size of 24 bytes, four more than a fixed TCP header.




    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Magic Cookie                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |             |A| |R|S|F|                               |
   | Offset|  Reserved   |C|R|S|Y|I|            Window             |
   |       |             |K| |T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Header for TCP-over-UDP (ToU)

 Figure 1 
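
The magic cookie occupies the second 32-bit word of the header. As a sketch of how a receiver sharing a UDP port with STUN might use it, the Python fragment below inspects that word. The function name classify_datagram is hypothetical. The ToU cookie value 0x7194B32E is the one defined in this section; the STUN cookie value 0x2112A442 and its position in the second 32-bit word of a STUN message are taken from RFC 5389.

```python
import struct

TOU_MAGIC_COOKIE = 0x7194B32E   # ToU Magic Cookie field value
STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined in RFC 5389


def classify_datagram(payload: bytes) -> str:
    """Return "tou", "stun", or "unknown" for a received UDP payload."""
    if len(payload) < 8:
        return "unknown"
    # Both protocols carry their cookie in the second 32-bit word,
    # in network byte order.
    (cookie,) = struct.unpack_from("!I", payload, 4)
    if cookie == TOU_MAGIC_COOKIE:
        return "tou"
    if cookie == STUN_MAGIC_COOKIE:
        return "stun"
    return "unknown"
```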

Since the ToU header fields are exactly the same as those of TCP, we have borrowed their descriptions from the TCP RFC [RFC0793].

Sequence Number (32-bits):
    Same as a TCP sequence number.
Magic Cookie (32-bits):
    A fixed value of 0x7194B32E in network byte order to demultiplex ToU from other application layer protocols.
Acknowledgement Number (32-bits):
    Same as a TCP acknowledgement number.
Data Offset (4-bits):
    The number of 32-bit words in the ToU header. Like a TCP header, the ToU header is an integral number of 32-bit words long.
Reserved (7-bits):
    Reserved for future use. Must be zero.
Control Bits (5-bits):
    From left to right. Unlike TCP, the Urgent and PSH bits are excluded.
        ACK: Acknowledgment field significant
        R: Reserved in ToU. In the TCP header, it is used for the PSH function.
        RST: Reset the connection
        SYN: Synchronize sequence numbers
        FIN: No more data from sender
Window (16-bits):
    Same as the window in the TCP header. The number of data octets, beginning with the one indicated in the acknowledgment field, which the sender of this segment is willing to accept.
Options:
    Same as TCP options.
Padding:
    Like TCP, the ToU header padding is used to ensure that the ToU header ends and data begins on a 32-bit boundary. The padding is composed of zeros.
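
To make the layout concrete, the following Python sketch serializes and parses the header with the standard struct module. It is one possible encoding under the field widths listed above, not a reference implementation; the names TouHeader, pack_header, and unpack_header are hypothetical, and options are treated as opaque bytes.

```python
import struct
from typing import NamedTuple, Tuple

MAGIC_COOKIE = 0x7194B32E
FIXED_HEADER_LEN = 16  # bytes, without options

# Control bits occupy the low 5 bits of the 16-bit offset/flags field.
ACK, RSV, RST, SYN, FIN = 0x10, 0x08, 0x04, 0x02, 0x01
FLAG_MASK = ACK | RST | SYN | FIN  # the reserved R bit is always sent as zero


class TouHeader(NamedTuple):
    seq: int
    ack: int
    flags: int
    window: int
    options: bytes = b""


def pack_header(h: TouHeader) -> bytes:
    """Serialize a ToU header, zero-padding options to a 32-bit boundary."""
    options = h.options + b"\x00" * (-len(h.options) % 4)
    data_offset = (FIXED_HEADER_LEN + len(options)) // 4
    if data_offset > 0xF:
        raise ValueError("options too long for a 4-bit data offset")
    # 4-bit data offset, 7 reserved zero bits, 5 control bits.
    offset_and_flags = (data_offset << 12) | (h.flags & FLAG_MASK)
    fixed = struct.pack("!IIIHH", h.seq, MAGIC_COOKIE, h.ack,
                        offset_and_flags, h.window)
    return fixed + options


def unpack_header(segment: bytes) -> Tuple[TouHeader, bytes]:
    """Parse a ToU segment into (header, payload)."""
    if len(segment) < FIXED_HEADER_LEN:
        raise ValueError("segment shorter than the fixed ToU header")
    seq, cookie, ack, offset_and_flags, window = struct.unpack_from(
        "!IIIHH", segment)
    if cookie != MAGIC_COOKIE:
        raise ValueError("not a ToU segment")
    header_len = (offset_and_flags >> 12) * 4
    if header_len < FIXED_HEADER_LEN or header_len > len(segment):
        raise ValueError("bad data offset")
    header = TouHeader(seq, ack, offset_and_flags & FLAG_MASK, window,
                       segment[FIXED_HEADER_LEN:header_len])
    return header, segment[header_len:]
```

A header without options packs to exactly 16 bytes, so a ToU segment adds 24 bytes of UDP and ToU header in front of the data.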




5.  ToU, TLS, and DTLS

The Transport Layer Security (TLS) [RFC5246] and Datagram Transport Layer Security (DTLS) [RFC4347] protocols provide privacy and data integrity between two communicating applications. TLS is layered on top of a reliable transport protocol such as TCP, whereas DTLS only assumes a datagram service. The question is what the layering relationship between ToU, TLS, and DTLS should be. Figure 2 shows four possible options. We think that Option-2 and Option-4 are not feasible since the ToU layer must be made aware of the size of the header which the DTLS and TLS protocols may add. Since ToU provides the same reliable and in-order delivery semantics as TCP, we prefer Option-1, in which TLS is layered on top of ToU, over Option-3.




   +-+-+-+-+   +-+-+-+-+   +-+-+-+-+   +-+-+-+-+
   |  TLS  |   |  ToU  |   |  DTLS |   |  ToU  |
   +-+-+-+-+   +-+-+-+-+   +-+-+-+-+   +-+-+-+-+
   |  ToU  |   |  TLS  |   |  ToU  |   |  DTLS |
   +-+-+-+-+   +-+-+-+-+   +-+-+-+-+   +-+-+-+-+
   |  UDP  |   |  UDP  |   |  UDP  |   |  UDP  |
   +-+-+-+-+   +-+-+-+-+   +-+-+-+-+   +-+-+-+-+
   Option-1    Option-2    Option-3    Option-4

Layering options for ToU, TLS, DTLS

 Figure 2 




6.  Implementation Guidelines

From the implementer's perspective, the use of ToU should be as modular as possible. One way to achieve this modularity is to implement ToU as a user-level library that provides socket-like function calls to applications. The library may have its own thread of execution and can be instantiated at the start of the program. The library implements the reliable, in-order, congestion control, and flow control semantics of TCP. Applications can interact with the ToU library through socket-like function calls.




7.  Design Alternatives

ToU is strictly meant for scenarios where end-points desire to establish a TCP connection but are unable to do so due to the presence of NATs and firewalls. Below, we briefly discuss the design alternatives.




7.1.  Simplified TCP

It may be argued that TCP semantics are too complicated and it might be easier to define a protocol that adds retransmission of individual UDP packets, ACK mechanisms, and a sequencing layer. However, unless one is content with stop-and-wait congestion control (and roughly modem data rates), it is necessary for a transport protocol to have AIMD or rate-based congestion control (TFRC). As discussed in Section 7.4 (TFRC), rate-based congestion control is not suitable for mid-sized transfers and is not any simpler than AIMD. Further, since hosts may have heterogeneous network connectivity, a transport protocol needs to provide flow control. Moreover, it may not be easy to validate a new transport protocol that provides only a subset of TCP semantics.




7.2.  TCP-like mechanism within an application layer protocol

In this approach, key TCP mechanisms such as reliability, congestion control, and flow control are designed as part of the application layer protocol. This approach has several disadvantages. First, every application layer protocol that is unable to establish TCP connections in the presence of NATs and firewalls but may use UDP will need to invent its own transport protocol providing reliability, congestion control, and flow control. Second, it is non-trivial to get the first implementations of a conceptually new protocol right. Third, any new transport protocol, even if it is specified within an application layer protocol, must undergo a large validation effort. Finally, most long-term successful protocols are those that provide modular functionality, and not extremely narrowly tailored protocols.




7.3.  Tunneling

Another design option is to provide a VPN-like tunneling option for sending and receiving TCP packets over UDP. This is conceivable as follows. An application uses the regular TCP socket calls which make use of the TCP stack. Just before the transmission of the packet, a module or a virtual ethernet driver intercepts the packet, and sends the TCP packet along with its payload over UDP. Similarly, when a packet is received over UDP, the virtual ethernet driver checks if it is an encapsulated TCP packet, and if yes, passes it to the appropriate kernel level TCP handler.

This approach is not desirable for several reasons. First, it creates a dependency on a kernel-level module or a virtual ethernet driver that must capture TCP packets before transmission and immediately upon reception. Kernel-level modules or virtual ethernet drivers require root access to a machine. Peer-to-peer applications, which are user space applications, are expected to be the main users of ToU. It is unrealistic to create a dependency between these user space applications and a kernel-level module. Second, sending a full-sized TCP segment over UDP may cause fragmentation. Lastly, other UDP-based protocols such as STUN may need to be run on the same port as the tunneling port, which can complicate the disambiguation of these protocols from the tunneled TCP.




7.4.  TFRC

TFRC [RFC5348] is a congestion control mechanism (not a protocol) that is designed for long-lived media streams. Its main benefit is smoothing the rates of these media streams. It does not provide any packet formats, reliability, or flow control. Its congestion control mechanism is not suited for exchanging data objects that range from a few dozen to a few hundred packets. The reason is that TFRC is based on estimating loss rates within 8 loss intervals. With a loss rate of 1%, this translates, very roughly, into 800 packets or roughly 800 kB, before a reliable estimate of a better (higher) rate is computed. Further, its main benefit, smoothing rates, is of no importance to applications desiring to replicate TCP functionality over UDP.
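
The rough figure of 800 packets follows from treating each loss interval as spanning about 1/p packets at loss rate p. The short Python calculation below reproduces it. The function names are hypothetical, and the packet size of 1000 bytes is our assumption, chosen because it matches the 800 kB figure above.

```python
ASSUMED_PACKET_BYTES = 1000  # assumption: about 1 kB per packet


def packets_for_estimate(loss_rate: float, loss_intervals: int = 8) -> float:
    """Approximate number of packets spanned by the given loss intervals."""
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in (0, 1]")
    # On average, one loss interval spans about 1/loss_rate packets.
    return loss_intervals / loss_rate


def kilobytes_for_estimate(loss_rate: float) -> float:
    """Approximate data volume, in kB, for the same number of packets."""
    return packets_for_estimate(loss_rate) * ASSUMED_PACKET_BYTES / 1000
```

A data object of a few hundred packets would therefore finish before the loss history is fully populated.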




7.5.  SCTP

SCTP [RFC4960] is significantly more complicated than TCP in its implementation and its performance is generally the same, except in circumstances involving head-of-line blocking. Further, SCTP will have trouble getting traction in the consumer and enterprise Internet space unless it (also) runs over UDP, as there seem to be few NATs that know how to handle SCTP and thus it is effectively unusable by a fair fraction of the Internet user population.




8.  Acknowledgements

The draft incorporates comments from the discussion on the P2PSIP mailing list.




9.  IANA Considerations

TBD.




10.  Security Considerations

ToU is subject to the same security considerations as TCP.




11.  References




11.1. Normative References

[I-D.ietf-tcpm-rfc2581bis] Allman, M., Paxson, V., and E. Blanton, “TCP Congestion Control,” draft-ietf-tcpm-rfc2581bis-07 (work in progress), July 2009.
[I-D.ietf-tcpm-tcp-auth-opt] Touch, J., Mankin, A., and R. Bonica, “The TCP Authentication Option,” draft-ietf-tcpm-tcp-auth-opt-11 (work in progress), March 2010.
[RFC0793] Postel, J., “Transmission Control Protocol,” STD 7, RFC 793, September 1981.
[RFC1122] Braden, R., “Requirements for Internet Hosts - Communication Layers,” STD 3, RFC 1122, October 1989.
[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, “TCP Selective Acknowledgment Options,” RFC 2018, October 1996.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[RFC3390] Allman, M., Floyd, S., and C. Partridge, “Increasing TCP's Initial Window,” RFC 3390, October 2002.
[RFC4347] Rescorla, E. and N. Modadugu, “Datagram Transport Layer Security,” RFC 4347, April 2006.
[RFC4960] Stewart, R., “Stream Control Transmission Protocol,” RFC 4960, September 2007.
[RFC5246] Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” RFC 5246, August 2008.
[RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, “TCP Friendly Rate Control (TFRC): Protocol Specification,” RFC 5348, September 2008.
[RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, “Session Traversal Utilities for NAT (STUN),” RFC 5389, October 2008.



11.2. Informative References

[Daytona] Pradhan, P., Kandula, S., Xu, W., Sheikh, A., and E. Nahum, “Daytona : A User-Level TCP Stack,” 2004.
[I-D.ietf-mmusic-ice] Rosenberg, J., “Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols,” draft-ietf-mmusic-ice-19 (work in progress), October 2007.
[MINET] Dinda, P., “The Minet TCP/IP Stack,” 2002.
[atou] Dunigan, T. and F. Fowler, “A TCP-over-UDP Test Harness,” 2002.



Authors' Addresses

  Salman A. Baset
  Columbia University
  1214 Amsterdam Avenue
  New York, NY
  USA
Email:  salman@cs.columbia.edu
  
  Henning Schulzrinne
  Columbia University
  1214 Amsterdam Avenue
  New York, NY
  USA
Email:  hgs@cs.columbia.edu