<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes"?>
<?rfc strict="no" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="yes" ?>
<?rfc toc="yes" ?>

<rfc ipr="full3978" docName="draft-ietf-behave-tcp-08.txt" category="bcp">
    <front>
        <title abbrev="NAT TCP Requirements">NAT Behavioral
            Requirements for TCP</title>
        <author initials ='S' surname ='Guha' fullname='Saikat Guha' role='editor'>
            <organization abbrev="Cornell U.">Cornell University</organization>
            <address>
                <postal>
                    <street>331 Upson Hall</street>
                    <city>Ithaca</city>
                    <region>NY</region>
                    <code>14853</code>
                    <country>US</country>
                </postal>
                <phone>+1 607 255 1008</phone>
                <email>saikat@cs.cornell.edu</email>
            </address>
        </author>
        <author initials ='K' surname ='Biswas' fullname='Kaushik Biswas'>
            <organization abbrev="Cisco Systems">Cisco Systems, Inc.</organization>
            <address>
                <postal>
                    <street>170 West Tasman Dr.</street>
                    <city>San Jose</city>
                    <region>CA</region>
                    <code>95134</code>
                    <country>US</country>
                </postal>
                <phone>+1 408 525 5134</phone>
                <email>kbiswas@cisco.com</email>
            </address>
        </author>
        <author initials ='B' surname ='Ford' fullname='Bryan Ford'>
            <organization>M.I.T.</organization>
            <address>
                <postal>
                    <street>Laboratory for Computer Science</street>
                    <street>77 Massachusetts Ave.</street>
                    <city>Cambridge</city>
                    <region>MA</region>
                    <code>02139</code>
                    <country>US</country>
                </postal>
                <phone>+1 617 253 5261</phone>
                <email>baford@mit.edu</email>
            </address>
        </author>
        <!--
        <author initials ='P' surname ='Francis' fullname='Paul Francis'>
            <organization abbrev="Cornell U.">Cornell University</organization>
            <address>
                <postal>
                    <street>4108 Upson Hall</street>
                    <city>Ithaca</city>
                    <region>NY</region>
                    <code>14853</code>
                    <country>US</country>
                </postal>
                <phone>+1 607 255 9223</phone>
                <email>francis@cs.cornell.edu</email>
            </address>
        </author>
        -->
        <author initials ='S' surname ='Sivakumar' fullname='Senthil Sivakumar'>
            <organization abbrev="Cisco Systems">Cisco Systems, Inc.</organization>
            <address>
                <postal>
                    <street>7100-8 Kit Creek Road</street>
                    <street>PO Box 14987</street>
                    <city>Research Triangle Park</city>
                    <region>NC</region>
                    <code>27709-4987</code>
                    <country>US</country>
                </postal>
                <phone>+1 919 392 5158</phone>
                <email>ssenthil@cisco.com</email>
            </address>
        </author>
        <author initials ='P' surname ='Srisuresh' fullname='Pyda Srisuresh'>
            <organization>Consultant</organization>
            <address>
                <postal>
                    <street>20072 Pacifica Dr.</street>
                    <city>Cupertino</city>
                    <region>CA</region>
                    <code>95014</code>
                    <country>US</country>
                </postal>
                <phone>+1 408 836 4773</phone>
                <email>srisuresh@yahoo.com</email>
            </address>
        </author>
        <date month="September" year="2008"/>
        <abstract>

            <t>This document defines a set of requirements for NATs that handle
                TCP that would allow many applications, such as peer-to-peer
                applications and on-line games, to work consistently.
                Developing NATs that meet this set of requirements will greatly
                increase the likelihood that these applications will function
                properly.</t>

        </abstract>
    </front>

    <middle>

        <section title="Applicability Statement">

            <t>This document is adjunct to <xref target='BEHAVE-UDP'/>, which
                defines many terms relating to NATs, lays out general
                requirements for all NATs, and sets requirements for NATs that
                handle IP and unicast UDP traffic. The purpose of this document
                is to set requirements for NATs that handle TCP traffic.</t>

            <t>The requirements of this specification apply to Traditional NATs
                as described in <xref target='RFC2663'/>.</t>

            <t>This document only covers the TCP aspects of NAT traversal.
                Middle-box behavior that is not necessary for network address
                translation of TCP is out of scope. Firewalls and packet
                inspection above the TCP layer are out of scope, except for
                Application Level Gateway (ALG) behavior that may interfere
                with NAT traversal. Application and OS aspects of TCP NAT
                traversal are out of scope. Signaling-based approaches to NAT
                traversal, such as MIDCOM and UPnP, that directly control the
                NAT are out of scope. Finally, TCP connections intended for
                the NAT (e.g., an HTTP or SSH management interface) and TCP
                connections initiated by the NAT (e.g., a reliable syslog
                client) are out of scope.</t>

        </section>

        <section title="Introduction">

            <t>Network Address Translators (NATs) hinder connectivity in
                applications where sessions may be initiated to internal hosts.
                Readers may refer to <xref target="RFC3022"/> for detailed
                information on traditional NATs.  <xref target='BEHAVE-UDP'/>
                lays out the terminology and requirements for NATs in the
                context of IP and UDP. This document supplements these by
                setting requirements for NATs that handle TCP traffic. All
                definitions and requirements in <xref target='BEHAVE-UDP'/> are
                inherited here.</t>

            <t><xref target="RFC4614"/> chronicles the evolution of TCP
                from the original definition <xref target="RFC0793"/> to
                present-day implementations. While much has changed in TCP with
                regard to congestion control, flow control, security, and
                support for high-bandwidth networks, the process of initiating
                a connection (i.e., the 3-way handshake or simultaneous-open)
                has changed little. It is the process of connection initiation
                that NATs affect the most. Experimental approaches such as
                T/TCP <xref target="RFC1644"/> have proposed alternative
                connection initiation approaches, but these have been found to
                be complex and susceptible to denial-of-service attacks. Modern
                operating systems and NATs consequently primarily support the
                3-way handshake and simultaneous-open modes of connection
                initiation as described in <xref target="RFC0793"/>.</t>

            <t>Recently, many techniques have been devised to make peer-to-peer
                TCP applications work across NATs. <xref target='STUNT'/>,
                <xref target='NATBLASTER'/>, and <xref target='P2PNAT'/>
                describe Unilateral Self-Address Fixing (UNSAF) mechanisms
                that allow peer-to-peer applications to establish TCP
                connections through NATs. These approaches require only
                endpoint applications to be modified and work with
                standards-compliant OS stacks. The approaches, however, depend
                on specific NAT behavior that is usually, but not always,
                supported by NATs (see <xref target='TCPTRAV'/> and <xref
                    target='P2PNAT'/> for details). Consequently, a complete
                TCP NAT traversal solution is sometimes forced to rely on
                public TCP relays to traverse NATs that do not cooperate. This
                document defines requirements that ensure that TCP NAT
                traversal approaches are not forced to use data relays.</t>

        </section>

        <section title="Terminology">

            <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
                NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
                "OPTIONAL" in this document are to be interpreted as
                described in <xref target='RFC2119'/>.</t>

            <t>This document uses the term "NAT session" as defined in
                <xref target='RFC2663'/>. "NAT" in this specification includes
                both "Basic NAT" and "Network Address/Port Translator (NAPT)"
                <xref target='RFC2663'/>.</t>

            <t>This document uses the term "TCP connection" (or just
                "connection") to refer to individual TCP flows identified by
                the 4-tuple (source and destination IP address and TCP port)
                and the initial sequence numbers (ISN).</t>

            <t>This document uses the term "address and port mapping" (or just
                "mapping") as defined in <xref target='BEHAVE-UDP'/> to refer
                to state at the NAT necessary for network address and port
                translation of TCP connections. This document also uses the
                terms "endpoint independent mapping", "address dependent
                mapping", "address and port dependent mapping", "filtering
                behavior", "endpoint independent filtering", "address dependent
                filtering", "address and port dependent filtering", "port
                assignment", "port overloading", "hairpinning", and "external
                source IP address and port" as defined in <xref
                    target='BEHAVE-UDP'/>. </t>

        </section>

        <section title='TCP Connection Initiation'>

            <t>This section describes various NAT behaviors applicable to TCP
                connection initiation.</t>

            <section title='Address and Port Mapping Behavior'>

                <t>A NAT uses a mapping to translate packets for each TCP
                    connection. A mapping is dynamically allocated for
                    connections initiated from the internal side, and
                    potentially reused for certain subsequent connections. NAT
                    behavior regarding when a mapping can be reused differs for
                    different NATs as described in <xref
                        target='BEHAVE-UDP'/>.</t>

                <t>Consider an internal IP address and TCP port (X:x) that
                    initiates a TCP connection to an external (Y1:y1) tuple.
                    Let the mapping allocated by the NAT for this connection be
                    (X1':x1'). Shortly thereafter, the endpoint initiates a
                    connection from the same (X:x) to an external address
                    (Y2:y2) and gets the mapping (X2':x2') on the NAT. As per
                    <xref target='BEHAVE-UDP'/>, if (X1':x1') equals (X2':x2')
                    for all values of (Y2:y2) then the NAT is defined to have
                    "Endpoint Independent Mapping" behavior. If (X1':x1')
                    equals (X2':x2') only when Y2 equals Y1 then the NAT is
                    defined to have "Address Dependent Mapping" behavior. If
                    (X1':x1') equals (X2':x2') only when (Y2:y2) equals
                    (Y1:y1), possible only for consecutive connections to the
                    same external address shortly after the first is terminated
                    and if the NAT retains state for connections in TIME_WAIT
                    state, then the NAT is defined to have "Address and Port
                    Dependent Mapping" behavior. This document introduces one
                    additional behavior where (X1':x1') never equals (X2':x2'),
                    that is, for each connection a new mapping is allocated; in
                    such a case, the NAT is defined to have "Connection
                    Dependent Mapping" behavior. </t>
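                <t>The four mapping behaviors above differ only in which
                    parts of a connection determine whether an existing
                    mapping is reused. The following non-normative sketch
                    illustrates this with a toy model. The class name ToyNat,
                    the method map_outbound, the external address 192.0.2.1,
                    and the starting port 40000 are all hypothetical, and the
                    model deliberately ignores mapping timeouts and TIME_WAIT
                    handling.</t>

                <figure><artwork><![CDATA[
```python
from enum import Enum


class MappingBehavior(Enum):
    ENDPOINT_INDEPENDENT = 1
    ADDRESS_DEPENDENT = 2
    ADDRESS_AND_PORT_DEPENDENT = 3
    CONNECTION_DEPENDENT = 4


class ToyNat:
    """Illustrative model only; all names and numbers are assumptions."""

    def __init__(self, behavior, external_ip="192.0.2.1", first_port=40000):
        self.behavior = behavior
        self.external_ip = external_ip
        self._next_port = first_port
        self._connections = 0
        self._table = {}

    def map_outbound(self, internal, remote):
        """Return the mapping (X':x') for a connection internal -> remote.

        internal is (X, x); remote is (Y, y).
        """
        if self.behavior is MappingBehavior.ENDPOINT_INDEPENDENT:
            key = (internal,)
        elif self.behavior is MappingBehavior.ADDRESS_DEPENDENT:
            key = (internal, remote[0])
        elif self.behavior is MappingBehavior.ADDRESS_AND_PORT_DEPENDENT:
            key = (internal, remote)
        else:
            # Connection dependent: every connection gets a fresh key,
            # so no mapping is ever reused.
            self._connections += 1
            key = (internal, remote, self._connections)
        if key not in self._table:
            self._table[key] = (self.external_ip, self._next_port)
            self._next_port += 1
        return self._table[key]
```
]]></artwork></figure>

                <t>In this sketch, only the endpoint independent model
                    returns the same mapping for connections from (X:x) to two
                    different external addresses.</t>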


                <t><list style="hanging">

                        <t hangText="REQ-1:">A NAT MUST have an "Endpoint
                            Independent Mapping" behavior for TCP.</t>

                        <t/>

                        <t hangText="Justification:">REQ-1 is necessary for
                            UNSAF methods to work. Endpoint independent mapping
                            behavior allows peer-to-peer applications to learn
                            and advertise the external IP address and port
                            allocated to an internal endpoint such that
                            external peers can contact it (subject to the NAT's
                            security policy). The security policy of a NAT is
                            independent of its mapping behavior and is
                            discussed later in <xref target="sec-ef"/>. Having
                            endpoint independent mapping behavior allows
                            peer-to-peer applications to work consistently
                            without compromising the security benefits of the
                            NAT.</t>

                </list></t>

            </section>

            <section title='Internally Initiated Connections'>

                <t>An internal endpoint initiates a TCP connection through a
                    NAT by sending a SYN packet. The NAT allocates (or reuses)
                    a mapping for the connection, as described in the previous
                    section. The mapping defines the external IP address and
                    port used for translation of all packets for that
                    connection. In particular, for client-server applications
                    where an internal client initiates the connection to an
                    external server, the mapping is used to translate the
                    outbound SYN, the resulting inbound SYN-ACK response, the
                    subsequent outbound ACK, and other packets for the
                    connection. This method of connection initiation
                    corresponds to the 3-way handshake (defined in <xref
                        target='RFC0793'/>) and is supported by all NATs.</t>

                <t>Peer-to-peer applications use an alternate method of
                    connection initiation termed simultaneous-open (Fig. 8,
                    <xref target='RFC0793'/>) to traverse NATs. In the
                    Simultaneous-Open mode of operation, both peers send SYN
                    packets for the same TCP connection. The SYN packets cross
                    in the network. Upon receiving the other end's SYN packet,
                    each end responds with a SYN-ACK packet; the SYN-ACKs also
                    cross in the network. The connection is considered established once
                    the SYN-ACKs are received. From the perspective of the NAT,
                    the internal host's SYN packet is met by an inbound SYN
                    packet for the same connection (as opposed to a SYN-ACK
                    packet during a 3-way handshake). Subsequent to this
                    exchange, both an outbound and an inbound SYN-ACK are seen
                    for the connection. Some NATs erroneously block the inbound
                    SYN for the connection in progress. Some NATs block or
                    incorrectly translate the outbound SYN-ACK. Such behavior
                    breaks TCP simultaneous-open and prevents peer-to-peer
                    applications from functioning correctly behind a NAT.</t>

                <t>In order to provide network address translation service for
                    TCP, it is necessary for a NAT to correctly receive,
                    translate, and forward all packets for a connection that
                    conform to valid transitions of the TCP State-Machine (Fig.
                    6, <xref target='RFC0793'/>).</t>

            <t><list style="hanging">

                    <t hangText="REQ-2:">A NAT MUST support all valid sequences
                        of TCP packets (defined in <xref target='RFC0793'/>)
                        for connections initiated both internally and
                        externally when the connection is permitted by the NAT.
                        In particular:

                        <list style="hanging">

                            <t hangText="a)">In addition to handling the TCP
                                3-way handshake mode of connection initiation,
                                a NAT MUST handle the TCP simultaneous-open
                                mode of connection initiation.</t>

                        </list></t>

                    <t/>

                    <t hangText="Justification:">The intent of this requirement
                        is to allow standards-compliant TCP stacks to traverse
                        NATs no matter what path the stacks take through the
                        TCP state-machine and no matter which end initiates the
                        connection as long as the connection is permitted by
                        the filtering policy of the NAT (filtering policy is
                        described in the following section).

                        <list style="hanging">

                            <t hangText="a)">In addition to TCP packets for a
                                3-way handshake, a NAT must be prepared to
                                accept an inbound SYN and an outbound SYN-ACK
                                for an internally initiated connection in order
                                to support simultaneous-open.</t>

                        </list>

                    </t>

            </list></t>


            </section>

            <section anchor="sec-ef" title='Externally Initiated Connections'>

                <t>The NAT allocates a mapping for the first connection
                    initiated by an internal endpoint to an external endpoint.
                    In some scenarios, the NAT's policy may allow this mapping
                    to be reused for connections initiated from the external
                    side to the internal endpoint. Consider as before an
                    internal IP address and port (X:x) that is assigned (or
                    reuses) a mapping (X1':x1') when it initiates a connection
                    to an external (Y1:y1). An external endpoint (Y2:y2)
                    attempts to initiate a connection with the internal
                    endpoint by sending a SYN to (X1':x1'). A NAT can choose to
                    either allow the connection to be established, or to
                    disallow the connection. If the NAT chooses to allow the
                    connection, it translates the inbound SYN and routes it to
                    (X:x) as per the existing mapping. It also translates the
                    SYN-ACK generated by (X:x) in response and routes it to
                    (Y2:y2) and so on. Alternatively, the NAT can disallow the
                    connection by filtering the inbound SYN.</t>

                <t>A NAT may allow an existing mapping to be reused by an
                    externally initiated connection if its security policy
                    permits. Several different policies are possible as
                    described in <xref target='BEHAVE-UDP'/>. If a NAT allows
                    the connection initiation from all (Y2:y2) then it is
                    defined to have "Endpoint Independent Filtering" behavior.
                    If the NAT allows connection initiations only when Y2
                    equals Y1 then the NAT is defined to have "Address
                    Dependent Filtering" behavior. If the NAT allows connection
                    initiations only when (Y2:y2) equals (Y1:y1), then the NAT
                    is defined to have "Address and Port Dependent Filtering"
                    behavior (possible only shortly after the first connection
                    has been terminated but the mapping is still active). One
                    additional filtering behavior defined in this document is
                    when the NAT does not allow any connection initiations from
                    the external side; in such cases, the NAT is defined to
                    have "Connection Dependent Filtering" behavior. The
                    difference between "Address and Port Dependent Filtering"
                    and "Connection Dependent Filtering" behavior is that the
                    former permits an inbound SYN during the TIME_WAIT state of
                    the first connection to initiate a new connection while the
                    latter does not.</t>
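                <t>The filtering behaviors above can be summarized as a
                    single decision on an inbound SYN addressed to an existing
                    mapping. The following non-normative sketch assumes
                    exactly one prior outbound connection to (Y1:y1) and does
                    not model the timing condition noted for "Address and Port
                    Dependent Filtering". The function name allows_inbound_syn
                    and the enumeration are hypothetical.</t>

                <figure><artwork><![CDATA[
```python
from enum import Enum


class FilteringBehavior(Enum):
    ENDPOINT_INDEPENDENT = 1
    ADDRESS_DEPENDENT = 2
    ADDRESS_AND_PORT_DEPENDENT = 3
    CONNECTION_DEPENDENT = 4


def allows_inbound_syn(behavior, first_remote, new_remote):
    """Decide whether a SYN from new_remote (Y2, y2) may reuse the mapping
    created by an earlier outbound connection to first_remote (Y1, y1)."""
    if behavior is FilteringBehavior.ENDPOINT_INDEPENDENT:
        return True
    if behavior is FilteringBehavior.ADDRESS_DEPENDENT:
        # Only the address has to match; the port may differ.
        return new_remote[0] == first_remote[0]
    if behavior is FilteringBehavior.ADDRESS_AND_PORT_DEPENDENT:
        return new_remote == first_remote
    if behavior is FilteringBehavior.CONNECTION_DEPENDENT:
        # No externally initiated connection is ever allowed.
        return False
    raise ValueError("unknown filtering behavior")
```
]]></artwork></figure>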

                <t><list style="hanging">

                        <t hangText="REQ-3:">
                            If application transparency is most important, it
                            is RECOMMENDED that a NAT have an "Endpoint
                            independent filtering" behavior for TCP. If a more
                            stringent filtering behavior is most important, it
                            is RECOMMENDED that a NAT have an "Address
                            dependent filtering" behavior.
                            <list style="hanging">

                                <t hangText="a)">The filtering behavior MAY be
                                    an option configurable by the administrator
                                    of the NAT.</t>

                                <t hangText="b)">The filtering behavior for TCP
                                    MAY be independent of the filtering
                                    behavior for UDP.</t>

                        </list></t>

                        <t/>

                        <t hangText="Justification:">The intent of this
                            requirement is to allow peer-to-peer applications
                            that do not always initiate connections from the
                            internal side of the NAT to continue to work in the
                            presence of NATs. This behavior also allows
                            applications behind a BEHAVE compliant NAT to
                            inter-operate with remote endpoints that are behind
                            non-BEHAVE compliant (legacy) NATs. If the remote
                            endpoint's NAT does not have endpoint independent
                            mapping behavior but has only one external IP
                            address, then an application can still traverse the
                            combination of the two NATs if the local NAT has
                            address dependent filtering. <xref
                                target="sec-security"/> contains a detailed
                            discussion on the security implications of this
                            requirement. </t>

                </list></t>

                <t>If the inbound SYN packet is filtered, either because a
                    corresponding mapping does not exist or because of the
                    NAT's filtering behavior, a NAT has two basic choices: to
                    ignore the packet silently, or to signal an error to the
                    sender. Signaling an error through ICMP messages allows the
                    sender to quickly detect that the SYN did not reach the
                    intended destination. Silently dropping the packet, on the
                    other hand, allows applications to perform
                    Simultaneous-Open more reliably.</t>

                <t>Silently dropping the SYN aids Simultaneous-Open as follows.
                    Consider that the application is attempting a
                    Simultaneous-Open and the outbound SYN from the internal
                    endpoint has not yet crossed the NAT (due to network
                    congestion or clock skew between the two endpoints); this
                    outbound SYN would otherwise have created the necessary
                    mapping at the NAT to allow translation of the inbound SYN.
                    Since the outbound SYN did not reach the NAT in time, the
                    inbound SYN cannot be processed. If a NAT responds to the
                    premature inbound SYN with an error message that forces the
                    external endpoint to abandon the connection attempt, it
                    hinders applications performing a TCP simultaneous-open. If
                    instead the NAT silently ignores the inbound SYN, the
                    external endpoint retransmits the SYN after a TCP timeout.
                    In the meantime, the NAT creates the mapping in response to
                    the (delayed) outbound SYN such that the retransmitted
                    inbound SYN can be routed and simultaneous-open can
                    succeed. The down-side to this behavior is that in the
                    event the inbound SYN is erroneous, the remote side does
                    not learn of the error until after several TCP timeouts.
                </t>

                <t>NAT support for simultaneous-open and quick signaling of
                    errors are both important for applications.
                    Unfortunately, there is no way for a NAT to signal an error
                    without forcing the endpoint to abort a potential
                    simultaneous-open: TCP RST and ICMP Port Unreachable
                    packets require the endpoint to abort the attempt while
                    ICMP Host and Network Unreachable errors may adversely
                    affect other connections to the same host or network <xref
                        target="RFC1122"/>.</t>

                <t>In addition, when an unsolicited SYN is received by the NAT,
                    the NAT may not know whether the application is attempting a
                    simultaneous-open (and that it should therefore silently
                    drop the SYN) or whether the SYN is in error (and that it
                    should notify the sender).</t>

                <t><list style="hanging">

                        <t hangText="REQ-4:">A NAT MUST NOT respond to an
                            unsolicited inbound SYN packet for at least 6 seconds
                            after the packet is received. If during this interval
                            the NAT receives and translates an outbound SYN for the
                            connection the NAT MUST silently drop the original
                            unsolicited inbound SYN packet. Otherwise the NAT
                            SHOULD send an ICMP Port Unreachable error (Type 3,
                            Code 3) for the original SYN, unless REQ-4a applies.
                            <list style="hanging">

                                <t hangText="a)">The NAT MUST silently drop the
                                    original SYN packet if sending a response
                                    violates the security policy of the NAT. </t>

                        </list></t>

                        <t/>

                        <t hangText="Justification:"> The intent of this
                            requirement is to allow simultaneous-open to work
                            reliably in the presence of NATs as well as to
                            quickly signal an error in case the unsolicited SYN
                            is in error. At the time of writing, it is not
                            possible to achieve both; the requirement therefore
                            represents a compromise. The NAT should tolerate
                            some delay in the outbound SYN for a TCP
                            simultaneous-open, which may be due to network
                            congestion or loose synchronization between the
                            endpoints. If the unsolicited SYN is not part of a
                            simultaneous-open attempt and is in error, the NAT
                            should endeavor to signal the error in accordance
                            with <xref target="RFC1122"/>.

                            <list style="hanging">

                                <t hangText="a)">There may, however, be reasons
                                    for the NAT to rate-limit or omit such
                                    error notifications, for example in the
                                    case of an attack. Silently dropping the
                                    SYN packet when under attack allows
                                    simultaneous-open to work without consuming
                                    any extra network bandwidth or revealing
                                    the presence of the NAT to attackers. <xref
                                        target="sec-security"/> mentions the
                                    security considerations for this
                                    requirement. </t>

                        </list></t>

                </list></t>
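                <t>One possible way to structure the hold-then-respond logic
                    of REQ-4 is sketched below. This is a non-normative
                    illustration, not a prescribed implementation. The class
                    and method names are hypothetical, time is passed in
                    explicitly as seconds, and the connection key is assumed
                    to identify the same connection for the inbound and the
                    outbound SYN. Keeping the arrival time of the first copy
                    of a retransmitted SYN is a design choice of the sketch,
                    not a requirement.</t>

                <figure><artwork><![CDATA[
```python
HOLD_SECONDS = 6.0  # minimum hold time taken from REQ-4


class UnsolicitedSynHandler:
    """Tracks unsolicited inbound SYNs until they are dropped or answered."""

    def __init__(self, response_allowed=True):
        # response_allowed=False models REQ-4a: the security policy
        # forbids sending any error response.
        self.response_allowed = response_allowed
        self._pending = {}  # connection key -> arrival time of first SYN

    def on_unsolicited_inbound_syn(self, conn, now):
        """Hold the SYN without responding; keep the earliest arrival."""
        self._pending.setdefault(conn, now)

    def on_outbound_syn(self, conn):
        """An outbound SYN for the same connection was translated.

        Returns True if a held inbound SYN was silently dropped.
        """
        return self._pending.pop(conn, None) is not None

    def poll(self, now):
        """Return connections for which an ICMP Port Unreachable
        (Type 3, Code 3) may now be sent."""
        due = [c for c, t in self._pending.items()
               if now - t >= HOLD_SECONDS]
        for c in due:
            del self._pending[c]
        return due if self.response_allowed else []
```
]]></artwork></figure>

                <t>In this sketch, a caller would invoke poll periodically
                    and emit the ICMP error for each returned connection,
                    subject to any rate limiting the NAT applies.</t>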

                <t>For NATs that combine NAT functionality with end-host
                    functionality (e.g. an end-host that also serves as a NAT
                    for other hosts behind it), REQ-4 above applies only to
                    SYNs intended for the NAT'ed hosts and not to SYNs intended
                    for the NAT itself. One way to determine whether the
                    inbound SYN is intended for a NAT'ed host is to allocate
                    NAT mappings from one port range, and allocate ports for
                    local endpoints from a different non-overlapping port
                    range. More dynamic implementations can be imagined.</t>
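                <t>The static port-range split described above could look
                    like the following non-normative sketch. The two ranges
                    and the function name are hypothetical examples, not
                    recommended values.</t>

                <figure><artwork><![CDATA[
```python
# Hypothetical, non-overlapping port ranges chosen for illustration.
NAT_MAPPING_PORTS = range(32768, 49152)    # used only for NAT mappings
LOCAL_ENDPOINT_PORTS = range(49152, 65536)  # used only by the NAT's own stack


def syn_is_for_natted_host(dst_port):
    """Return True if an inbound SYN to dst_port targets a NAT'ed host,
    in which case REQ-4 applies; otherwise it targets the NAT itself."""
    return dst_port in NAT_MAPPING_PORTS
```
]]></artwork></figure>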

            </section>


        </section>

        <section title='NAT Session Refresh'>

            <t>A NAT maintains state associated with in-progress and
                established connections. Because of this, a NAT is susceptible
                to a resource-exhaustion attack whereby an attacker (or virus)
                on the internal side attempts to cause the NAT to create more
                state than it has resources for. To prevent such an attack, a
                NAT needs to abandon sessions in order to free the state
                resources.</t>

            <t>A common method that is applicable only to TCP is to
                preferentially abandon sessions for crashed endpoints, followed
                by closed TCP connections and partially-open connections. A NAT
                can check if an endpoint for a session has crashed by sending a
                TCP keep-alive packet and receiving a TCP RST packet in
                response. If the NAT cannot determine whether the endpoint is
                active, it should not abandon the session until the TCP
                connection has been idle for some time. Note that an
                established TCP connection can stay idle (but live)
                indefinitely; hence there is no fixed value for an idle-timeout
                that accommodates all applications. However, a large
                idle-timeout motivated by recommendations in <xref
                    target='RFC1122'/> can reduce the chances of abandoning a
                live session.</t>

            <t>A TCP connection passes through three phases: partially-open,
                established, and closing. During the partially-open phase,
                endpoints synchronize initial sequence numbers. The phase is
                initiated by the first SYN for the connection and extends until
                both endpoints have sent a packet with the ACK flag set (TCP
                states: SYN_SENT and SYN_RCVD). ACKs in both directions mark
                the beginning of the established phase where application data
                can be exchanged indefinitely (TCP states: ESTABLISHED,
                FIN_WAIT_1, FIN_WAIT_2, and CLOSE_WAIT). The closing phase
                begins when both endpoints have terminated their half of the
                connection by sending a FIN packet. Once FIN packets are seen
                in both directions, application data can no longer be exchanged
                but the stacks still need to ensure that the FIN packets are
                received (TCP states: CLOSING and LAST_ACK).</t>
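
            <t>The phase boundaries above can be tracked from the flags a NAT
                observes in each direction. The following is a minimal sketch
                in Python under that assumption; the names SessionFlags and
                phase are hypothetical and do not come from this document,
                and the TIME_WAIT state is not modeled.</t>

            <figure><artwork><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class SessionFlags:
    """Flags a NAT has observed so far, per direction."""
    ack_from_internal: bool = False
    ack_from_external: bool = False
    fin_from_internal: bool = False
    fin_from_external: bool = False


def phase(flags: SessionFlags) -> str:
    """Classify a session into one of the three phases."""
    # FINs seen in both directions: closing phase.
    if flags.fin_from_internal and flags.fin_from_external:
        return "closing"
    # ACKs seen in both directions: established phase.
    if flags.ack_from_internal and flags.ack_from_external:
        return "established"
    # Otherwise the handshake has not completed.
    return "partially-open"
```
]]></artwork></figure>

            <t>Note that a single FIN leaves the session in the established
                phase, matching the FIN_WAIT and CLOSE_WAIT states listed
                above.</t>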

            <t>TCP connections can stay in the established phase indefinitely
                without exchanging any packets. Some end-hosts can be
                configured to send keep-alive packets on such idle connections;
                by default, such keep-alive packets are sent every 2 hours if
                enabled <xref target="RFC1122"/>. Consequently, a NAT that
                waits for slightly over 2 hours can detect idle connections with
                keep-alive packets being sent at the default rate. TCP
                connections in the partially-open or closing phases, on the
                other hand, can stay idle for at most 4 minutes while waiting
                for in-flight packets to be delivered <xref target="RFC1122"/>.
            </t>

            <t>The "established connection idle-timeout" for a NAT is defined
                as the minimum time a TCP connection in the established phase
                must remain idle before the NAT considers the associated
                session a candidate for removal. The "transitory connection
                idle-timeout" for a NAT is defined as the minimum time a TCP
                connection in the partially-open or closing phases must remain
                idle before the NAT considers the associated session a
                candidate for removal. TCP connections in the TIME_WAIT state
                are not affected by the "transitory connection
                idle-timeout".</t>

            <t><list style="hanging">

                    <t hangText="REQ-5:">If a NAT cannot determine whether the
                        endpoints of a TCP connection are active, it MAY
                        abandon the session if it has been idle for some time.
                        In such cases, the value of the "established connection
                        idle-timeout" MUST NOT be less than 2 hours 4 minutes.
                        The value of the "transitory connection idle-timeout"
                        MUST NOT be less than 4 minutes.

                        <list style="hanging">

                            <t hangText="a)">The value of the NAT idle-timeouts
                                MAY be configurable.</t>

                    </list></t>

                    <t/>

                    <t hangText="Justification:">The intent of this requirement
                        is to minimize the cases where a NAT abandons session
                        state for a live connection. While some NATs may choose
                        to abandon sessions reactively in response to new
                        connection initiations (allowing idle connections to
                        stay up indefinitely in the absence of new
                        initiations), other NATs may choose to proactively reap
                        idle sessions. In cases where the NAT cannot actively
                        determine if the connection is alive, this requirement
                        ensures that applications can send keep-alive packets
                        at the default rate (every 2 hours) such that the NAT
                        can passively determine that the connection is alive.
                        The additional 4 minutes allows time for in-flight
                        packets to cross the NAT. </t>

            </list></t>
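
            <t>The timeout rules in REQ-5 can be illustrated with a short
                Python sketch. It is an illustration only, assuming the NAT
                already knows the phase of a session and how long it has been
                idle; the function and constant names are hypothetical. Only
                the two minimum values come from REQ-5.</t>

            <figure><artwork><![CDATA[
```python
# Minimum idle-timeouts from REQ-5, in seconds.
MIN_ESTABLISHED_IDLE = 2 * 60 * 60 + 4 * 60   # 2 hours 4 minutes
MIN_TRANSITORY_IDLE = 4 * 60                  # 4 minutes


def idle_timeout(phase: str,
                 established: int = MIN_ESTABLISHED_IDLE,
                 transitory: int = MIN_TRANSITORY_IDLE) -> int:
    """Return the idle-timeout that applies to a phase.

    Configured values below the REQ-5 minimums are rejected.
    """
    if established < MIN_ESTABLISHED_IDLE:
        raise ValueError("established idle-timeout too small")
    if transitory < MIN_TRANSITORY_IDLE:
        raise ValueError("transitory idle-timeout too small")
    if phase == "established":
        return established
    if phase in ("partially-open", "closing"):
        return transitory
    raise ValueError("unknown phase: " + phase)


def is_removal_candidate(phase: str, idle_seconds: int,
                         **timeouts: int) -> bool:
    """True once the session has been idle for the full timeout."""
    return idle_seconds >= idle_timeout(phase, **timeouts)
```
]]></artwork></figure>

            <t>A session that becomes a candidate is not necessarily removed;
                a NAT that reaps sessions reactively may keep it until
                resources are needed.</t>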

            <t>NAT behavior for handling RST packets or connections in the
                TIME_WAIT state is left unspecified. A NAT MAY hold state for a
                connection in TIME_WAIT state to accommodate retransmissions of
                the last ACK. However, since the TIME_WAIT state is commonly
                encountered by internal endpoints properly closing the TCP
                connection, holding state for a closed connection may limit the
                throughput of connections through a NAT with limited resources.
                <xref target="RFC1337"/> describes hazards associated
                with TIME_WAIT assassination. </t>

            <t>The handling of non-SYN packets for connections for which there
                is no active mapping is left unspecified. Such packets may be
                received if the NAT silently abandons a live connection, or
                abandons a connection in TIME_WAIT state before the 4 minute
                TIME_WAIT period expires. The decision to either silently drop
                such packets or to respond with a TCP RST packet is left up to
                the implementation.</t>

            <t>NAT behavior for notifying endpoints when abandoning live
                connections is left unspecified. When a NAT abandons a live
                connection, for example due to a timeout expiring, the NAT MAY
                either send TCP RST packets to the endpoints or MAY silently
                abandon the connection. Sending a RST notification allows
                endpoint applications to recover more quickly; however,
                notifying the endpoints may not always be possible if, for
                example, session state is lost due to a power failure.</t>

        </section>

        <section title="Application Level Gateways">

            <t>Application Level Gateways (ALGs) in certain NATs modify IP
                addresses and TCP ports embedded inside application protocols.
                Such ALGs may interfere with UNSAF methods or protocols that
                try to be NAT-aware and must therefore be used with extreme
                caution.</t>

            <t><list style="hanging">

                    <t hangText="REQ-6:">If a NAT includes ALGs that affect
                        TCP, it is RECOMMENDED that all of those ALGs (except
                        for FTP <xref target="RFC0959"/>) be disabled by
                        default. </t>

                <t/>

                <t hangText="Justification:">The intent of this requirement is
                    to prevent ALGs from interfering with UNSAF methods. The
                    default state of an FTP ALG is left unspecified because of
                    legacy concerns: as of writing this memo, a large fraction
                    of legacy FTP clients do not enable passive (PASV) mode by default
                    and require an ALG to traverse NATs.</t>

            </list></t>

        </section>

        <section title="Other Requirements Applicable to TCP">

            <t>A list of general and UDP-specific NAT behavioral requirements
                is given in <xref target="BEHAVE-UDP"/>. A list of
                ICMP-specific NAT behavioral requirements is given in <xref
                    target="BEHAVE-ICMP"/>. The requirements listed below
                reiterate the requirements from these two documents that
                directly affect TCP. The following requirements do not relax
                any requirements in <xref target="BEHAVE-UDP"/> or <xref
                    target="BEHAVE-ICMP"/>. </t>

            <section title="Port Assignment">

                <t>NATs that allow different internal endpoints to
                    simultaneously use the same mapping are defined in <xref
                        target="BEHAVE-UDP"/> to have a "Port assignment"
                    behavior of "Port overloading". Such behavior is
                    undesirable as it prevents two internal endpoints sharing
                    the same mapping from establishing simultaneous connections
                    to a common external endpoint.</t>

                <t><list style="hanging">

                        <t hangText="REQ-7:">A NAT MUST NOT have a "Port
                            assignment" behavior of "Port overloading" for TCP.
                        </t>

                        <t/>

                        <t hangText="Justification:">This requirement allows
                            two applications on the internal side of the NAT to
                            consistently communicate with the same destination.
                        </t>

                </list></t>
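
                <t>One way to avoid port overloading is to give every internal
                    endpoint its own external port and to reuse that port for
                    all of the endpoint's connections. The Python sketch below
                    illustrates the idea; the class MappingTable and its
                    method are hypothetical, and a single external IP address
                    is assumed.</t>

                <figure><artwork><![CDATA[
```python
class MappingTable:
    """Assigns each internal endpoint its own external port."""

    def __init__(self, external_ports):
        self._free = list(external_ports)
        self._external_of = {}   # internal (addr, port) -> ext. port
        self._internal_of = {}   # ext. port -> internal (addr, port)

    def external_port(self, internal):
        """Return the external port mapped to an internal endpoint."""
        if internal in self._external_of:
            # Reuse the existing mapping for this endpoint.
            return self._external_of[internal]
        if not self._free:
            # Never share a port that is already mapped.
            raise RuntimeError("no free external ports")
        port = self._free.pop(0)
        self._external_of[internal] = port
        self._internal_of[port] = internal
        return port
```
]]></artwork></figure>

                <t>When the pool is exhausted, the sketch refuses the new
                    mapping rather than overloading a port that is in
                    use.</t>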

                <t>NAT behavior for preserving the source TCP port range for
                    connections is left unspecified. Some applications expect
                    the source TCP port to be in the well-known range (TCP
                    ports from 0 to 1023).  The "r" series of commands (rsh,
                    rcp, rlogin, etc.) are an example. NATs that preserve the
                    range from which the source port is picked allow such
                    applications to function properly through the NAT. However,
                    by doing so the NAT may compromise the security of the
                    application in certain situations; applications that depend
                    only on the IP address and source TCP port range for
                    security (the "r" commands for example) cannot distinguish
                    between an attacker and a legitimate user behind the same
                    NAT. </t>

            </section>

            <section title="Hairpinning Behavior">

                <t>NATs that forward packets originating from an internal
                    address, destined for an external address that matches the
                    active mapping for an internal address, back to that
                    internal address are defined in <xref target="BEHAVE-UDP"/>
                    as supporting "hairpinning". If the NAT presents the
                    hairpinned packet with an external source IP address and
                    port (i.e. the mapped source address and port of the
                    originating internal endpoint), then it is defined to have
                    "External source IP address and port" for hairpinning.
                    Hairpinning is necessary to allow two internal endpoints
                    (known to each other only by their external mapped
                    addresses) to communicate with each other. "External source
                    IP address and port" behavior for hairpinning avoids
                    confusing implementations that expect the external source
                    address and port.</t>


                <t><list style="hanging">

                        <t hangText="REQ-8:"> A NAT MUST support
                            "Hairpinning" for TCP.

                            <list style="hanging"> <t hangText="a)"> A NAT's
                                    Hairpinning behavior MUST be of type
                                    "External source IP address and port".</t>

                        </list></t>

                        <t/>

                        <t hangText="Justification:">This requirement allows
                            two applications behind the same NAT to
                            communicate with each other using their
                            external addresses.

                            <list style="hanging"> <t hangText="a)">Using the
                                    external source address and port for the
                                    hairpinned packet is necessary for
                                    applications that do not expect to receive
                                    a packet from a different address than the
                                    external address they are trying to
                                    communicate with.</t>

                        </list></t>

                </list></t>
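
                <t>The address rewriting required by REQ-8 can be sketched in
                    Python as follows. This is a simplified illustration that
                    assumes both internal endpoints already have active
                    mappings; the function name and the two lookup tables are
                    hypothetical, and the addresses in the usage note are
                    documentation examples.</t>

                <figure><artwork><![CDATA[
```python
def hairpin(src, dst, external_of, internal_of):
    """Translate a packet sent from an internal endpoint to a
    mapped external address.

    src, dst     -- (address, port) pairs from the packet
    external_of  -- internal endpoint -> its external mapping
    internal_of  -- external mapping -> internal endpoint

    Returns the (source, destination) to deliver on the internal
    side, or None if dst is not an active mapping.
    """
    if dst not in internal_of:
        return None                  # not a hairpin case
    new_src = external_of[src]       # external source IP and port
    new_dst = internal_of[dst]       # endpoint behind the mapping
    return new_src, new_dst
```
]]></artwork></figure>

                <t>For example, if 10.0.0.1:5000 is mapped to 192.0.2.1:40000
                    and 10.0.0.2:6000 is mapped to 192.0.2.1:40001, a packet
                    from 10.0.0.1:5000 to 192.0.2.1:40001 is delivered to
                    10.0.0.2:6000 with source 192.0.2.1:40000.</t>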

            </section>

            <section title="ICMP Responses to TCP Packets">

                <t>Several TCP mechanisms depend on the reception of ICMP error
                    messages triggered by the transmission of TCP segments. One
                    such mechanism is path MTU discovery <xref
                        target='RFC1191'/>, which is required for the correct
                    operation of TCP. The current path MTU discovery mechanism
                    requires the sender of TCP segments to be notified of ICMP
                    "Datagram Too Big" responses.</t>

                <t><list style="hanging">

                        <t hangText="REQ-9:">If a NAT translates TCP, it SHOULD
                            translate ICMP Destination Unreachable (Type 3)
                            messages.</t>

                        <t/>

                        <t hangText="Justification:">Translating ICMP
                            Destination Unreachable messages, particularly the
                            "Fragmentation Needed and Don't Fragment was Set"
                            (Type 3, Code 4) message, avoids communication
                            failures ("black holes" <xref target="RFC2923"/>).
                            Furthermore, TCP's connection establishment and
                            maintenance mechanisms also behave much more
                            efficiently when ICMP Destination Unreachable
                            messages arrive in response to outgoing TCP
                            segments.</t>

                        <t/>

                        <t hangText="REQ-10:"> Receipt of any sort of ICMP
                            message MUST NOT terminate the NAT mapping or TCP
                            connection for which the ICMP was generated. </t>

                        <t/>

                        <t hangText="Justification:">This is necessary for
                            reliably performing TCP simultaneous-open where a
                            remote NAT may temporarily signal an ICMP
                            error.</t>

                </list></t>
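
                <t>REQ-9 and REQ-10 together can be illustrated with a small
                    Python sketch. It assumes the NAT has already parsed the
                    TCP header quoted inside the ICMP error; the function name
                    and the lookup table are hypothetical, and ICMP types
                    other than Destination Unreachable are deliberately left
                    out of scope.</t>

                <figure><artwork><![CDATA[
```python
ICMP_DEST_UNREACHABLE = 3


def handle_inbound_icmp(icmp_type, embedded_src, internal_of):
    """Decide where to forward an inbound ICMP error.

    embedded_src -- source (address, port) of the TCP segment
                    quoted in the ICMP payload, i.e. the NAT's
                    external mapping
    internal_of  -- external mapping -> internal endpoint

    Returns the internal endpoint to forward to, or None.
    The mapping table is only read, never modified (REQ-10).
    """
    internal = internal_of.get(embedded_src)
    if internal is None:
        return None              # no mapping: nothing to translate
    if icmp_type == ICMP_DEST_UNREACHABLE:
        return internal          # REQ-9: translate and forward
    return None                  # other types: outside this sketch
```
]]></artwork></figure>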

            </section>

        </section>

        <section title="Requirements">


            <t>A NAT that supports all of the mandatory requirements of this
                specification (i.e., the "MUST") and is compliant with <xref
                    target="BEHAVE-UDP"/>, is "compliant with this
                specification".  A NAT that supports all of the requirements of
                this specification (i.e., including the "RECOMMENDED") and is
                fully compliant with <xref target="BEHAVE-UDP"/> is "fully
                compliant with all the mandatory and recommended requirements
                of this specification".</t>

            <t><list style="hanging">

                    <t hangText="REQ-1:">A NAT MUST have an "Endpoint
                        Independent Mapping" behavior for TCP.</t>

                    <t hangText="REQ-2:">A NAT MUST support all valid sequences
                        of TCP packets (defined in <xref target='RFC0793'/>)
                        for connections initiated both internally as well as
                        externally when the connection is permitted by the NAT.
                        In particular:

                        <list style="hanging">

                            <t hangText="a)">In addition to handling the TCP
                                3-way handshake mode of connection initiation,
                                a NAT MUST handle the TCP simultaneous-open
                                mode of connection initiation.</t>

                        </list></t>

                    <t hangText="REQ-3:">
                        If application transparency is most important, it
                        is RECOMMENDED that a NAT have an "Endpoint
                        independent filtering" behavior for TCP. If a more
                        stringent filtering behavior is most important, it
                        is RECOMMENDED that a NAT have an "Address
                        dependent filtering" behavior.
                        <list style="hanging">

                            <t hangText="a)">The filtering behavior MAY be
                                an option configurable by the administrator
                                of the NAT.</t>

                            <t hangText="b)">The filtering behavior for TCP
                                MAY be independent of the filtering
                                behavior for UDP.</t>

                    </list></t>

                    <t hangText="REQ-4:">A NAT MUST NOT respond to an
                        unsolicited inbound SYN packet for at least 6 seconds
                        after the packet is received. If during this interval
                        the NAT receives and translates an outbound SYN for the
                        connection the NAT MUST silently drop the original
                        unsolicited inbound SYN packet. Otherwise the NAT
                        SHOULD send an ICMP Port Unreachable error (Type 3,
                        Code 3) for the original SYN, unless REQ-4a applies.
                        <list style="hanging">

                            <t hangText="a)">The NAT MUST silently drop the
                                original SYN packet if sending a response
                                violates the security policy of the NAT. </t>

                    </list></t>

                    <t hangText="REQ-5:">If a NAT cannot determine whether the
                        endpoints of a TCP connection are active, it MAY
                        abandon the session if it has been idle for some time.
                        In such cases, the value of the "established connection
                        idle-timeout" MUST NOT be less than 2 hours 4 minutes.
                        The value of the "transitory connection idle-timeout"
                        MUST NOT be less than 4 minutes.

                        <list style="hanging">

                            <t hangText="a)">The value of the NAT idle-timeouts
                                MAY be configurable.</t>

                    </list></t>


                    <t hangText="REQ-6:">If a NAT includes ALGs that affect
                        TCP, it is RECOMMENDED that all of those ALGs (except
                        for FTP <xref target="RFC0959"/>) be disabled by
                        default. </t>

            </list></t>

            <t hangText="">The following requirements reiterate requirements
                from <xref target="BEHAVE-UDP"/> or <xref
                    target="BEHAVE-ICMP"/> that directly affect TCP. This
                document does not relax any requirements in <xref
                    target="BEHAVE-UDP"/> or <xref target="BEHAVE-ICMP"/>.</t>

            <t><list style="hanging">
                    <t hangText="REQ-7:">A NAT MUST NOT have a "Port assignment"
                        behavior of "Port overloading" for TCP. </t>

                    <t hangText="REQ-8:"> A NAT MUST support
                        "Hairpinning" for TCP.

                        <list style="hanging"> <t hangText="a)"> A NAT's
                                Hairpinning behavior MUST be of type
                                    "External source IP address and port".</t>

                    </list></t>

                    <t hangText="REQ-9:">If a NAT translates TCP, it
                        SHOULD translate ICMP Destination Unreachable
                        (Type 3) messages.</t>

                    <t hangText="REQ-10:"> Receipt of any sort of ICMP message
                        MUST NOT terminate the NAT mapping or TCP connection
                        for which the ICMP was generated. </t>

            </list></t>

        </section>

        <section anchor="sec-security" title="Security considerations">

            <t><xref target='BEHAVE-UDP'/> discusses security considerations
                for NATs that handle IP and unicast UDP traffic. Security
                concerns specific to handling TCP packets are discussed in this
                section.</t>

            <t><list style="hanging">

                    <t hangText="Security considerations for REQ-1:">This
                        requirement does not introduce any TCP-specific
                        security concerns.</t>

                    <t/>

                    <t hangText="Security considerations for REQ-2:">This
                        requirement does not introduce any TCP-specific
                        security concerns. Simultaneous-open and other
                        transitions in the TCP state machine are by design and
                        necessary for TCP to work correctly in all scenarios.
                        Further, this requirement only affects connections
                        already in progress as authorized by the NAT in
                        accordance with its policy.</t>

                    <t/>

                    <t hangText="Security considerations for REQ-3:">The
                        security provided by the NAT is governed by its
                        filtering behavior as addressed in <xref
                            target='BEHAVE-UDP'/>. Connection dependent
                        filtering behavior is most secure from a firewall
                        perspective, but severely restricts connection
                        initiations through a NAT. Endpoint independent
                        filtering behavior, which is most transparent to
                        applications, requires an attacker to guess the IP
                        address and port of an active mapping in order to get
                        a packet to an internal host. Address dependent
                        filtering, on the other hand, is less transparent than
                        endpoint independent filtering but more transparent
                        than connection dependent filtering; it is more secure
                        than endpoint independent filtering as it requires an
                        attacker to additionally guess the address of the
                        external endpoint for a NAT session associated with the
                        mapping and be able to receive packets addressed to the
                        same. While this protects against most attackers on the
                        Internet, it does not necessarily protect against
                        attacks that originate from behind a remote NAT with a
                        single IP address that is also translating a legitimate
                        connection to the victim.</t>

                    <t/>

                    <t hangText="Security considerations for REQ-4:">This
                        document recommends that a NAT respond to unsolicited
                        inbound SYN packets with an ICMP error delayed by a few
                        seconds. Doing so may reveal the presence of a NAT to
                        an external attacker. Silently dropping the SYN makes
                        it harder to diagnose network problems and forces
                        applications to wait for the TCP stack to finish
                        several retransmissions before reporting an error. An
                        implementer must therefore understand and carefully
                        weigh the effects of not sending an ICMP error or
                        rate-limiting such ICMP errors to a very small number.
                    </t>

                    <t/>

                    <t hangText="Security considerations for REQ-5:">This
                        document recommends that a NAT that passively monitors
                        TCP state keep idle sessions alive for at least 2 hours
                        4 minutes or 4 minutes depending on the state of the
                        connection. If a NAT is under attack, it may attempt to
                        actively determine the liveness of a TCP connection or
                        let the NAT administrator configure more
                        conservative timeouts.</t>

                    <t/>

                    <t hangText="Security considerations for REQ-6:">This
                        requirement does not introduce any TCP-specific
                        security concerns.</t>

                    <t/>

                    <t hangText="Security considerations for REQ-7:">This
                        requirement does not introduce any TCP-specific
                        security concerns.</t>

                    <t/>

                    <t hangText="Security considerations for REQ-8:">This
                        requirement does not introduce any TCP-specific
                        security concerns.</t>

                    <t/>

                    <t hangText="Security considerations for REQ-9:">This
                        requirement does not introduce any TCP-specific
                        security concerns.</t>

                    <t/>

                    <t hangText="Security considerations for REQ-10:">This
                        requirement does not introduce any TCP-specific
                        security concerns.</t>

            </list></t>

            <t>NAT implementations that modify TCP sequence numbers (e.g., for
                privacy reasons or for ALG support) must ensure that TCP
                packets with SACK notifications <xref target="RFC2018"/> are
                properly handled.</t>
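
            <t>As an illustration of the SACK concern, the Python sketch below
                shifts the edges of every SACK block in a TCP options field by
                a signed offset. It assumes the NAT applies a constant offset
                to sequence numbers in one direction; the function name is
                hypothetical, and recomputing the TCP checksum is left to the
                caller.</t>

            <figure><artwork><![CDATA[
```python
import struct

TCP_OPT_EOL = 0    # end of option list
TCP_OPT_NOP = 1    # no-operation (single byte)
TCP_OPT_SACK = 5   # SACK option (RFC 2018)


def shift_sack_blocks(options: bytes, delta: int) -> bytes:
    """Return a copy of the TCP options with all SACK block
    edges shifted by delta (modulo 2**32)."""
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == TCP_OPT_EOL:
            break
        if kind == TCP_OPT_NOP:
            i += 1
            continue
        if i + 1 >= len(out):
            break                      # truncated option
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break                      # malformed option
        if kind == TCP_OPT_SACK:
            # Each block is two 32-bit edges after kind and length.
            for off in range(i + 2, i + length - 3, 4):
                (edge,) = struct.unpack_from("!I", out, off)
                edge = (edge + delta) & 0xFFFFFFFF
                struct.pack_into("!I", out, off, edge)
        i += length
    return bytes(out)
```
]]></artwork></figure>

            <t>The same offset must be applied to the acknowledgment number of
                the segment carrying the option, so that the cumulative ACK
                and the SACK blocks stay consistent.</t>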

            <t>NAT implementations that modify local state based on TCP flags
                in packets must ensure that out-of-window TCP packets are
                properly handled. <xref target="RFC4953"/> summarizes and
                discusses a variety of solutions designed to prevent attackers
                from affecting TCP connections. </t>

        </section>

        <section title="IANA considerations">

            <t>This document does not change or create any IANA-registered
                values.</t>

        </section>

        <section title="Acknowledgments">

            <t>Joe Touch contributed the mechanism for handling unsolicited
                inbound SYNs. Thanks to Mark Allman, François Audet, Lars
                Eggert, Paul Francis, Fernando Gont, Sam Hartman, Paul Hoffman,
                Dave Hudson, Cullen Jennings, Philip Matthews, Tom Petch,
                Magnus Westerlund, and Dan Wing for their many contributions,
                comments and suggestions.</t>

        </section>

    </middle>

    <back>

        <references title="Normative References">
            <reference anchor="BEHAVE-UDP">
                <front>
                    <title>Network Address Translation (NAT) Behavioral Requirements for Unicast UDP</title>
                    <author initials='F.' surname='Audet' fullname='F. Audet'>
                        <organization /></author>
                    <author initials='C.' surname='Jennings' fullname='C. Jennings'>
                        <organization /></author>
                    <date year='2007' month='January' />
                    <abstract>
                        <t>This document defines basic terminology for describing different types of Network Address Translation (NAT) behavior when handling Unicast UDP and also defines a set of requirements that would allow many applications, such as multimedia communications or online gaming, to work consistently. Developing NATs that meet this set of requirements will greatly increase the likelihood that these applications will function properly. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t></abstract></front>

                <seriesInfo name='BCP' value='127' />

                <seriesInfo name='RFC' value='4787' />
                <format type='TXT' octets='68693' target='ftp://ftp.isi.edu/in-notes/rfc4787.txt' />
            </reference>
            <?rfc include='reference.RFC.0793.xml'?>
            <?rfc include='reference.RFC.0959.xml'?>
            <?rfc include='reference.RFC.1122.xml'?>
            <?rfc include='reference.RFC.2119.xml'?>
            <?rfc include='reference.RFC.1191.xml'?>


        </references>

        <references title="Informative References">

            <reference anchor="BEHAVE-ICMP">
                <front>
                    <title>NAT Behavioral Requirements for ICMP protocol</title>
                    <author fullname="Pyda Srisuresh" initials="P." surname="Srisuresh"/>
                    <author fullname="Bryan Ford" initials="B." surname="Ford"/>
                    <author fullname="Senthil Sivakumar" initials="S." surname="Sivakumar"/>
                    <author fullname="Saikat Guha" initials="S." surname="Guha"/>
                </front>
                <seriesInfo name="Internet-Draft" value="draft-ietf-behave-nat-icmp" />
            </reference>
            <reference anchor="STUNT">
                <front>
                    <title>NUTSS: A SIP based approach to UDP and TCP connectivity</title>
                    <author fullname="Saikat Guha" initials="S." surname="Guha"/>
                    <author fullname="Paul Francis" initials="P." surname="Francis"/>
                    <date month="August" year="2004"/>
                </front>
                <seriesInfo name="Proceedings of the ACM SIGCOMM Workshop on Future Directions in Network Architecture" value="(Portland, OR)" />
            </reference>
            <reference anchor="NATBLASTER">
                <front>
                    <title>NATBLASTER: Establishing TCP connections between hosts behind NATs</title>
                    <author fullname="Andrew Biggadike" initials="A." surname="Biggadike"/>
                    <author fullname="Daniel Ferullo" initials="D." surname="Ferullo"/>
                    <author fullname="Geoffrey Wilson" initials="G." surname="Wilson"/>
                    <author fullname="Adrian Perrig" initials="A." surname="Perrig"/>
                    <date month="April" year="2005"/>
                </front>
                <seriesInfo name="Proceedings of the ACM SIGCOMM Asia Workshop" value="(Beijing, China)" />
            </reference>
            <reference anchor="P2PNAT">
                <front>
                    <title>Peer-to-peer communication across network address translators</title>
                    <author fullname="Bryan Ford" initials="B." surname="Ford"/>
                    <author fullname="Pyda Srisuresh" initials="P." surname="Srisuresh"/>
                    <author fullname="Dan Kegel" initials="D." surname="Kegel"/>
                    <date month="April" year="2005"/>
                </front>
                <seriesInfo name="Proceedings of the USENIX Annual Technical Conference" value="(Anaheim, CA)" />
            </reference>
            <reference anchor="TCPTRAV">
                <front>
                    <title>Characterization and Measurement of TCP Traversal through NATs and Firewalls</title>
                    <author fullname="Saikat Guha" initials="S." surname="Guha"/>
                    <author fullname="Paul Francis" initials="P." surname="Francis"/>
                    <date month="October" year="2005"/>
                </front>
                <seriesInfo name="Proceedings of the Internet Measurement Conference" value="(Berkeley, CA)" />
            </reference>
            <?rfc include='reference.RFC.4614.xml'?>
            <?rfc include='reference.RFC.1644.xml'?>
            <?rfc include='reference.RFC.1337.xml'?>
            <?rfc include='reference.RFC.2663.xml'?>
            <?rfc include='reference.RFC.3022.xml'?>
            <?rfc include='reference.RFC.2923.xml'?>
            <?rfc include='reference.RFC.2018.xml'?>
            <?rfc include='reference.RFC.4953.xml'?>

        </references>

    </back>

</rfc>
