INTERNET DRAFT                                         Yogesh Prem Swami
File: draft-swami-tcp-lmdr-02.txt                               Khiem Le
Expires: August 09, 2004                           Nokia Research Center
                                                                  Dallas
                                                          March 08, 2004


           Lightweight Mobility Detection and Response (LMDR)
                           Algorithm for TCP


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of [RFC2026].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Abstract

     TCP congestion control is based on the assumption that the
     end-to-end path of a connection does not change--or at most
     changes infrequently--once the connection is established. However,
     when a user moves from one subnet to another, this assumption
     breaks down. After a subnet change, a TCP sender that relies only
     on the rate of arrival of ACKs for congestion control may
     inadvertently add to network congestion (or suffer reduced
     throughput). Worse, a TCP sender may be entirely unaware of such
     user mobility and therefore unable to take any remedial action to
     prevent packet loss. In this document we describe a network layer
     independent mechanism by which a TCP receiver can propagate its
     subnet change information to its peer, so that the sender can take
     appropriate action.


Expires: August 09, 2004                                        [Page 1]

draft-swami-tcp-lmdr-02.txt                               March 08, 2004


1. Introduction

     TCP congestion control [RFC2581] is based on the assumption that
     the end-to-end path of a TCP connection does not change--or at
     most changes infrequently--once the connection is established.
     Based on this assumption, TCP increases its data rate (probes the
     network) whenever it receives positive feedback in the form of
     ACKs. However, once the "constant path" assumption no longer
     holds, the TCP sender cannot safely continue at the old data rate,
     since the old and new paths may have different capacities and
     levels of congestion.

     When a TCP sender or receiver changes its point of attachment to
     the Internet (henceforth referred to as "changes subnets"), the
     entire end-to-end path between the sender and receiver can change.
     In these cases, the rate at which ACKs are received reflects only
     the congestion state of the old path. Therefore, relying on the
     rate of arrival of ACKs as the only criterion for congestion
     control can lead to periods of congestion that cannot be
     alleviated using existing algorithms. To summarize:

     After a subnet change, the following problems can occur:

     a) the TCP sender MAY add to congestion and continuously
        lose packets in those subnets where there is an influx of
        connections from other subnets, OR

     b) if the packets sent to the old subnet are all lost due to
        the subnet change (typically the case with Mobile-IPv4), the
        TCP sender may have to wait until the RTO expires before it
        can start its loss recovery algorithm, OR

     c) the TCP sender MAY spend a long time trying to reach a
        reasonable throughput on the new path if the congestion and
        network capacity (measured in terms of bandwidth-delay
        product) on the two paths are substantially different. This is
        a direct consequence of having an SS_THRESH set to a value that
        does not reflect the appropriate SS_THRESH for the new path.

     In this document, we describe a network layer independent mechanism
     by which a TCP receiver can propagate its subnet change information
     to its peer. We assume that a mobile host always knows about its
     own subnet change (for example, by looking at its neighbor cache,
     destination cache, default router, or a combination of these
     [RFC2461]), but currently may have no means of informing its peer
     of that change.

     Please note that some network layer mobility management techniques,
     such as Mobile-IPv6 [JPA03] with route optimization, may be used to
     indirectly derive a peer's mobility information (for example, a TCP
     sender can look into its binding cache to derive its peer's
     mobility information), but these schemes do not work in cases such
     as Mobile-IPv6 with reverse tunnelling, Mobile-IPv4 [RFC3344], or
     other types of networks such as traditional cellular networks. Once
     a TCP sender has mobility information about itself or its peer, it
     can use the congestion response described in Section-5 to adjust
     its data rate.

     The rest of this document is organized as follows: Section-2
     defines the terminology used in this document. Section-3 describes
     the issue of congestion in more detail. Section-4 gives the details
     of the subnet change detection algorithm, and Section-5 contains
     the associated congestion response algorithm. Section-6 describes
     certain corner cases.

2. Terminology

     The key words "MUST," "MUST NOT," "REQUIRED," "SHALL," "SHALL NOT,"
     "SHOULD," "SHOULD NOT," "RECOMMENDED," "MAY," "OPTIONAL," and
     "silently ignore" in this document are to be interpreted as
     described in [RFC2119].

     Mobile Node (MN):
          A host (not a router) capable of changing its point of
          attachment to the Internet without breaking transport layer
          connectivity. Hosts that change their point of attachment to
          the Internet but use DHCP or another mechanism to get a new IP
          address are not considered mobile.

     Old Subnet:
          MN's point of attachment (subnet prefix) to the Internet prior
          to movement.

     New Subnet:
          MN's point of attachment after movement.

     Stale ACK:
          An ACK generated in response to data originally sent to the
          old subnet (note that some routers might transparently tunnel
          these packets to the new subnet, but even then the ACKs are
          still considered stale).

     INIT_WINDOW:
          The initial congestion window size at the start of connection
          as described in [RFC3390].


3. Congestion Issues with Subnet Change

     For concreteness, the description below assumes network mobility
     based on Mobile IP, but the same concepts are readily applicable to
     other types of networks.

     To illustrate the problem, consider Figure-1. At time=T, the MN is
     reachable on Subnet-1 through AR-1 and has the care-of address
     <Subnet-1, MN>. While MN is "attached" to AR-1, packet exchange
     between TCP-Sender and <Subnet-1, MN> takes place using PATH-1.
     Let's assume that after some period of time, at T+1, MN moves
     (hands over) to Subnet-2 and is reachable through AR-2 with the
     care-of address <Subnet-2, MN>. While MN is attached to AR-2, all
     packets exchanged between TCP-Sender and <Subnet-2, MN> traverse
     Internet Cloud-2 (which may or may not overlap with Cloud-1) and
     use PATH-2.


                          <---------PATH-1---------->

                            /---------\   +---------+
                            |         |   |         | Subnet-1
                        +---+ Cloud-1 +---+  AR-1   +-->>>>>MN
                        |   |         |   |         |  (Time=T)
       +------------+   |   \----++---/   +---------+
       |            |   |        ||            |
       | TCP Sender +---+        ^V PATH-3    ^V^ PATH-4
       |            |   |        ||            |
       +------------+   |   /----++---\   +----+----+
                        |   |         |   |         | Subnet-2
                        +---+ Cloud-2 +---+  AR-2   +-->>>>>MN
                            |         |   |         |  (Time=T+1)
                            \---------/   +---------+

                           <--------PATH-2----------->

          Figure-1: Paths between the TCP sender and the MN before
                    and after the subnet change


     During the transient period when MN moves from Subnet-1 to
     Subnet-2, AR-1 may (or may not) buffer and forward packets destined
     to and from <Subnet-2, MN> through PATH-3 or through PATH-4 [K03].

     We make the distinction between PATH-3 and PATH-4 to emphasize the
     fact that PATH-4 may belong to a well provisioned network that has
     dynamic equilibrium for mobile users. Such networks are designed to
     accommodate extremely bursty traffic. PATH-3, on the other hand,
     may consist of arbitrary routers without proper provisioning.

     Let's assume that a TCP connection was progressing between MN and
     TCP Sender when the user moves from Subnet-1 to Subnet-2. We now
     analyze the problem of congestion on different paths shown above.

3.1 Congestion On PATH-1

     Congestion on PATH-1 is governed by basic slow-start and congestion
     avoidance mechanisms [RFC2581]. As long as MN is on Subnet-1,
     standard congestion control is sufficient. But once it moves from
     Subnet-1 to Subnet-2, two different events can take place:

     1. All packets destined to Subnet-1 are dropped by AR-1.
        In this case, after MN moves to Subnet-2, the TCP sender will
        most likely time out, since establishing the tunnel to the new
        access router will typically take longer than the period during
        which ACKs can trigger new data (in other words, the new data
        triggered by ACKs in flight will still have their tunnel end
        point set to AR-1 because of the latency involved in
        establishing the new tunnel). After the timeout, the TCP sender
        will start with a congestion window of 1, which will hopefully
        traverse the new path PATH-3. In this case there is no need for
        extra congestion control.

        The disadvantages of dropping all packets destined to Subnet-1,
        however, are:

        a) The sender will wait for one complete RTO before it can
           start loss recovery.

        b) If the MN moves faster than one subnet per RTO on average,
           the TCP receiver will take a relatively long time to
           recover such packets (theoretically, it will never be able to
           recover, but in practice this is not true due to the
           randomness of motion).

        c) The sender will reduce its SS_THRESH to half the number of
           packets in flight. Since there is no correlation between
           the BDP and packet loss on PATH-1, the throughput of the
           connection will suffer if the SS_THRESH on the new path is
           set to a small value (for example, if the connection moves
           to the new path right after connection setup, the SS_THRESH
           will be set to 2*MSS).

     2. All packets (or all packets arriving at AR-1 during some
        period of time) destined to <Subnet-1, MN> are forwarded to
        <Subnet-2, MN> ([K03] describes the details of how this can be
        done). In this case, AR-1 can forward packets to <Subnet-2, MN>
        using PATH-3 or PATH-4. We consider these two paths separately.
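
     As a rough illustration of disadvantage (c) under case 1 above,
     the sketch below applies the [RFC2581] rule
     ssthresh = max(FlightSize/2, 2*SMSS) at two different handover
     moments. The 1460-byte MSS and both flight sizes are assumed
     values chosen only for illustration; they are not measurements.

```python
MSS = 1460  # bytes; assumed segment size, for illustration only


def ssthresh_after_timeout(flight_size: int, mss: int = MSS) -> int:
    """RFC 2581 equation (3): ssthresh = max(FlightSize / 2, 2 * SMSS)."""
    return max(flight_size // 2, 2 * mss)


# Handover right after connection setup: only two segments in flight.
early = ssthresh_after_timeout(flight_size=2 * MSS)

# Handover later in the connection: forty segments in flight.
late = ssthresh_after_timeout(flight_size=40 * MSS)

# Express both thresholds in segments.
print(early // MSS, late // MSS)
```

     In the early case the threshold sits at the 2*MSS floor, so the
     connection would leave slow start almost immediately on the new
     path regardless of that path's capacity.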


3.2 Congestion On PATH-3

     If AR-1 starts forwarding packets to AR-2 using PATH-3, PATH-3 will
     experience a sudden burst of data. In addition, if multiple MNs
     move between AR-2 and AR-1, PATH-3 MAY get congested. But while
     sending packets on PATH-3 is bad for other connections, dropping
     them is bad for the connections that change subnets (Section-3.1).

3.3 Congestion On PATH-4

     In many cases, it's reasonable to assume that wireless service
     providers will have a well provisioned network that can accommodate
     highly bursty traffic. Such networks may have a dynamic equilibrium
     where the average transit traffic from AR-1 to AR-2 is the same as
     the transit traffic from AR-2 to AR-1. Such well provisioned paths
     are, however, not possible Internet-wide, since different mobile
     users will typically be connected to different hosts.

3.4 Congestion On PATH-2

     Since the MN is able to receive packets even after moving away from
     AR-1, it will continue to generate ACKs in an orderly fashion.
     These ACKs will traverse PATH-3 or PATH-4 and finally reach the
     TCP sender. But the segments sent by the TCP sender in response to
     these ACKs will travel on PATH-2 (assuming the TCP sender has
     received the binding update to send data on the new path).
     Unfortunately, the TCP sender has no congestion information about
     PATH-2, and using the old congestion window may cause network
     congestion on PATH-2. This problem becomes worse as the number of
     mobile users or the rate of subnet change increases in the system.

     To summarize, after a subnet change, if the old access router does
     not take part in tunnelling packets to the new subnet, there is no
     problem of congestion, but such a scheme is inefficient
     (Section-3.1). On the other hand, if the old access router does
     take part in tunnelling packets to the new subnet, the new path may
     get heavily congested.

4. Subnet Change Detection

     Quite often, a TCP sender is not aware of its peer's subnet state
     (whether it's in the old subnet or in a new subnet) even though its
     peer almost always knows about its own subnet information. This
     happens, for example, if MN uses Mobile-IPv6 with reverse
     tunnelling (i.e., the home network transparently tunnels all
     packets to the receiver), Mobile-IPv4, or a cellular network for
     mobility management. It's therefore important to have a subnet
     change detection mechanism at the transport layer that can
     propagate this information between peers. This section describes
     such a subnet change detection scheme.

     Subnet change detection is itself a two-step process. First, a
     mobile terminal needs to know it has moved from one subnet to
     another; second, it needs to propagate this information to its
     peer. Detecting when a mobile terminal has changed its subnet is a
     neighbor discovery [RFC2461] problem and is beyond the scope of
     this document. In this document we assume that hosts can determine
     their own subnet information with assistance from lower layers.

     We now focus on how a mobile can propagate this information to its
     peer. To do so, we propose to use one bit--call it the
     'M-bit'--from the "reserved bits" in the TCP header. This bit acts
     as a flag whose value remains unchanged as long as the mobile
     remains attached to the same subnet. Once the mobile moves to a new
     subnet, it flips (binary NOT) the bit and keeps it flipped as long
     as it remains in the new subnet. The peer host compares the value
     of the M-bit with the previously received value and uses any M-bit
     transition as an indication of the peer's subnet change.
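
     The following is a minimal sketch of reading and writing such a
     flag in a raw TCP header. This document does not say which
     reserved bit carries the M-bit, so the mask 0x0800 (the bit right
     after the 4-bit data offset in bytes 12-13 of the header) is a
     hypothetical choice, as are the function names. The TCP checksum
     is not recomputed here.

```python
import struct

# Hypothetical bit position: the draft does not name the reserved bit.
M_BIT_MASK = 0x0800
FIELD_OFFSET = 12  # byte offset of the 16-bit offset/reserved/flags field


def get_m_bit(tcp_header: bytes) -> int:
    """Extract the M-bit (0 or 1) from a raw TCP header."""
    (field,) = struct.unpack_from("!H", tcp_header, FIELD_OFFSET)
    return 1 if field & M_BIT_MASK else 0


def set_m_bit(tcp_header: bytes, m: int) -> bytes:
    """Return a copy of tcp_header with the M-bit set to m.

    The checksum field is left untouched; a real stack would have to
    recompute it after changing the header.
    """
    (field,) = struct.unpack_from("!H", tcp_header, FIELD_OFFSET)
    if m:
        field |= M_BIT_MASK
    else:
        field &= ~M_BIT_MASK & 0xFFFF
    return (tcp_header[:FIELD_OFFSET]
            + struct.pack("!H", field)
            + tcp_header[FIELD_OFFSET + 2:])


# A minimal 20-byte header: source port, destination port, sequence
# number, ACK number, data offset 5 with the ACK flag, window,
# checksum (zero placeholder), urgent pointer.
header = struct.pack("!HHIIHHHH",
                     1234, 80, 1000, 2000,
                     (5 << 12) | 0x10, 65535, 0, 0)
flipped = set_m_bit(header, 1)
print(get_m_bit(header), get_m_bit(flipped))
```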

     The details of the subnet change detection algorithm are as
     follows:

     1. Each TCP implementation should keep three state
        variables--my_subnet_flag, rem_subnet_flag, and high_out_old--to
        facilitate mobility detection. In addition, a sender MAY also
        keep another state variable--prefix_now--to indicate the current
        subnet-prefix information. The first two flags (my_subnet_flag,
        rem_subnet_flag) hold the mobility state information about the
        local and remote TCP respectively. 'high_out_old' is the highest
        sequence number among the packets in flight when a TCP detects
        that its peer has changed subnets. This state information is
        needed for congestion response.

     2. At connection setup, a client and server that are willing
        to use mobility detection should each set M=1 in the SYN
        packets they send. If either (or both) of the SYN packets has
        M=0, then the TCP sender should not use the mobility detection
        and response scheme. In these cases a Mobile Host should let
        the sender time out after a subnet change.

        Once both entities know that the sender and receiver have
        mobility detection capabilities, the TCP sender and receiver
        should initialize

                    my_subnet_flag = 1; rem_subnet_flag = 1;

     3. For each packet sent, each host should determine
        if it has moved to a new subnet. If either of the end points
        determines that it has moved, it should update the value of
        my_subnet_flag as follows:

                      my_subnet_flag =  ~(my_subnet_flag)

        where '~' is the boolean operation NOT. ***In addition, the
        receiver should also send an ACK with the highest sequence
        number within the maximum delayed ACK period if no such ACK is
        already scheduled.***

     4. Before sending any data or ACK packet, the TCP sender should
        set the value of M-bit in the TCP header as:

                                M=my_subnet_flag

     5. When the peer TCP receives a valid TCP packet, it should
        compare the value of 'M-bit' with the value of
        'rem_subnet_flag.' If the two values match, TCP should proceed
        as usual. If the two flags differ, then the TCP sender SHOULD
        update the variables as follows:

                  rem_subnet_flag=M-bit of the present packet.

                high_out_old = Sequence Number of the Last Byte
                          in the retransmission queue.

     The peer TCP uses 'high_out_old' so that it does not base the
     congestion control decisions on stale ACKs.

     After making these changes, the TCP sender SHOULD follow the
     congestion response algorithm as described in section-5.
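
     Steps 1 through 5 above can be sketched as a small per-connection
     state object. This is an illustrative model only: the class and
     method names are hypothetical, the optional prefix_now variable is
     omitted, and sending M=0 when the scheme is disabled is an
     assumption, since this document does not specify that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LmdrState:
    """Per-connection LMDR variables from step 1 (prefix_now omitted)."""
    enabled: bool = False
    my_subnet_flag: int = 1
    rem_subnet_flag: int = 1
    high_out_old: Optional[int] = None

    def on_handshake(self, local_syn_m: int, remote_syn_m: int) -> None:
        # Step 2: the scheme is used only if both SYNs carried M=1.
        self.enabled = local_syn_m == 1 and remote_syn_m == 1
        self.my_subnet_flag = 1
        self.rem_subnet_flag = 1

    def on_local_subnet_change(self) -> None:
        # Step 3: flip the local flag when this host changes subnets.
        if self.enabled:
            self.my_subnet_flag ^= 1

    def m_bit_to_send(self) -> int:
        # Step 4: each outgoing segment carries the current local flag.
        # Assumption: M=0 is sent when the scheme is disabled.
        return self.my_subnet_flag if self.enabled else 0

    def on_segment_received(self, m_bit: int,
                            last_byte_in_rtx_queue: int) -> bool:
        # Step 5: an M-bit transition signals the peer's subnet change.
        if not self.enabled or m_bit == self.rem_subnet_flag:
            return False
        self.rem_subnet_flag = m_bit
        self.high_out_old = last_byte_in_rtx_queue
        return True


# Usage: the receiver moves once; the sender detects exactly one change.
sender, receiver = LmdrState(), LmdrState()
sender.on_handshake(1, 1)
receiver.on_handshake(1, 1)
receiver.on_local_subnet_change()
changed = sender.on_segment_received(receiver.m_bit_to_send(), 5000)
print(changed, sender.high_out_old)
```

     A return value of True from on_segment_received() marks the point
     at which the congestion response of Section-5 would begin.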

NOTE: In certain network architectures it's possible that a mobile
   (and the associated link technology) has information on the
   congestion of the new path. In these cases, if the congestion on the
   new path is low, one MAY choose not to indicate the mobility
   information (i.e., flip the 'M-bit') to the sender since there is no
   need to reduce the data rate. However, the mobility information MUST
   be indicated if no such information is available.

Implementation Note: Since the M-bit is taken from the reserved bits,
   a firewall may drop the SYN packet itself [RFC3360]. An
   implementation that enables this feature should take this into
   account in order to prevent black holes.
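
   One possible way to avoid such a black hole, sketched below, is to
   stop setting the M-bit after a few unanswered SYNs. This fallback
   policy, the function name, and the default of two attempts are
   assumptions made for illustration; this document does not define
   any particular fallback behaviour.

```python
def syn_m_bit(attempt: int, max_attempts_with_m: int = 2) -> int:
    """M-bit value for the SYN on a given 0-based transmission attempt.

    Hypothetical policy: advertise the M-bit on the first few attempts,
    then fall back to a plain SYN in case a middlebox is dropping SYNs
    that have a reserved bit set.
    """
    return 1 if attempt < max_attempts_with_m else 0


# M-bit carried by the initial SYN and its first three retransmissions.
schedule = [syn_m_bit(n) for n in range(4)]
print(schedule)
```

   A connection established with a plain SYN simply runs without
   mobility detection, as in the M=0 case of step 2.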

5. Congestion Response after Subnet Change

     The goal of congestion response after subnet change is to minimize
     congestion on PATH-2. In principle, congestion response for PATH-2
     raises the same congestion control issues as initiating a new
     connection--the sender should have no more than INIT_WINDOW worth
     of data outstanding on the *new path* and the SS_THRESH should be
     set to a large value. What makes the problem complex is the fact
     that, unlike new connections, connections after a subnet change
     have a non-zero number of packets in flight. ***The congestion
     response after subnet change MUST therefore ignore the stale-ACKs
     and only use the ACKs generated in the new subnet to base its
     congestion control decisions.*** Unfortunately, the cumulative ACK
     property of TCP does not allow an easy way to ignore stale-ACKs.
     In this document we describe the congestion response in the
     presence of the SACK option [RFC2018] only.

     With the SACK option, the congestion response waits for the
     SACK/ACK of new data sent in the new subnet before growing its
     window. The details of the algorithm are as follows:

     1. Set the congestion window as

                           cwnd=cwnd+INIT_WINDOW;

     2. Send INIT_WINDOW worth of data on the new path and
        restart the RTO timer as if this were a new connection
        [RFC2988].

     3. For each subsequent ACK received, follow
        mobile_SACK_cong_resp():

            mobile_SACK_cong_resp(tcp_packet ack_pkt){

                IF ( ( ack_pkt contains an ACK seq >
                                high_out_old) OR
                   ( ack_pkt contains a SACK seq > high_out_old)){

                    cwnd=INIT_WINDOW + 2;
                    SS_THRESH =INFINITE;

                    IF ( ack_pkt contains a SACK seq >
                                high_out_old){

                        Mark packets less than
                        high_out_old without a
                        SACK flag as lost;

                        Update packets in flight
                        assuming all unsacked packets
                        were lost;

                        Do loss recovery as described in
                        [RFC3517];

                    } ELSE {

                        Send new data as appropriate;

                    }

                    Follow [RFC2988] for timer calculation as if
                    this were a new connection;
                }
                ELSE {
                    cwnd = 0; /* Don't send any new data */

                    If ack_pkt contains a SACK block, mark the
                    packet as sacked;

                    DO NOT restart the RTO timer even for
                    pure ACKs;
                }
            }

     Please note that the above algorithm waits for an ACK or SACK block
     that must have traversed the new path. In addition, the timer
     values are initialized as if this were a new connection. The timer
     values are not reset for stale ACKs since they don't provide any
     new congestion information (data flow rate) about the new path.
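
     The per-ACK part of the algorithm (step 3) can be modelled with
     the simplified sketch below. It makes several assumptions that
     are not part of this document: sequence numbers are small segment
     indices without wraparound, ack_seq is the highest segment
     cumulatively acknowledged, INIT_WINDOW and cwnd are counted in
     segments, and a boolean stands in for the [RFC2988] timer. Steps 1
     and 2 and the [RFC3517] recovery itself are not modelled.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set

INFINITE = float("inf")


@dataclass
class SenderState:
    init_window: int    # INIT_WINDOW, counted in segments (assumption)
    high_out_old: int   # last segment sent before the change was detected
    cwnd: int = 0
    ss_thresh: float = 0.0
    sacked: Set[int] = field(default_factory=set)
    lost: Set[int] = field(default_factory=set)
    rto_restarted: bool = False


def mobile_sack_cong_resp(s: SenderState, ack_seq: int,
                          sack_seqs: Iterable[int],
                          outstanding: Iterable[int]) -> str:
    """Handle one ACK after a subnet change; return the branch taken."""
    sack_seqs = list(sack_seqs)
    s.sacked.update(sack_seqs)  # SACK blocks are recorded in every branch
    new_sack = any(seq > s.high_out_old for seq in sack_seqs)

    if ack_seq > s.high_out_old or new_sack:
        # Feedback for new-path data: behave like a new connection.
        s.cwnd = s.init_window + 2
        s.ss_thresh = INFINITE
        s.rto_restarted = True  # stands in for the RFC 2988 restart
        if new_sack:
            # Old-path segments that were never SACKed are presumed lost.
            s.lost = {q for q in outstanding
                      if q <= s.high_out_old and q not in s.sacked}
            return "loss_recovery"
        return "send_new_data"

    # Stale ACK: send nothing new and leave the RTO timer untouched.
    s.cwnd = 0
    s.rto_restarted = False
    return "stale"


# Usage: segments 6..12 are outstanding and 10 was the last old-path one.
state = SenderState(init_window=2, high_out_old=10, cwnd=12)
first = mobile_sack_cong_resp(state, 5, [7], range(6, 13))
second = mobile_sack_cong_resp(state, 5, [11], range(6, 13))
print(first, second, sorted(state.lost))
```

     In the usage example the first ACK is stale and freezes the
     window, while the second carries a SACK for new-path data and
     triggers loss recovery for the unsacked old-path segments.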

6. Anomalies

6.1 Race Conditions

     The congestion response algorithm described above works well as
     long as the TCP sender receives the flipped M-bit before the new
     path is established. But if the flipped M-bit is received much
     later, the TCP sender will already have injected some data on the
     new path. An implementation MUST take proper precautions to send
     the M-bit before the new path is established (for example, by
     sending the flipped M-bit in parallel with the binding update
     procedure).

6.2 Rapid Subnet Hopping

     Consider the case when a mobile node moves from subnet-1 to
     subnet-2 and then to subnet-3 in a short period of time. If all the
     ACKs generated in subnet-2 are lost, it's possible that the sender
     will miss the subnet change indication. We believe that such events
     are rare and we do not attempt to address them.
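
     The reason the indication is missed is that two flips return the
     M-bit to its original value. The toy model below replays the
     sequence of M-bits seen by the sender; the function name and the
     four-ACK scenario are hypothetical and serve only to illustrate
     the effect.

```python
def count_detected_changes(m_bits_sent, delivered):
    """Count M-bit transitions that the sender actually observes.

    m_bits_sent: M-bit carried by each ACK, in sending order.
    delivered:   parallel booleans; False means that ACK was lost.
    """
    rem_subnet_flag = 1  # initial value agreed at connection setup
    detected = 0
    for m_bit, arrived in zip(m_bits_sent, delivered):
        if arrived and m_bit != rem_subnet_flag:
            rem_subnet_flag = m_bit
            detected += 1
    return detected


# One ACK from subnet-1 (M=1), two from subnet-2 (M=0), one from
# subnet-3 (M=1 again after the second flip).
acks = [1, 0, 0, 1]
all_delivered = count_detected_changes(acks, [True, True, True, True])
subnet2_lost = count_detected_changes(acks, [True, False, False, True])
print(all_delivered, subnet2_lost)
```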


7. Architectural Considerations

     Architecturally, the method described above does not add any new
     architectural features to the system. Although LMDR requires a TCP
     receiver to look into some parameters and data structures (local to
     that stack) that are specific to the IP layer, this should not be a
     problem either from an implementation point of view or from a
     theoretical point of view. In most cases, the TCP layer already
     consults the IP layer for MTU information at the very least.

     Recently several proposals have been made regarding link-up and
     link-down, which address different link layer related issues.
     LMDR is different from these proposals and has been designed for
     just one purpose: subnet change notification and response for a TCP
     connection. LMDR does not try to solve any link-up or link-down
     issues which may or may not take place due to subnet change.

8. Security Considerations

     Since the M-bit is valid only for an acceptable ACK [RFC793], it's
     immune to passive attacks as long as the congestion window is not
     of the order of 2^31 bytes. However, the M-bit is not safe against
     active DoS attacks (present TCP is not safe either). We will
     describe a security mechanism (a TCP option) to protect against
     active attacks if there is a requirement from the working group.
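
     The acceptability test referred to above is the [RFC793] condition
     SND.UNA < SEG.ACK =< SND.NXT, evaluated in 32-bit sequence space.
     The sketch below shows one way to gate the M-bit on that test. The
     helper names are hypothetical, and how the check should apply to
     segments that do not advance the ACK is not specified in this
     document.

```python
MOD = 1 << 32    # size of the TCP sequence space
HALF = 1 << 31   # comparisons are only meaningful within half of it


def seq_lt(a: int, b: int) -> bool:
    """True if a precedes b in 32-bit sequence space."""
    return a != b and ((b - a) % MOD) < HALF


def seq_leq(a: int, b: int) -> bool:
    """True if a equals or precedes b in 32-bit sequence space."""
    return a == b or seq_lt(a, b)


def ack_is_acceptable(snd_una: int, seg_ack: int, snd_nxt: int) -> bool:
    """RFC 793 acceptable ACK: SND.UNA < SEG.ACK =< SND.NXT."""
    return seq_lt(snd_una, seg_ack) and seq_leq(seg_ack, snd_nxt)


def m_bit_if_valid(snd_una, seg_ack, snd_nxt, m_bit):
    """Return the M-bit only when it arrives on an acceptable ACK."""
    if ack_is_acceptable(snd_una, seg_ack, snd_nxt):
        return m_bit
    return None


# An ACK inside the window is honoured; one beyond SND.NXT is ignored.
in_window = m_bit_if_valid(1000, 1500, 2000, 0)
beyond_nxt = m_bit_if_valid(1000, 2500, 2000, 0)
print(in_window, beyond_nxt)
```

     Because the comparison is only meaningful within half of the
     sequence space, the test loses its discriminating power once the
     amount of outstanding data approaches 2^31 bytes.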

9. Acknowledgments

     We would like to thank Mark Allman for his comments and suggestions
     on the draft.


10. REFERENCES

     [RFC2581]  M. Allman, V. Paxson, W. Stevens, "TCP Congestion
                Control," RFC 2581, Apr 1999.

     [K03]      R. Koodli, "Fast Handover for Mobile IPv6," Internet
                draft; work in progress,
                draft-ietf-mobileip-fast-mipv6-07.txt, Sept 2003.

     [RFC2461]  T. Narten, E. Nordmark, W. Simpson, "Neighbor Discovery
                for IP Version 6 (IPv6)," RFC 2461, Dec 1998.

     [JPA03]    D. Johnson, C. Perkins, J. Arkko, "Mobility Support in
                IPv6," Internet draft; work in progress,
                draft-ietf-mobileip-ipv6-24.txt, June 2003.

     [RFC3344]  C. Perkins, "IP Mobility Support for IPv4," RFC 3344,
                Aug 2002.

     [RFC3390]  M. Allman, S. Floyd, C. Partridge, "Increasing TCP's
                Initial Window," RFC 3390, Oct 2002.

     [RFC3360]  S. Floyd, "Inappropriate TCP Resets Considered Harmful,"
                RFC 3360, Aug 2002.

     [RFC3517]  E. Blanton, M. Allman, K. Fall, L. Wang, "A
                Conservative Selective Acknowledgment (SACK)-based Loss
                Recovery Algorithm for TCP," RFC 3517, Apr 2003.

     [RFC2018]  M. Mathis, J. Mahdavi, S. Floyd, A. Romanow, "TCP
                Selective Acknowledgment Options," RFC 2018, Oct 1996.

     [RFC2988]  V. Paxson, M. Allman, "Computing TCP's Retransmission
                Timer," RFC 2988, Nov 2000.

     [RFC793]   J. Postel, "Transmission Control Protocol," RFC 793,
                Sept 1981.

11. IPR Statement

     The IETF has been notified of intellectual property rights claimed
     in regard to some or all of the specification contained in this
     document. For more information consult the on-line list of claimed
     rights at http://www.ietf.org/ipr.

Authors' Addresses:

   Yogesh Prem Swami                   Khiem Le
   Nokia Research Center, Dallas       Nokia Research Center, Dallas
   6000 Connection Drive               6000 Connection Drive
   Irving, TX-75063, USA.              Irving, TX-75063, USA.

   E-Mail: yogesh.swami@nokia.com      E-Mail: khiem.le@nokia.com
   Ph    : +1 972 374 0669             Ph    : +1 972 894 4882

