<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
    
]>

<!-- may be omitted for very short documents -->
<?rfc toc="yes"?>
<?rfc sortrefs="no"?>
<?rfc symrefs="yes"?>
<?rfc strict="yes"?>
<?rfc rfcedstyle="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>

<!-- these two save paper: start new sections from the same page etc. -->
<?rfc compact="yes"?> <?rfc subcompact="no"?>

<!-- other categories: bcp, exp, historic, std -->
<rfc ipr="trust200902" category="exp" docName="draft-han-tsvwg-cc-00">

  <front>
    <title abbrev="New Congestion Control">A New Congestion Control in Bandwidth Guaranteed Network</title>

    <author initials="L." surname="Han" fullname="Lin Han">
      <organization>Huawei</organization>
      <address>
        <postal>
          <street>2330 Central Expressway</street>
          <city>Santa Clara</city>
          <code>CA 95050</code>
          <country>USA</country>
        </postal>
        <email>lin.han@huawei.com</email>
      </address>
    </author>
    <author initials="Y." surname="Qu" fullname="Yingzhen Qu">
      <organization>Huawei</organization>
      <address>
        <postal>
          <street>2330 Central Expressway</street>
          <city>Santa Clara</city>
          <code>CA 95050</code>
          <country>USA</country>
        </postal>
        <email>yingzhen.qu@huawei.com</email>
      </address>
    </author>
    <author initials="T." surname="Nadeau" fullname="Thomas Nadeau">
      <organization>Lucid Vision</organization>
      <address>
        <postal>
          <street></street>
          <city>Hampton</city>
          <code>NH 03842</code>
          <country>USA</country>
        </postal>
        <email>tnadeau@lucidvision.com</email>
      </address>
    </author>
    <date/>
    
<area>Transport Area</area>
<workgroup>TSVWG Working Group</workgroup>

<keyword>congestion control</keyword>
<keyword>TCP</keyword>


    <abstract>
      <t>In bandwidth guaranteed networks, network resources are reserved
   before a TCP session starts transmitting data.  This draft proposes a
   new TCP congestion control algorithm for use in bandwidth guaranteed
   networks.  It is an extension to the current TCP standards.
      </t> 
    </abstract>
  </front>
  <middle>

    <section anchor="sec.introduction" title="Introduction">

      <t>The original IP protocol suite was designed to support best-effort data transmission. As the Internet grew, congestion became a real problem. To avoid it, TCP uses congestion control algorithms that keep hosts from injecting too much traffic into the network. Over the past 40 years, various algorithms and optimizations have been proposed to solve this problem, including TCP Reno <xref target="RFC5681"/>, TCP NewReno <xref target="RFC6582"/> <xref target="RFC6675"/>, CUBIC <xref target="RFC8312"/>, and BBR <xref target="I-D.cardwell-iccrg-bbr-congestion-control"/>.
      </t>
      <t>In bandwidth guaranteed networks, network resources are reserved before data is transmitted. This draft proposes a new congestion control algorithm that should be used in bandwidth guaranteed networks to improve TCP throughput. The key differences between this new algorithm and classic TCP congestion control <xref target="RFC5681"/> are:
         <list style="empty">
       <t>There is no slow start. After a TCP session is successfully established, the congestion window (cwnd) jumps directly to the window size that corresponds to the Committed Information Rate (CIR), and the host is allowed to transmit data. This is based on the assumption that network resources have been reserved in bandwidth guaranteed networks.</t>
       <t>During congestion avoidance, cwnd stays between the window sizes that correspond to CIR and to the Peak Information Rate (PIR). If no packet is lost due to congestion, cwnd grows until it is capped at the window size that corresponds to PIR.</t>
       <t>OAM is used together with duplicate ACKs to detect whether a packet loss is due to congestion or to random failure.</t>
      </list></t>
      <t>This draft is organized as follows. Section 2 defines the terminology used in this draft. Section 3 provides background information on bandwidth guaranteed networks. Section 4 explains the details of the new congestion control algorithm.
      </t>

    </section>

    <section anchor="sec.term-not" title="Terminology and Notation">

      
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
in this document are to be interpreted as described in <xref target="RFC2119"/>.</t>


      <t>Some of the following terms are defined in <xref target="RFC5681"/> and are copied here for readability.
      <list style="empty">
        <t>FULL-SIZED SEGMENT: A segment that contains the maximum number of data bytes permitted (i.e., a segment containing SMSS bytes of data).</t>
        <t>RECEIVER WINDOW (rwnd): The most recently advertised receiver window.</t>
        <t>CONGESTION WINDOW (cwnd): A TCP state variable that limits the amount of data a TCP can send.  At any given time, a TCP MUST NOT send data with a sequence number higher than the sum of the highest acknowledged sequence number and the minimum of cwnd and rwnd.</t>
        <t>SENDER MAXIMUM SEGMENT SIZE (SMSS): The SMSS is the size of the
      largest segment that the sender can transmit.  This value can be
      based on the maximum transmission unit of the network, the path
      MTU discovery [RFC1191, RFC4821] algorithm, RMSS (see next item),
      or other factors.  The size does not include the TCP/IP headers
      and options.</t>
         <t>RECEIVER MAXIMUM SEGMENT SIZE (RMSS): The RMSS is the size of the
      largest segment the receiver is willing to accept.  This is the
      value specified in the MSS option sent by the receiver during
      connection startup.  Or, if the MSS option is not used, it is 536
      bytes [RFC1122].  The size does not include the TCP/IP headers and
      options.</t>
      <t>INITIAL WINDOW (IW): The initial window is the size of the sender's
      congestion window after the three-way handshake is completed.</t>
      <t>RESTART WINDOW (RW): The restart window is the size of the congestion
      window after a TCP restarts transmission after an idle period.</t>
      <t>ssthresh: Slow Start Threshold. </t>
      <t>OAM: Operations, Administration, and Maintenance. </t>
      <t>RTT: Round-Trip Time. </t>
      <t>CIR: Committed Information Rate. </t>
      <t>PIR: Peak Information Rate. </t>
   
      </list></t>


    </section>

    <section anchor="sec.bandwidth" title="Bandwidth Guaranteed Network">
      <t>New applications such as AR/VR require the network to provide bandwidth guaranteed services. Various solutions exist, including out-of-band signaling protocols such as RSVP <xref target="RFC2205"/> and NSIS <xref target="RFC4080"/>, and in-band signaling as proposed in <xref target="I-D.han-6man-in-band-signaling-for-transport-qos"/>. The common objective of all these solutions is to have network resources (bandwidth) reserved before data is transmitted. The details of how resources are reserved are out of the scope of this draft. However, it is assumed that in bandwidth guaranteed networks, network resources (bandwidth, queues, etc.) are dedicated to the TCP flows, and that data sent at up to the CIR is guaranteed. When the data rate is between CIR and PIR, shared resources are used, and the traffic above CIR is not guaranteed. No traffic above PIR is allowed to enter the network.
      </t>
      <t>The proposed congestion control also requires that OAM (Operations, Administration, and Maintenance) constantly report network condition parameters. Before a TCP session starts, OAM needs to detect important network parameters, such as the number of hops and the Round-Trip Time (RTT). This might be done by setting up a measuring TCP connection, which carries no user data and is used only to measure the key network parameters.
      </t>
      <t>Because the network status changes constantly, these parameters need to be updated after a TCP session is established. This requires the sender to periodically or continuously embed OAM information <xref target="I-D.han-6man-in-band-signaling-for-transport-qos"/> <xref target="I-D.ietf-ippm-ioam-data"/> in TCP data packets to detect the current buffer depth, RTT, etc.
      </t>
      <t>In particular, OAM needs to detect whether any device's buffer depth has exceeded the pre-configured threshold, because this indicates potential congestion and packet drops. When this happens, OAM should send a possible-congestion alarm to the TCP sender. If the retransmission timer then expires on this TCP sender and a possible-congestion alarm has been received, a packet has been dropped due to congestion. Otherwise, the packet drop might be due to a physical failure. The OAM details are out of the scope of this draft. Please refer to the related drafts.
      </t>
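      <t>As a rough illustration of the alarm logic described above, the following Python sketch checks per-device buffer depths against a threshold and classifies a loss observed when the retransmission timer expires. The function names, the list representation of buffer depths, and the returned labels are assumptions made for this example only; the actual OAM mechanism is defined in the referenced drafts, not here.
      </t>
      <figure>
        <artwork><![CDATA[
```python
def congestion_alarm_needed(buffer_depths, threshold):
    """Return True if any device's buffer depth exceeds the threshold.

    buffer_depths: per-device buffer depths reported by OAM
                   (hypothetical representation, same unit as
                   threshold).
    threshold:     the pre-configured buffer depth threshold.
    """
    return any(depth > threshold for depth in buffer_depths)


def classify_timeout_loss(alarm_received):
    """Classify a loss seen when the retransmission timer expires."""
    if alarm_received:
        return "congestion"
    return "possible physical failure"


alarm = congestion_alarm_needed([12, 70, 30], threshold=64)
print(alarm, classify_timeout_loss(alarm))
```
]]></artwork>
      </figure>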
      <t>In summary, in bandwidth guaranteed networks, resources are reserved before data is transmitted, and OAM is used to obtain network statistics. The new congestion control algorithm proposed in this draft is intended for use in such networks.
      </t>
    </section>

    <section anchor="sec.cc" title="New Congestion Control">

      <t><xref target="RFC5681"/> defines four TCP congestion control algorithms: slow start, congestion avoidance, fast retransmit, and fast recovery. The congestion control proposed in this draft is an extension to RFC 5681; it differs only in the congestion control algorithm on the sender side.
      </t>
      <section anchor="sec.rwnd" title="Receiver Advertised Window Size">
        <t>The receiver's advertised window (rwnd) is a receiver-side limit on the amount of outstanding data, so a sender should not send more data than this window size. It is calculated as follows:</t>
        <figure>
          <artwork>
   rwnd = AdvertisedWND = MaxRcvBuffer - (LastByteRcvd - LastByteRead)
          </artwork>
        </figure>
        
      </section>
       <section anchor="sec.cwnd" title="MinBandwidthWND and MaxBandwidthWND">
        <t>As in <xref target="RFC5681"/>, the congestion window (cwnd) is the sender-side limit on the amount of data that the sender can transmit before receiving an acknowledgment (ACK). Taking both the sender and the receiver into account, the effective sending window is always the minimum of cwnd and rwnd:
        </t>
        <figure>
          <artwork>
   EffectiveWND = min(cwnd, rwnd)
          </artwork>
        </figure>
        <t>A TCP sender MUST NOT send more data than the minimum of cwnd and rwnd.</t>
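        <t>The two window formulas above can be sketched in Python as follows. The function and parameter names are chosen for this example, and the sketch assumes that cwnd and rwnd are expressed in the same unit (bytes here) before they are compared; the numeric values are arbitrary.
        </t>
        <figure>
          <artwork><![CDATA[
```python
def advertised_window(max_rcv_buffer, last_byte_rcvd, last_byte_read):
    """rwnd = MaxRcvBuffer - (LastByteRcvd - LastByteRead)."""
    return max_rcv_buffer - (last_byte_rcvd - last_byte_read)


def effective_window(cwnd, rwnd):
    """EffectiveWND = min(cwnd, rwnd); both in the same unit."""
    return min(cwnd, rwnd)


# Example values in bytes (arbitrary, for illustration only).
rwnd = advertised_window(
    max_rcv_buffer=65536, last_byte_rcvd=30000, last_byte_read=10000
)
print(rwnd)                                      # prints 45536
print(effective_window(cwnd=40000, rwnd=rwnd))   # prints 40000
```
]]></artwork>
        </figure>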
        <t>Slow start is commonly used in TCP at the beginning of a transfer or after a loss repair, when network conditions are unknown. This slow probing is necessary to determine the available network capacity and to avoid causing congestion by sending an inappropriately large burst of data into the network. A detailed discussion of the initial window setting is provided in <xref target="RFC3390"/>.
        </t>

        <t>RTT is the time taken to send a packet to the destination plus the time to receive the response packet (ACK). Since the network status is constantly changing, RTT also varies. <xref target="RFC6298"/> specifies how RTT should be sampled and updated. In this new algorithm, RTT is updated using the following formula, where "old RTT" is the current estimate and "new RTT" is the latest sample:
        </t>
        <figure>
          <artwork>
   RTT = a * old RTT + (1 - a) * new RTT    (0 &lt; a &lt; 1)    (1)
          </artwork>
        </figure>
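        <t>Equation (1) is an exponentially weighted moving average. A minimal Python sketch is shown below. The function name and the default weight a = 0.875 are assumptions for illustration (0.875 matches the weight that <xref target="RFC6298"/> places on the previous smoothed RTT); this draft does not fix a value for a.
        </t>
        <figure>
          <artwork><![CDATA[
```python
def update_rtt(old_rtt, new_rtt, a=0.875):
    """Return a * old_rtt + (1 - a) * new_rtt, per equation (1).

    old_rtt: current RTT estimate, in seconds.
    new_rtt: latest RTT sample, in seconds.
    a:       smoothing weight; must satisfy 0 < a < 1.
    """
    if not 0.0 < a < 1.0:
        raise ValueError("a must satisfy 0 < a < 1")
    return a * old_rtt + (1.0 - a) * new_rtt


# A 100 ms estimate moves slowly toward a 200 ms sample.
print(update_rtt(0.100, 0.200))
```
]]></artwork>
        </figure>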
        <t>The initial RTT can be obtained using a measuring TCP connection, or configured based on historical data.
        </t>
        <t>In a bandwidth guaranteed network, resources are already allocated and the network status is known through OAM <xref target="I-D.han-6man-in-band-signaling-for-transport-qos"/>, so it is safe to remove slow start and allow a host to start sending traffic at the CIR once the TCP session is established.
        </t>


        <t>Two window sizes are important: MinBandwidthWND and MaxBandwidthWND. They are calculated as follows:
        </t>

        <figure>
          <artwork>
   MinBandwidthWND = CIR * RTT/MSS    (2)
   MaxBandwidthWND = PIR * RTT/MSS    (3)
          </artwork>
        </figure>
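        <t>As a worked example of equations (2) and (3), the Python sketch below computes both windows. It assumes that CIR and PIR are given in bytes per second, RTT in seconds, and MSS in bytes, so that the result is a number of segments; it also rounds down and enforces a minimum of one segment. The units, the rounding, and the example values are assumptions of this sketch and are not specified by this draft.
        </t>
        <figure>
          <artwork><![CDATA[
```python
import math


def bandwidth_window(rate_bytes_per_s, rtt_s, mss_bytes):
    """Window in segments for a rate: rate * RTT / MSS.

    Rounded down to whole segments, with a floor of one segment
    (both choices are assumptions of this sketch).
    """
    return max(1, math.floor(rate_bytes_per_s * rtt_s / mss_bytes))


cir = 12_500_000   # 100 Mbit/s expressed in bytes per second
pir = 25_000_000   # 200 Mbit/s expressed in bytes per second
rtt = 0.020        # 20 ms
mss = 1460         # bytes

min_bandwidth_wnd = bandwidth_window(cir, rtt, mss)   # equation (2)
max_bandwidth_wnd = bandwidth_window(pir, rtt, mss)   # equation (3)
print(min_bandwidth_wnd, max_bandwidth_wnd)           # prints 171 342
```
]]></artwork>
        </figure>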
        <t>In bandwidth guaranteed networks, after a TCP session is established, the sender can start transmitting data at an initial window size, which is equal to MinBandwidthWND:
        </t>
        <figure>
          <artwork>
   cwnd = MinBandwidthWND
   IW = min (cwnd, rwnd) 
          </artwork>
        </figure> 
        <t>If the receiver window (rwnd) is not a limiting factor, the sender starts sending data at the CIR. This is a key difference from classic TCP slow start, which starts by sending only a few segments <xref target="RFC5681"/>.
        </t>
      </section>
      <section anchor="sec.congestion" title="Congestion Avoidance">
        <t>In TCP Reno, a TCP sender enters congestion avoidance after slow start. In bandwidth guaranteed networks there is no slow start, so a TCP sender enters congestion avoidance right after the initial start.
        </t>
        <t>During congestion avoidance, cwnd is increased by one segment approximately once per round-trip time, as valid ACKs are received, until it reaches MaxBandwidthWND.
        </t>
        <figure>
          <artwork>
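  /* Run approximately once per RTT when a valid ACK is received. */
  if (cwnd &lt; MaxBandwidthWND) {
    cwnd += 1;                 /* grow by one segment */
  } else {
    cwnd = MaxBandwidthWND;    /* stay flat at the PIR window */
  }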
          </artwork>
        </figure>  
        <t>Once cwnd reaches MaxBandwidthWND, it stays constant at MaxBandwidthWND until packet loss is detected. This is another major difference from <xref target="RFC5681"/>, where cwnd keeps increasing during congestion avoidance until the TCP sender detects segment loss.
        </t>
        <t>As a result, a TCP sender is never allowed to send data at a rate higher than PIR, which differs from TCP Reno.</t>
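        <t>The growth and the flat top described in this section can be sketched in Python as follows. The function name, the once-per-RTT calling convention, and the example window values (171 and 342 segments) are assumptions made for illustration.
        </t>
        <figure>
          <artwork><![CDATA[
```python
def on_rtt_elapsed(cwnd, max_bandwidth_wnd):
    """Grow cwnd by one segment per RTT, capped at MaxBandwidthWND."""
    return min(cwnd + 1, max_bandwidth_wnd)


cwnd = 171                 # start at MinBandwidthWND (example value)
max_bandwidth_wnd = 342    # example value
history = []
for _ in range(200):       # simulate 200 loss-free round trips
    cwnd = on_rtt_elapsed(cwnd, max_bandwidth_wnd)
    history.append(cwnd)

# cwnd climbs by one per RTT, then stays flat at MaxBandwidthWND.
print(history[0], history[170], history[-1])   # prints 172 342 342
```
]]></artwork>
        </figure>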
      </section>
      <section anchor="sec.fast" title="Fast Retransmit and Fast Recovery">
        <t>As defined in <xref target="RFC5681"/>, a TCP receiver SHOULD send an immediate duplicate ACK when an out-of-order segment arrives. The TCP sender detects and repairs loss based on incoming duplicate ACKs. When three duplicate ACKs are received, the sender takes this as an indication that a segment has been lost and retransmits the lost segment.
        </t>

        <t>In TCP Reno <xref target="RFC5681"/>, after the fast retransmit of what appears to be the lost segment, fast recovery governs the transmission of new segments at a reduced rate, based on ssthresh, until a non-duplicate ACK arrives.
        </t>

        <t>In the new congestion control algorithm, fast retransmit and fast recovery follow these rules when duplicate ACKs are received:
        </t>
        <t><list style="symbols">
          <t>When a sender receives the first and second duplicate ACKs, cwnd is not changed and the sender continues to send traffic, as in <xref target="RFC5681"/>.</t>
          <t>When a sender receives the third duplicate ACK, the retransmission timer has not expired, and an OAM congestion alarm has previously been received, it is likely that a segment was lost due to congestion. The sender retransmits the lost segment and sets cwnd to MinBandwidthWND.</t>
          <t>When a sender receives the third duplicate ACK but no OAM congestion alarm has previously been received, the segment is considered lost due to random failure rather than congestion. In this case, cwnd is not changed.</t>
        </list></t>
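        <t>The rules above can be sketched as a small Python handler. The class, field, and function names are hypothetical, and the returned labels exist only for this example. The sketch assumes the handler is called only while the retransmission timer has not expired, leaves the retransmission itself to the caller, and ignores duplicate ACKs beyond the third because the rules above do not cover them.
        </t>
        <figure>
          <artwork><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class SenderState:
    cwnd: int                  # congestion window, in segments
    min_bandwidth_wnd: int     # MinBandwidthWND from equation (2)
    dup_acks: int = 0          # duplicate ACKs seen so far
    oam_alarm: bool = False    # hypothetical flag: OAM alarm received


def on_duplicate_ack(state):
    """Apply the duplicate-ACK rules to state.

    Returns None unless this is the third duplicate ACK, in which
    case it returns "congestion" or "random".
    """
    state.dup_acks += 1
    if state.dup_acks != 3:
        return None            # cwnd unchanged, keep sending
    if state.oam_alarm:
        # Loss attributed to congestion: fall back to the CIR window.
        state.cwnd = state.min_bandwidth_wnd
        return "congestion"
    return "random"            # random failure: cwnd unchanged
```
]]></artwork>
        </figure>
        <t>For example, a sender with cwnd = 300 and MinBandwidthWND = 171 that has received an OAM alarm would drop to cwnd = 171 on the third duplicate ACK, while the same sender without an alarm would keep cwnd = 300.
        </t>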
        <t>In <xref target="RFC5681"/>, when congestion is detected, the new cwnd is derived from ssthresh, which is typically about half of the amount of outstanding data. In this new congestion control, when a segment loss due to congestion is detected as described above, the new cwnd is set to MinBandwidthWND as in equation (2).
        </t>
      </section>
      <section anchor="sec.timeout" title="Timeout">
        <t>If a retransmission timer <xref target="RFC6298"/> in a TCP sender expires in a bandwidth guaranteed network, this most likely indicates a physical failure, regardless of whether duplicate ACKs have been received.</t>
        <t>In this case, cwnd is set to one segment, and the TCP sender retransmits the lost segment. This packet also serves to probe the network status. If there really is a network failure, no ACK will be received and the retransmission timer will expire again. If the expected ACK is received after the retransmission, the network has recovered, and cwnd is set to MinBandwidthWND as in equation (2).</t>
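        <t>A minimal Python sketch of this timeout behavior is shown below. The class and method names are hypothetical, and sending the probe and managing the timer are left to the caller.
        </t>
        <figure>
          <artwork><![CDATA[
```python
class TimeoutRecovery:
    """Tracks cwnd across a retransmission timeout (illustrative)."""

    def __init__(self, cwnd, min_bandwidth_wnd):
        self.cwnd = cwnd
        self.min_bandwidth_wnd = min_bandwidth_wnd
        self.probing = False   # True while waiting for the probe ACK

    def on_timer_expired(self):
        # Every expiry, including repeated ones, leaves cwnd at one
        # segment; the caller retransmits the lost segment as a probe.
        self.cwnd = 1
        self.probing = True

    def on_expected_ack(self):
        # The probe was acknowledged, so the network has recovered.
        if self.probing:
            self.cwnd = self.min_bandwidth_wnd
            self.probing = False


sender = TimeoutRecovery(cwnd=342, min_bandwidth_wnd=171)
sender.on_timer_expired()      # cwnd is now 1
sender.on_timer_expired()      # still failing: cwnd stays 1
sender.on_expected_ack()       # recovered: cwnd is now 171
print(sender.cwnd)             # prints 171
```
]]></artwork>
        </figure>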
      </section>
      <section anchor="sec.idle" title="Idle Recovery">
        <t><xref target="RFC5681"/> specifies that a TCP session should use slow start to restart transmission after an idle period longer than one retransmission timeout, and that the restart window (RW) is the minimum of IW and cwnd.
        </t>
        <t>This proposal follows the same rule. However, because no slow start is needed in bandwidth guaranteed networks and IW in this new congestion control is set to MinBandwidthWND, a TCP sender can start transmitting data at the CIR after a long idle period.
        </t>
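        <t>The restart rule can be sketched in Python as follows. The function and parameter names are assumptions for illustration; the idle time and the retransmission timeout (RTO) are taken to be in seconds and the windows in segments.
        </t>
        <figure>
          <artwork><![CDATA[
```python
def cwnd_after_idle(cwnd, idle_time, rto, min_bandwidth_wnd):
    """Return the window to use when sending resumes after idle_time."""
    if idle_time <= rto:
        return cwnd                      # not a long idle period
    initial_window = min_bandwidth_wnd   # IW in this proposal
    return min(initial_window, cwnd)     # RW = min(IW, cwnd)


# After a 5 s idle with a 1 s RTO, a sender at the PIR window
# (342 segments) restarts at the CIR window (171 segments).
print(cwnd_after_idle(342, idle_time=5.0, rto=1.0,
                      min_bandwidth_wnd=171))   # prints 171
```
]]></artwork>
        </figure>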
      </section>

    </section>
   

    <section anchor="sec.iana" title="IANA Considerations">

      <t>This document has no IANA actions.</t>
     
    </section>

    <section anchor="sec-cons" title="Security Considerations">

   <t>This proposal makes no change to the underlying security of TCP. More information about TCP security concerns can be found in <xref target="RFC5681"/>.</t>

    </section>

  </middle>

  <back>

    <references title="Normative References">
   <?rfc include="reference.RFC.2119"?>
    </references>

    <references title="Informative References">
      <?rfc include="reference.RFC.2205"?>
      <?rfc include="reference.RFC.3390"?>
      <?rfc include="reference.RFC.4080"?>
      <?rfc include="reference.RFC.4960"?>
      <?rfc include="reference.RFC.5681"?>
      <?rfc include="reference.RFC.6298"?>
      <?rfc include="reference.RFC.6582"?>
      <?rfc include="reference.RFC.6675"?>
      <?rfc include="reference.RFC.8312"?>
      <?rfc include="reference.I-D.cardwell-iccrg-bbr-congestion-control"?>
      <?rfc include="reference.I-D.han-6man-in-band-signaling-for-transport-qos"?>
      <?rfc include="reference.I-D.ietf-ippm-ioam-data"?>
    </references>


    <section anchor="acknowledgments" title="Acknowledgments" numbered="no">
      <t>The authors wish to thank xxxx for
      their helpful comments and suggestions.</t>
    </section>

  </back>

</rfc>
