Updating TCP to support Rate-Limited Traffic

University of Aberdeen, School of Engineering, Fraser Noble Building, Aberdeen, Scotland, AB24 3UE, UK
gorry@erg.abdn.ac.uk, arjuna@erg.abdn.ac.uk, raffaello@erg.abdn.ac.uk
http://www.erg.abdn.ac.uk

Transport
TCPM Working Group
Keywords: CWV, TCP

This document proposes an update to RFC 5681 to address issues that
arise when TCP is used to support traffic that exhibits periods where
the sending rate is limited by the application rather than the
congestion window. It provides an experimental update to TCP that allows
a TCP sender to restart quickly following a rate-limited
interval. This method is expected to benefit applications that send
rate-limited traffic using TCP, while also providing an appropriate
response if congestion is experienced.

It also evaluates the Experimental specification of TCP Congestion
Window Validation, CWV, defined in RFC 2861, and concludes that RFC 2861
sought to address important issues, but failed to deliver a widely used
solution. This document therefore recommends that the status of RFC 2861
be moved from Experimental to Historic, and that it be replaced by the
current specification.

TCP is used to support a range of application behaviours. The TCP
congestion window (cwnd) controls the number of unacknowledged
packets/bytes that a TCP flow may have in the network at any time, a
value known as the FlightSize (RFC 5681). A bulk
application will always have data available to transmit. The rate at
which it sends is therefore limited by the maximum permitted by the
receiver advertised window and the sender congestion window (cwnd). In
contrast, a rate-limited application will experience periods when the
sender is either idle or is unable to send at the maximum rate permitted
by the cwnd. The update in this document targets the operation of TCP in
such rate-limited cases.

Standard TCP (RFC 5681) states that a TCP sender
SHOULD set cwnd to no more than the Restart Window (RW) before beginning
transmission, if the TCP sender has not sent data in an interval
exceeding the retransmission timeout, i.e., when an application becomes
idle. RFC 2861 noted that this TCP behaviour was
not always observed in current implementations. Experiments confirm this to still be the case.

CWV introduced the terminology of "application limited periods". This
document describes any time that an application limits the sending rate,
rather than being limited by the transport, as "rate-limited". This
update improves support for applications that vary their transmission
rate, either with (short) idle periods between transmission or by
changing the rate the application sends. These applications are
characterised by the TCP FlightSize often being less than cwnd. Many
Internet applications exhibit this behaviour, including web browsing,
HTTP-based adaptive streaming, applications that support query/response
type protocols, network file sharing, and live video transmission. Many
such applications currently avoid using long-lived (persistent) TCP
connections (e.g. servers typically
support persistent HTTP connections, but short server timeouts often prevent their use).
Such applications often instead either use a succession of short TCP
transfers or use UDP.

Standard TCP does not impose additional restrictions on the growth of
the congestion window when a TCP sender is unable to send at the maximum
rate allowed by the cwnd. In this case the rate-limited sender may grow
a cwnd far beyond that corresponding to the current transmit rate,
resulting in a value that does not reflect current information about the
state of the network path the flow is using. Use of such an invalid cwnd
may result in reduced application performance and/or could significantly
contribute to network congestion.

RFC 2861 proposed a solution to these issues in
an experimental method known as Congestion Window Validation (CWV). CWV
was intended to help reduce cases where TCP accumulated an invalid cwnd.
The use and drawbacks of using the CWV algorithm in RFC 2861 with an
application are discussed later in this document, which then defines the relevant terminology and specifies an alternative to CWV that
seeks to address the same issues, but does this in a way that is
expected to mitigate the impact on an application that varies its
sending rate. The updated method applies to rate-limited conditions
(including both an application-limited and an idle sender).

The goals of this update are:

• To not change the behaviour of a TCP sender that performs bulk
transfers that consume the cwnd.

• To provide a method that co-exists with Standard TCP and other
flows that use this updated method.

• To reduce transfer latency for applications that change their
rate over short intervals of time.

• To avoid a TCP sender growing a large "non-validated" cwnd, when
it has not recently sent using this cwnd.

• To remove the incentive for ad-hoc application or network stack
methods (such as "padding") solely to maintain a large cwnd for
future transmission.

• To incentivise the use of long-lived connections, rather than a
succession of short-lived flows, benefiting both flows and network
when actual congestion is encountered.

A later section describes the rationale for selecting the
safe period to preserve the cwnd.

This document was produced by the TCP Maintenance and Minor
Extensions (tcpm) working group.

The document updates and obsoletes the methods described in RFC 2861. It recommends a set of mechanisms, including
the use of pacing during a non-validated period. The updated
mechanisms are intended to have a less aggressive congestion impact
than would be exhibited by a standard TCP sender.

The specification in this draft is classified as "Experimental"
pending experience with deployed implementations of the methods.

RFC 2861 described a simple modification to the
TCP congestion control algorithm that decayed the cwnd after the
transition to a “sufficiently-long” idle period. This used
the slow-start threshold (ssthresh) to save information about the
previous value of the congestion window. The approach relaxed the
standard TCP behaviour for an idle
session, intended to improve application performance. CWV also modified
the behaviour where a sender transmitted at a rate less than allowed by
cwnd.

RFC 2861 proposed two sets of responses, one
after an "application-limited" and one after an "idle period". Although
this distinction was argued, in practice differentiating the two
conditions was found problematic in actual networks. While this offers predictable performance for long
on-off periods (>>1 RTT), or slowly varying rate-based traffic,
the performance could be unpredictable for variable-rate traffic and
depended both upon whether an accurate RTT had been obtained and the
pattern of application traffic relative to the measured RTT.

Many applications can and often do vary their transmission over a
wide range of rates. Using RFC 2861, such applications
often experienced varying performance, which made it hard for
application developers to predict the TCP latency even when using a path
with stable network characteristics. We argue that an attempt to
classify application behaviour as application-limited or idle is
problematic and also inappropriate. This document therefore explicitly
avoids trying to differentiate these two cases, instead treating all
rate-limited traffic uniformly.

RFC 2861 has been implemented in some
mainstream operating systems as the default behaviour. Analysis has shown that a TCP sender using CWV is
able to use available capacity on a shared path after an idle period.
This can benefit variable-rate applications, especially over long delay
paths, when compared to the slow-start restart specified by standard
TCP. However, CWV would only benefit an application if the idle period
were less than several Retransmission Time Out (RTO) intervals, since the behaviour would otherwise be the
same as for standard TCP, which resets the cwnd to the TCP Restart
Window after this period.

To enable better performance for variable-rate applications with TCP,
some operating systems have chosen to support non-standard methods, or
applications have resorted to "padding" streams to maintain their
sending rate when they have no data to transmit. Although transmitting
redundant data across a network path provides good evidence that the
path can sustain data at the offered rate, padding also consumes network
capacity and reduces the opportunity for congestion-free statistical
multiplexing. For variable-rate flows, the benefits of statistical
multiplexing can be significant and it is therefore a goal to find a
viable alternative to padding streams.

Experience with RFC 2861 suggests that although
the CWV method benefited the network in a rate-limited scenario
(reducing the probability of network congestion), the behaviour was too
conservative for many common rate-limited applications. This mechanism
did not therefore offer the desirable increase in application
performance for rate-limited applications and it is unclear whether
applications actually use this mechanism in the general Internet.

It is therefore concluded that CWV, as defined in RFC 2861, was often a poor solution for many
rate-limited applications. It had the correct motivation, but had the
wrong approach to solving this problem.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

The document assumes familiarity with the terminology of TCP
congestion control (RFC 5681).

The following terminology is used in this document:

cwnd-limited: A TCP flow that has sent the maximum number of segments
permitted by the cwnd, where the application utilises the allowed
sending rate.

pipeACK sample: A measure of the volume of data acknowledged by the
network within an RTT.

pipeACK variable: A variable that measures the available capacity
using the set of pipeACK samples.

pipeACK Sampling Period: The maximum period that a measured pipeACK
sample may influence the pipeACK variable.

Non-validated phase: The phase where the cwnd reflects a previous
measurement of the available path capacity.

Non-validated period, NVP: The maximum period for which cwnd is
preserved in the non-validated phase.

Rate-limited: A TCP flow that does not consume more than one half of
cwnd, and hence operates in the non-validated phase. This includes
periods when an application is either idle or chooses to send at a rate
less than the maximum permitted by the cwnd.

Validated phase: The phase where the cwnd reflects a current estimate
of the available path capacity.

This section proposes an update to the TCP congestion control
behaviour during a rate-limited interval. This new method intentionally
does not differentiate between times when the sender has become idle or
chooses to send at a rate less than the maximum allowed by the cwnd.

The period where actual usage is less than allowed by cwnd is named
the non-validated phase. The update allows an application in the
non-validated phase to resume transmission at a previous rate without
incurring the delay of slow-start. However, if the TCP sender
experiences congestion using the preserved cwnd, it is required to
immediately reset the cwnd to an appropriate value specified by the
method. If a sender does not take advantage of the preserved cwnd within
the NVP, the value of cwnd is reduced, ensuring the value better
reflects the capacity that was recently actually used.

It is expected that this update will satisfy the requirements of many
rate-limited applications and at the same time provide an appropriate
method for use in the Internet. Some applications use dummy packets (aka
"padding") to maintain a sending rate when an application has no data
to send. Although this ensures the path continues to support the rate
permitted by the cwnd, it wastes network capacity sending useless data.
New-CWV reduces this incentive for an application to send data simply to
keep transport congestion state.

The method is specified in the following subsections and is expected to
encourage applications and TCP stacks to use standards-based congestion
control methods. It may also encourage the use of long-lived connections
where this offers benefit (such as persistent HTTP).

A sender starts a TCP connection in the validated phase and
initialises the pipeACK variable to the "undefined" value. This value
inhibits use of the value in cwnd calculations.

RFC 5681 defines a variable, FlightSize, that
indicates the instantaneous amount of data that has been sent, but not
cumulatively acknowledged. In this method a new variable "pipeACK" is
introduced to measure the acknowledged size of the network pipe. This
is used to determine if the sender has validated the cwnd. pipeACK
differs from FlightSize in that it is evaluated over a window of
acknowledged data, rather than reflecting the amount of data
outstanding.

A sender determines a pipeACK sample by measuring the volume of
data that was acknowledged by the network over the period of a
measured Round Trip Time (RTT). Using the variables defined for SACK-based loss recovery, a value could be measured by caching the
value of HighACK and after one RTT measuring the difference between
the cached HighACK value and the current HighACK value. Other
equivalent methods may be used.

A sender is not required to continuously update the pipeACK
variable after each received ACK, but SHOULD perform a pipeACK sample
at least once per RTT when it has sent unacknowledged segments.

The pipeACK variable MAY consider multiple pipeACK samples over the
pipeACK Sampling Period. The value of the pipeACK variable MUST NOT
exceed the maximum (highest value) within the sampling period. This
specification defines the pipeACK Sampling Period as Max(3*RTT, 1
second). This period enables a sender to compensate for large
fluctuations in the sending rate, where there may be pauses in
transmission, and allows the pipeACK variable to reflect the largest
recently measured pipeACK sample.

When no measurements are available, the pipeACK variable is set to
the "undefined" value. This value is used to inhibit entering the
non-validated phase until the first new measurement of a pipeACK
sample.The pipeACK variable MUST NOT be updated during TCP Fast Recovery.
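
The sampling rules above can be illustrated with a minimal sketch. It is not a normative implementation: the class and function names (PipeAck, add_sample, is_validated) are hypothetical, time is a plain float in seconds, and a simple list of timestamped samples stands in for whatever storage a real stack would use. It keeps the largest sample seen within the pipeACK Sampling Period of Max(3*RTT, 1 second), skips samples during Fast Recovery, and reports the "undefined" value when no sample is available.

```python
from collections import deque

UNDEFINED = None  # stands in for the "undefined" pipeACK value


class PipeAck:
    """Hypothetical tracker that keeps recent pipeACK samples."""

    def __init__(self):
        self.samples = deque()  # entries are (time_taken, bytes_acked_in_one_rtt)

    def add_sample(self, now, acked_bytes, in_fast_recovery=False):
        # Samples are not collected while the sender is in Fast Recovery.
        if in_fast_recovery:
            return
        self.samples.append((now, acked_bytes))

    def value(self, now, rtt):
        # pipeACK Sampling Period = Max(3*RTT, 1 second); drop older samples.
        period = max(3 * rtt, 1.0)
        while self.samples and now - self.samples[0][0] > period:
            self.samples.popleft()
        if not self.samples:
            return UNDEFINED
        # The pipeACK variable is the largest sample within the period.
        return max(acked for _, acked in self.samples)


def is_validated(pipeack, cwnd):
    # Validated phase: pipeACK is undefined, or pipeACK >= (1/2)*cwnd.
    return pipeack is UNDEFINED or 2 * pipeack >= cwnd
```

For example, with a 10000-byte cwnd and a largest recent sample of 4000 bytes, is_validated returns False, so the sender would be treated as non-validated.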
That is, the sender stops collecting pipeACK samples during loss
recovery. The method RECOMMENDS that the TCP SACK option is enabled and that a SACK-based loss recovery method is used to recover missing segments. This
allows the sender to more accurately determine the number of missing
bytes during the loss recovery phase, and using this method will
result in a more appropriate cwnd following loss.

The updated method creates a new TCP sender phase that captures
whether the cwnd reflects a validated or non-validated value. The
phases are defined as:

• Validated phase: pipeACK >= (1/2)*cwnd, or pipeACK is
undefined. This is the normal phase, where cwnd is expected to be
an approximate indication of the capacity currently available
along the network path, and the standard methods are used to
increase cwnd (currently RFC 5681).

• Non-validated phase: pipeACK < (1/2)*cwnd. This is the phase
where the cwnd has a value based on a previous measurement of the
available capacity, and the usage of this capacity has not been
validated in the pipeACK Sampling Period. That is, when it is not
known whether the cwnd reflects the currently available capacity
along the network path. The mechanisms to be used in this phase
seek to determine a safe value for cwnd and an appropriate
reaction to congestion.

Note: A threshold is needed to determine whether a sender is
in the validated or non-validated phase. We start by noting that a
standard TCP sender in slow-start is permitted to double its
FlightSize from one RTT to the next. This motivated the choice of a
threshold value of 1/2. This threshold ensures a sender does not
further increase the cwnd as long as the FlightSize is less than
(1/2*cwnd). Furthermore, a sender with a FlightSize less than
(1/2*cwnd) may in the next RTT be permitted by the cwnd to send at a
rate that more than doubles the FlightSize, and hence this case needs
to be regarded as non-validated and a sender therefore needs to employ
additional mechanisms while in this phase.

A TCP sender MUST enter the non-validated phase when the pipeACK is
less than (1/2)*cwnd.

A TCP sender that enters the non-validated phase SHOULD preserve
the cwnd (i.e., this neither grows nor reduces while the sender
remains in this phase). If the sender receives an indication of
congestion (loss or Explicit Congestion Notification, ECN, mark), it uses the method described below. The
phase is concluded after a fixed period of time (the NVP, as explained
later in this document) or when the sender transmits
sufficient data so that pipeACK > (1/2)*cwnd (i.e. the sender is no
longer rate-limited).

The behaviour in the non-validated phase is specified as:

• A sender determines whether to increase the cwnd based upon
whether it is cwnd-limited:

  - A sender that is cwnd-limited MAY use the standard TCP
  method to increase cwnd (i.e. a TCP sender that fully utilises
  the cwnd is permitted to increase cwnd each received ACK using
  standard methods).

  - A sender that is not cwnd-limited MUST NOT increase the
  cwnd when ACK packets are received in this phase.

• If the sender receives an indication of congestion while in the
non-validated phase (i.e., detects loss, or an ECN mark), the
sender MUST exit the non-validated phase (reducing the cwnd as
defined below).

• If the Retransmission Time Out (RTO) expires while in the
non-validated phase, the sender MUST exit the non-validated phase.
It then resumes using the standard TCP RTO mechanism.

• A sender with a pipeACK variable greater than (1/2)*cwnd SHOULD
enter the validated phase. (A rate-limited sender will not
normally be impacted by whether it is in a validated or
non-validated phase, since it will normally not consume the entire
cwnd. However a change to the validated phase will release the
sender from constraints on the growth of cwnd, and restore the use
of the standard congestion response.)

The cwnd-limited behaviour may be triggered during a
transient condition that occurs when a sender is in the non-validated
phase and receives an ACK that acknowledges received data, the cwnd
was fully utilised, and more data is awaiting transmission than may be
sent with the current cwnd. The sender is then allowed to use the
standard method to increase the cwnd. (Note, if the sender succeeds in
sending these new segments, the updated cwnd and pipeACK variables
will eventually result in a transition to the validated phase.)

Reception of congestion feedback while in the non-validated phase
is interpreted as an indication that it was inappropriate for the
sender to use the preserved cwnd. The sender is therefore required
to quickly reduce the rate to avoid further congestion. Since the
cwnd does not have a validated value, a new cwnd value must be
selected based on the utilised rate.

A sender that detects a packet drop, or receives an indication of
an ECN-marked packet, MUST record the current FlightSize in the
variable LossFlightSize and MUST calculate a safe cwnd for loss
recovery using the method below:

The pipeACK value is not updated during loss recovery. If there is a valid pipeACK value,
the new cwnd is adjusted to reflect that a non-validated cwnd may be
larger than the actual FlightSize, or recently used FlightSize
(recorded in pipeACK). The updated cwnd therefore prevents overshoot
by a sender significantly increasing its transmission rate during
the recovery period.

At the end of the recovery phase, the TCP sender MUST reset the
cwnd using the method below:

Where R is the volume of data that was retransmitted during the
recovery phase.

If the sender implements a method that allows it to identify the
number of ECN-marked segments within a window that were observed by
the receiver, the sender SHOULD use the method above, further
reducing R by the number of marked segments.

After completing the loss recovery phase, the sender MUST
re-initialise the pipeACK variable to the "undefined" value. This
ensures that standard TCP methods are used immediately after
completing loss recovery until a new pipeACK value can be
determined.

ssthresh is adjusted using the standard TCP method.

Note: The adjustment by reducing cwnd by the volume of data not
sent (R) follows the method proposed for Jump Start. The inclusion of the term R makes the
adjustment more conservative than standard TCP. This is required,
since a sender in the non-validated state may increase the rate more
than a standard TCP would have done relative to what was sent in the
last RTT (i.e., more than doubled the number of segments in flight
relative to what it sent in the last RTT). The additional reduction
after congestion is beneficial when the LossFlightSize has
significantly overshot the available path capacity incurring
significant loss (e.g. following a change of path characteristics or
when additional traffic has taken a larger share of the network
bottleneck during a period when the sender transmits less).

Note: The pipeACK value is only valid during a non-validated
phase, and therefore does not exceed cwnd/2. If LossFlightSize and R
were small, then this can result in the final cwnd after loss
recovery being not more than 1/4 of the cwnd on detection of congestion. This
reduction is conservative compared to standard TCP.
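
The bound in this note can be checked with a small worked example. The sketch below is illustrative only and rests on an assumption: that the safe cwnd on detecting congestion is Max(pipeACK, LossFlightSize)/2, and that the cwnd at the end of recovery is (Max(pipeACK, LossFlightSize) - R)/2. These forms are chosen because they are consistent with the 1/4 bound described in this note; the function names and the byte values are hypothetical.

```python
def cwnd_on_congestion(pipeack, loss_flight_size):
    # Assumed rule: halve the larger of the recently used capacity (pipeACK)
    # and the FlightSize recorded when congestion was detected.
    return max(pipeack, loss_flight_size) // 2


def cwnd_after_recovery(pipeack, loss_flight_size, retransmitted):
    # Assumed rule: as above, but first subtract R, the volume of data
    # retransmitted during the recovery phase.
    return (max(pipeack, loss_flight_size) - retransmitted) // 2


cwnd = 100_000             # preserved, non-validated cwnd (bytes)
pipeack = 40_000           # below cwnd/2, so the sender is non-validated
loss_flight_size = 30_000  # FlightSize recorded when loss was detected
retransmitted = 3_000      # R

during = cwnd_on_congestion(pipeack, loss_flight_size)
after = cwnd_after_recovery(pipeack, loss_flight_size, retransmitted)
print(during, after)  # prints "20000 18500"
```

Because pipeACK is below cwnd/2 in this phase, a LossFlightSize no larger than pipeACK gives a final cwnd of at most cwnd/4 (here 18500 against a limit of 25000).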
pipeACK is reset to undefined after completing loss recovery.
Subsequent updates to cwnd do not therefore reflect pipeACK history
before any congestion event.

TCP congestion control allows a sender to accumulate a cwnd that
would allow it to send a burst of segments with a total size up to
the difference between the FlightSize and cwnd. Such bursts can
impact other flows that share a network bottleneck and/or may induce
congestion when buffering is limited.

Various methods have been proposed to control the sender
burstiness. For example, TCP can limit the number of new
segments it sends per received ACK. This is effective when a flow of
ACKs is received, but cannot be used to control a sender that has
not sent appreciable data in the previous RTT.

This document recommends using a method to avoid line-rate bursts
after an idle or rate-limited interval when there is less reliable
information about the capacity of the network path: A TCP sender in
the non-validated phase SHOULD control the maximum burst size, e.g.
using a rate-based pacing algorithm in which a sender paces out the
cwnd over its estimate of the RTT, or some other method, to prevent
many segments being transmitted contiguously at line-rate. The most
appropriate method(s) to implement pacing depend on the design of
the TCP/IP stack, speed of interface and whether hardware support
(such as TCP Segment Offload, TSO) is used. The present document
does not recommend any specific method.

An application that remains in the non-validated phase for a
period greater than the NVP is required to adjust its congestion
control state. If the sender exits the non-validated phase after
this period, it MUST update the ssthresh:

(This adjustment of ssthresh ensures that the sender records that
it has safely sustained the present rate. The change is beneficial
to rate-limited flows that encounter occasional congestion, and
could otherwise suffer an unwanted additional delay in recovering
the sending rate.)

The sender MUST then update cwnd to be not greater than:

Where IW is the appropriate TCP initial window, used by the TCP
sender.

Note: This adjustment ensures that the sender responds
conservatively after remaining in the non-validated phase for more
than the non-validated period. In this case, it reduces the cwnd by
a factor of two from the preserved value. This adjustment is helpful
when flows accumulate but do not use a large cwnd, and seeks to
mitigate the impact when these flows later resume transmission. This
could for instance mitigate the impact if multiple high-rate
application flows were to become idle over an extended period of
time and then were simultaneously awakened by some external
event.

This section provides informative examples of implementation
methods. Implementations may choose to use other methods that comply
with the normative requirements.

A pipeACK sample may be measured once each RTT. This reduces the
sender processing burden for calculating after each acknowledgement
and also reduces storage requirements at the sender.

Since application behaviour can be bursty using CWV, it may be
desirable to implement a maximum filter to accumulate the measured
values so that the pipeACK variable records the largest pipeACK
sample within the pipeACK Sampling Period. One simple way to
implement this is to divide the pipeACK Sampling Period into several
(e.g. 5) equal length measurement periods. The sender then records
the start time for each measurement period and the highest measured
pipeACK sample. At the end of the measurement period, any
measurement(s) that are older than the pipeACK Sampling Period are
discarded. The pipeACK variable is then assigned the largest of the
set of the highest measured values.

Figure 1: Example of measuring pipeACK samples

Figure 1 shows an example of how measurement samples may be
collected. At the time represented by the figure new samples are
being accumulated into sample D. Three previous samples also fall
within the pipeACK Sampling Period: A, B, and C. There was also a
period of inactivity between samples B and C during which no
measurements were taken. The current value of the pipeACK variable
will be 5, the maximum across all samples.

After one further measurement period, Sample A will be discarded,
since it then is older than the pipeACK Sampling Period and the
pipeACK variable will be recalculated. Its value will be the larger
of Sample C or the final value accumulated in Sample D.

Note that the pipeACK Sampling Period and the NVP do not
necessarily require a new timer to be implemented. An alternative is
to record a timestamp when the sender enters the NVP. Each time a
sender transmits a new segment, this timestamp may be used to
determine if the NVP has expired. If the period expires, the
sender may take into account how many units of the NVP have
passed and make one reduction for each NVP.

A method is required to detect the cwnd-limited condition. This is used to detect a
condition where a sender in the non-validated phase receives an ACK,
but the size of cwnd prevents sending more new data.

In simple terms this condition is true only when the TCP sender's
FlightSize is equal to or larger than the cwnd. However, an
implementation must consider other constraints on the way in which
cwnd variable is used, for instance the need to support methods such
as the Nagle Algorithm and TCP Segment Offload (TSO). This can
result in a sender becoming cwnd-limited when the cwnd is nearly,
rather than completely, equal to the FlightSize.

This section documents the rationale for selecting the maximum period
that cwnd may be preserved, known as the non-validated period, NVP.

Limiting the period that cwnd may be preserved avoids undesirable
side effects that would result if the cwnd were to be kept unnecessarily
high for an arbitrarily long period, which was a part of the problem that
CWV originally attempted to address. The period a sender may safely
preserve the cwnd is a function of the period that a network path is
expected to sustain the capacity reflected by cwnd. There is no ideal
choice for this time.

A period of five minutes was chosen for this NVP. This is a
compromise that was larger than the idle intervals of common
applications, but not sufficiently larger than the period for which the
capacity of an Internet path may commonly be regarded as stable. The
capacity of wired networks is usually relatively stable for periods of
several minutes, and load stability increases with the capacity.
This suggests that cwnd may be preserved for at least a few minutes.

There are cases where the TCP throughput exhibits significant
variability over a time less than five minutes. Examples could include
wireless topologies, where TCP rate variations may fluctuate on the
order of a few seconds as a consequence of medium access protocol
instabilities. Mobility changes may also impact TCP performance over
short time scales. Senders that observe such rapid changes in the path
characteristic may also experience increased congestion with the new
method; however, such variation would likely also impact TCP’s
behaviour when supporting interactive and bulk applications.

Routing algorithms may modify the network path, disrupting the RTT
measurement and changing the capacity available to a TCP connection;
however, such changes do not often occur within a time frame of a few
minutes.

The value of five minutes is therefore expected to be sufficient for
most current applications. Simulation studies also suggest that for many practical
applications, the performance using this value will not be significantly
different to that observed using a non-standard method that does not
reset the cwnd after idle.

Finally, other TCP sender mechanisms have used a 5 minute timer, and
there could be simplifications in some implementations by reusing the
same interval. TCP defines a default user timeout of 5 minutes, i.e., how long transmitted data may remain
unacknowledged before a connection is forcefully closed.

General security considerations concerning TCP congestion control are
discussed in RFC 5681. This document describes an
algorithm that updates one aspect of the congestion control procedures,
and so the considerations described in RFC 5681 also apply to this
algorithm.

There are no IANA considerations.

The authors acknowledge the contributions of Dr I Biswas and Mr Ziaul
Hossain in supporting the evaluation of CWV and for their help in
developing the mechanisms proposed in this draft. We also acknowledge
comments received from the Internet Congestion Control Research Group,
in particular Yuchung Cheng, Mirja Kuehlewind, Joe Touch, and Mark
Allman. This work was part-funded by the European Community under its
Seventh Framework Programme through the Reducing Internet Transport
Latency (RITE) project (ICT-317700).

RFC-Editor note: please remove this section prior to publication.

There are several issues to be discussed more widely:

• There are potential interactions with the Experimental
update in RFC 6928 that raises the TCP
initial Window to ten segments; do these cases need to be
elaborated?

This relates to the Experimental specification for
increasing the TCP IW defined in RFC 6928.

The two methods have different functions and different
response to loss/congestion.

RFC 6928 proposes an experimental update to TCP that would
increase the IW to ten segments. This would allow faster
opening of the cwnd, and also a large (same size) restart
window. This approach is based on the assumption that many
forward paths can sustain bursts of up to ten segments without
(appreciable) loss. Such a significant increase in cwnd must
be matched with an equally large reduction of cwnd if
loss/congestion is detected, and such a congestion indication
is likely to require future use of IW=10 to be disabled for
this path for some time. This guards against the unwanted
behaviour of a series of short flows continuously flooding a
network path without network congestion feedback.

In contrast, this document proposes an update with a
rationale that relies on recent path history to
select an appropriate cwnd after restart.

The behaviour differs in three ways:

1) For applications that send little initially, new-cwv may
constrain the cwnd more than RFC 6928, but would not require the
connection to reset any path information when a restart
incurs loss. Instead, new-cwv would allow the TCP
connection to preserve the cached cwnd: any loss would impact
cwnd, but would not impact other flows.

2) For applications that utilise more capacity than
provided by a cwnd of 10 segments, this method would permit a
larger restart window compared to a restart using the method
in RFC 6928. This is justified by the recent path history.

3) new-cwv is also intended to be used for rate-limited
applications, where the application sends, but does not seek
to fully utilise the cwnd. In this case, new-cwv constrains
the cwnd to that justified by the recent path history. The
performance trade-offs are hence different, and it would be
possible to enable new-cwv when also using the method in RFC
6928, and yield benefits.

• There is potential overlap with the Laminar proposal
(draft-mathis-tcpm-tcp-laminar).

The current draft was intended as a standards-track update
to TCP, rather than a new transport variant. At a minimum, it
would be good to understand how the two interact and whether
there is a possibility of a single method.

• There is a potential performance penalty following loss of a
short burst (discussed off-list with M. Allman).

A sender can transmit several segments and then become idle. If
the first segments are all ACK'ed, the ssthresh collapses to a
small value (no new data is sent by the idle sender). Loss of
the later data results from congestion (e.g. a RED drop or
some other cause, rather than the maximum rate of this flow).
When the sender performs loss recovery, it may have an
appreciable pipeACK and cwnd, but a very low FlightSize. The
standard algorithm then results in an unusually low cwnd ((1/2) *
FlightSize).

A constant-rate flow would have maintained a FlightSize
appropriate to pipeACK (or to cwnd if it is a bulk flow).

Could this be fixed by adding a new state variable? It
could also be argued that this is a corner case (e.g. loss of only
the last segments would have resulted in an RTO), but the impact
could be significant.

• There is potential interaction with TCP Control
Block Sharing (M. Welzl).

An application that is non-validated can accumulate a cwnd
that is larger than the actual capacity. Is this a fair value
to use in TCB sharing?

We propose that TCB sharing should use the pipeACK in place
of cwnd when a TCP sender is in the Non-validated phase. This
value better reflects the capacity that the flow has utilised
in the network path.

RFC-Editor note: please remove this section prior to
publication.

Draft 03 was submitted to ICCRG to receive comments and
feedback.

Draft 04 contained the first set of clarifications after
feedback:
• Changed name to application limited and used the term
rate-limited in all places.
• Added justification and many minor changes suggested on the
list.
• Added text to tie in with more accurate ECN marking.
• Added ref to Hug01.

Draft 05 contained various updates:
• New text to redefine how to measure the acknowledged pipe,
differentiating this from the FlightSize, and hence avoiding
previous issues with infrequent large bursts of data not being
validated. A key new feature is that pipeACK only triggers
leaving the NVP after the size of the pipe has been acknowledged.
This removed the need for hysteresis.
• Reduction values were changed to 1/2, following analysis of
suggestions from ICCRG. This also sets the "target" cwnd as twice
the used rate for the non-validated case.
• Introduced a symbolic name (NVP) to denote the 5 minute
period.

Draft 06 contained various updates:
• Required reset of pipeACK after congestion.
• Added comment on the effect of congestion after a short burst
(M. Allman).
• Correction of minor typos.

WG draft 00 contained various updates:
• Updated initialisation of pipeACK to maximum value.
• Added note on intended status still to be determined.

WG draft 01 contained:
• Added corrections from Richard Scheffenegger.
• Raffaello Secchi added to the mechanism, based on
implementation experience.
• Removed the requirement for the method to use the TCP SACK
option. Although it may be desirable to use SACK, this is not essential
to the algorithm.
• Added the notion of the sampling period to accommodate large
rate variations and ensure that the method is stable. This
algorithm is to be validated through implementation.

WG draft 02 contained:
• Clarified language around pipeACK variable and pipeACK sample -
Feedback from Aris Angelogiannopoulos.

WG draft 03 contained:
• Editorial corrections - Feedback from Anna Brunstrom.
• An adjustment to the procedure at the start and end of loss
recovery to align the two equations.
• Further clarification of the "undefined" value of the pipeACK
variable.

WG draft 04 contained:
• Editorial corrections.
• Introduced the "cwnd-limited" term.
• An adjustment to the procedure at the start of a cwnd-limited
phase - the new text is intended to ensure that new-cwv is not
unnecessarily more conservative than standard TCP when the flow is
cwnd-limited. This resolves two issues: first, it prevents
pathologies in which pipeACK increases slowly and erratically. It
also ensures that performance of bulk applications is not
significantly impacted when using the method.
• Clearly identifies that pacing (or equivalent) is required
during the NVP to control burstiness. New section added.

WG draft 05 contained:
• Clarification to first two bullets in describing cwnd-limited, to
explain these are really alternatives to the same case.
• Section giving implementation examples was restructured to
clarify there are two methods described.
• Cross-references to sections updated - thanks to comments from
Martin Winbjoerk and Tim Wicinski.

WG draft 06 contained:
• The section giving implementation examples was restructured to
clarify there are two methods described.
• Justification of design decisions.
• Re-organised text to improve clarity of argument.

A Practical Evaluation of Congestion Window Validation
Behaviour, 9th Annual Postgraduate Symposium in the Convergence of
Telecommunications, Networking and Broadcasting (PGNet), Liverpool,
UK.

Enhancing TCP Performance to support Variable-Rate Traffic,
2nd Capacity Sharing Workshop, ACM CoNEXT, Nice, France, 10th
December 2012.

Congestion Control without a Startup Phase, 5th International
Workshop on Protocols for Fast Long-Distance Networks (PFLDnet), Los
Angeles, California, USA.

Analysing TCP for Bursty Traffic, Int'l J. of Communications,
Network and System Sciences, 7(3).

PhD Thesis, Internet congestion control for variable rate TCP
traffic, School of Engineering, University of Aberdeen.

Issues in TCP Slow-Start Restart After Idle
(Work-in-Progress).

Notes on burst mitigation for transport protocols.