Internet-Draft | CC Guidelines | October 2023 |
Fairhurst & Welzl | Expires 25 April 2024 | [Page] |
When published as an RFC, this document provides guidance on the design of methods to avoid congestion collapse and on how an endpoint needs to react to congestion. Based on these, and on Internet engineering experience, the document provides best current practice for the design of new congestion control methods in Internet protocols.¶
When published, the document will update or replace the Best Current Practice in BCP 41, which currently includes "Congestion Control Principles" provided in RFC2914.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 25 April 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This document has two purposes. It first identifies changes in practice and network design that have occurred since the publication of IETF BCPs on the topic of congestion control, and identifies current issues in congestion control. Second, it updates the guidance on the use of Congestion Control (CC) mechanisms. It also provides background information for the design of new mechanisms. A related document provides guidance on the evaluation of these new methods.¶
The IETF has specified a set of Internet transports (e.g., TCP [RFC9293], UDP [RFC0768], UDP-Lite [RFC3828], SCTP [RFC4960], and DCCP [RFC4340]) as well as protocols layered on top of these transports (e.g., RTP [RFC3550], QUIC [RFC9000] [RFC9002], SCTP/UDP [RFC6951], DCCP/UDP [RFC6773]) and transports that work directly over the IP network layer. These transports are implemented in endpoints (either Internet hosts or routers acting as endpoints), and can be designed to detect and react to network congestion. TCP was the first transport to provide this, although the specifications found in RFC 793 [RFC793] predate the inclusion of CC and did not contain any discussion of using or managing a congestion window (cwnd). RFC 9293 [RFC9293] has addressed this.¶
Section 3 of [RFC2914] states "The equitable sharing of bandwidth among flows depends on the fact that all flows are running compatible congestion control algorithms". Internet transports therefore need to react to avoid congestion that could impact other flows sharing a path. The Requirements for Internet Hosts [RFC1122] formally mandates that endpoints perform CC. "Because congestion control is critical to the stable operation of the Internet, applications and other protocols that choose to use UDP as an Internet transport must employ mechanisms to prevent congestion collapse and to establish some degree of fairness with concurrent flows" [RFC8085].¶
The popularity of the Internet has led to the deployment of many implementations: some use standard CC mechanisms, while others have adopted approaches that differ from present standards. Guidance is needed to ensure safe evolution of the CC methods used by transport protocols.¶
There are several reasons to think that things have changed since the original best current practice was published: At one time, it was common that the serialisation delay of a packet at the bottleneck formed a large proportion of the round trip time (RTT) of a path, motivating a need for conservative loss recovery. This is not often the case for today's higher capacity links. The increase in link speed often means that for many users, current traffic does not normally experience persistent congestion, and under-load (inability to achieve the bottleneck rate) is often as common as over-load (exceeding the bottleneck rate). That is, a current challenge is that conservative methods lead to under-utilisation of the path, and safe, scalable methods need to be found.¶
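The trend described above can be illustrated with a simple calculation (the link rates and packet size below are illustrative examples, not figures from this document):¶

```python
# Serialisation delay of a full-size 1500-byte packet at different link
# rates. The rates chosen are illustrative: at a few Mb/s the per-packet
# delay is a noticeable fraction of a typical RTT; at Gb/s it is negligible.

PACKET_BITS = 1500 * 8  # bits in a full-size Ethernet payload

for name, rate_bps in [("2 Mb/s", 2e6), ("100 Mb/s", 100e6), ("1 Gb/s", 1e9)]:
    delay_ms = PACKET_BITS / rate_bps * 1000
    print(f"{name}: {delay_ms:.3f} ms per packet")
```

At 2 Mb/s the serialisation delay is 6 ms per packet, a significant share of a 50 ms RTT; at 1 Gb/s it falls to 0.012 ms, which is why serialisation delay no longer dominates the RTT on many paths.¶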
There also have been changes in the way that protocol mechanisms are deployed in Internet endpoints:¶
On the one hand, techniques have evolved that allow incremental deployment and testing of new methods, which can enable the rapid development of methods to detect and react to congestion. This allows new mechanisms to be tested to ensure the majority of users see benefit in the networks they use. There has been considerable progress in developing new loss recovery and congestion responses that have been evaluated in this way.¶
On the other hand, the Internet continues to be heterogeneous: some endpoints experience very different network path characteristics and some endpoints generate very different patterns of traffic. There is still a need to avoid harm to other flows (starvation of capacity, unnecessary increase of latency, congestion collapse).¶
This document focuses on unicast point-to-point transports; this includes migration from using one path to another path. Some recommendations [RFC5783] and requirements will apply to point-to-multipoint transports (e.g., multicast); however, this is beyond the current document's scope. [RFC2914] provides additional guidance on the use of multicast.¶
Finally, experience has shown that successful protocols developed in a specific context, or for a particular application tend to also become used in a wider range of contexts. Therefore, IETF specifications ought to target deployment on the general Internet, or be specified for use only within a controlled environment.¶
Internet paths experience congestion (loss or delay) when there is excess load at a bottleneck that they traverse. This document differentiates two levels of congestion:¶
Flows need to react when they encounter either form of congestion to reduce their contribution to the load. For persistent congestion, the reaction needs to be sufficient to avoid excessive harm to other flows.¶
Incipient congestion results during normal operation of the Internet. Buffering (which causes an increase in latency) or congestion loss (discard of a packet) arises when the traffic arriving at a bottleneck exceeds the resources available. A network device will drop excess packets when its queue(s) becomes full. This can be managed using Active Queue Management (AQM) [RFC7567], which can be combined with Explicit Congestion Notification (ECN) signalling [RFC3168] to mitigate incipient congestion [RFC8087].¶
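The behaviours described above can be sketched as follows. This is a deliberately simplified illustration, not a standardised AQM algorithm; the class and parameter names are invented for this sketch:¶

```python
# Simplified sketch of a bottleneck queue: tail drop when the buffer is
# full, with an AQM-style threshold that signals incipient congestion
# earlier, by ECN-marking (CE) packets from ECN-capable flows rather than
# waiting for the buffer to overflow. Illustrative only.

class Queue:
    def __init__(self, capacity, mark_threshold):
        self.capacity = capacity              # packets the buffer can hold
        self.mark_threshold = mark_threshold  # AQM acts before the queue fills
        self.backlog = 0

    def enqueue(self, ecn_capable):
        if self.backlog >= self.capacity:
            return "drop"                     # tail drop: buffer is full
        self.backlog += 1
        if self.backlog > self.mark_threshold:
            # Signal incipient congestion early: mark ECN-capable packets,
            # or drop early for flows that cannot receive an ECN signal.
            return "mark" if ecn_capable else "early-drop"
        return "queue"
```

The point of the sketch is the ordering: an AQM reacts to a growing backlog before the buffer is exhausted, so an ECN-capable sender can reduce its rate without experiencing loss.¶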
Buffers can be divided into pools and traffic can be associated with a specific pool (e.g., using local configuration, or coordinated using the Differentiated Services [RFC2475] architecture). A scheduler [RFC7806] can isolate the queuing of packets for different flows, or aggregates of flows, and reduce the impact of flow multiplexing on other flows (e.g., flow scheduling [RFC7567]). This could equally distribute resources between sharing flows, but this equality is explicitly not a requirement [Flow-Rate-Fairness].¶
Even when a path is expected to support such methods, an endpoint MUST NOT rely on the presence and correct configuration of these methods, and therefore needs to employ CC methods that work end-to-end, or employ in-network control, such as a circuit-breaker.¶
In some controlled environments, Internet transports can use mechanisms to reserve capacity. Most Internet paths do not support this. In the absence of such a reservation, endpoints are unable to determine a safe rate at which to start a new transmission. The use of an Internet path therefore requires end-to-end CC mechanisms to detect and respond to congestion.¶
Section 3.3 of [RFC2914] notes that a flow can use CC to "optimize its own performance regarding throughput, delay, and loss. In some circumstances, for example in environments with high statistical multiplexing, the delay and loss rate experienced by a flow are largely independent of its own sending rate." and continues: "in environments with lower levels of statistical multiplexing or with per-flow scheduling, the delay and loss rate experienced by a flow is in part a function of the flow's own sending rate. Thus, a flow can use end-to-end congestion control to limit the delay or loss experienced by its own packets."¶
Early RFCs recognised that a poorly designed transport can lead to significant congestion, which could result in severe service degradation or "Internet meltdown". One effect is called "Congestion Collapse", where an increase in the network load results in a decrease in the useful work done by the network [RFC0896] [RFC0970] [RFC2914]. This was first observed in the mid-1980s. At that time, this was aggravated by connections that did not use CC and which unnecessarily retransmitted packets that were either in transit or had already been received, resulting in stable, persistent congestion [RFC0896].¶
[RFC2914] also notes that it is even more destructive when applications increase their sending rate in response to an increase in the packet loss rate (e.g., automatically using an increased level of FEC (Forward Error Correction)).¶
The problems of congestion collapse have generally been corrected by improvements to the loss recovery and congestion control mechanisms in transport protocols [Jac88], designed to avoid starving other flows of capacity (e.g., [RFC7567]). Section 3.1 describes preventing congestion collapse. [RFC2309] adds that "all UDP-based streaming applications should incorporate effective congestion avoidance mechanisms." [RFC7567] and [RFC8085] both reaffirm the continued need to provide methods to prevent starvation.¶
CC is an evolving subject, responding to changes in protocol design, operation of applications using the network and understanding of the network operation under load. The IETF has provided guidance [RFC5033] for considering and evaluating alternate CC algorithms.¶
The IRTF has described a set of metrics and related trade-off between metrics to compare, contrast, and evaluate CC algorithms [RFC5166]. [RFC5783] provided a snapshot of CC research in 2008. [RFC6077] discussed open issues in CC research in 2011.¶
In contrast to considering the fairness in distributing capacity between flows, a different approach is to analyse persistent congestion effects to understand the harm to other flows (collateral impact of loss, starvation, collapse, etc.). Such an analysis of the suitability of a new mechanism can evaluate how changes impact other flows sharing a bottleneck, and consider the impact on the flows that have outliers in performance (e.g., the last 5%, 1%). For example, aggregate performance often does not provide an indication that a new method could starve other applications that share the bottleneck, or that patterns of packets (e.g., bursts) are sent that disrupt the packet timing needed by another application flow.¶
Recommendations and requirements on CC control are distributed across many documents in the RFC series. This section gathers and consolidates these recommendations. These, and Internet engineering experience are used to derive the best current practice in the design of Internet CC methods.¶
Standardization of new CC algorithms can avoid an "arms race" among competing protocols [RFC2914]. That is, avoid competition for Internet resources in a way that significantly reduces the ability of other flows to use the Internet.¶
The general recommendation in the UDP Guidelines [RFC8085] is that applications SHOULD leverage existing CC techniques, such as those defined for TCP [RFC9293], TCP-Friendly Rate Control (TFRC) [RFC5348], SCTP [RFC4960], and other IETF-defined transports. This is because there are many trade-offs and details that can have a serious impact on the performance of a CC mechanism and upon other traffic that seeks to share a bottleneck.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].¶
The path between endpoints (sometimes called "Internet Hosts" for IPv4 and called "source nodes" and "destination nodes" in IPv6) consists of the endpoint protocol stack at the sender and the receiver (which together implement the transport service), and a succession of links and network devices (routers or middleboxes) forming the network path. The set of network devices forming the path is not usually fixed, and it should generally be assumed that this set can change over arbitrary lengths of time.¶
[RFC5783] defines CC as "the feedback-based adjustment of the rate at which data is sent into the network. Congestion control is an indispensable set of principles and mechanisms for maintaining the stability of the Internet."¶
The document draws on language used in the specifications of TCP and other IETF transports. For example, a protocol timer is generally needed to detect persistent congestion, and this document uses the term Retransmission Timeout (RTO) to refer to the operation of this timer. Similarly, it refers to a congestion window (cwnd) as a variable that controls the rate of transmission by the CC. Each new transport needs to make its own design decisions about how to meet the recommendations and requirements for CC. The use of these terms does not imply that endpoints need to implement functions in the current way used by TCP.¶
Other terminology is directly copied from the cited RFCs.¶
This includes:¶
Internet transports need to use a CC method designed for Internet paths.¶
An endpoint needs to be protected from attacks on the traffic it generates, or attacks that seek to increase the capacity that it consumes (impacting other traffic that share a bottleneck).¶
The following guidance is provided on protection:¶
This section summarises the principles for providing CC. It describes principles associated with preventing persistent congestion, reacting to incipient congestion and utilising additional path information.¶
A sender needs to regulate the maximum volume of data in flight over the interval of the current RTT (the cwnd). It needs to react to incipient congestion.¶
When the CC is increasing the cwnd, it transmits faster than the last confirmed safe rate. Such an increase needs to be regarded as tentative and a sender needs to reduce its rate below the last confirmed safe rate when congestion is detected.¶
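The principle above can be sketched as a minimal additive-increase/multiplicative-decrease (AIMD) response, the pattern used by standard TCP congestion avoidance. The constants here are illustrative, not drawn from any specific RFC:¶

```python
# Minimal AIMD sketch of the principle above: an increase of the cwnd is
# tentative probing beyond the last confirmed safe rate, and detected
# congestion triggers a multiplicative reduction below that rate.

MSS = 1  # count the cwnd in segments, for simplicity

def on_ack(cwnd):
    """Additive increase: tentatively probe for more capacity."""
    return cwnd + MSS

def on_congestion(cwnd):
    """Multiplicative decrease: fall below the last rate known to be safe."""
    return max(MSS, cwnd // 2)
```

The asymmetry is deliberate: the increase is gradual because it is unconfirmed, while the decrease is sharp because congestion has been positively detected.¶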
An endpoint can utilise timers to implement transport mechanisms, e.g., to recover from loss, to trigger pre-emptive retransmission and other protocol functions. An endpoint that does utilise timers needs to follow the rules in section 3.3 of [RFC8085].¶
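A core element of the timer rules in Section 3.3 of [RFC8085] is exponential backoff of a retransmission timer when no feedback is received. A minimal sketch, with illustrative initial and maximum values:¶

```python
# Sketch of exponential backoff for a retransmission timer, in the spirit
# of Section 3.3 of RFC 8085. INITIAL_RTO and MAX_RTO are illustrative.

INITIAL_RTO = 1.0   # seconds; starting timeout value
MAX_RTO = 60.0      # upper bound on the backed-off timeout

def next_rto(rto):
    """Double the timeout after each expiry without feedback, up to a cap."""
    return min(rto * 2, MAX_RTO)
```

Doubling the timer after each unanswered retransmission ensures that, in the absence of any feedback, an endpoint rapidly reduces the load it places on a possibly congested path.¶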
Principles include:¶
This section describes the principles related to mitigation of incipient congestion (see Section 1.2).¶
When a connection to a new destination is first established, the endpoints have little information about the characteristics of the network path they will use. The safety and responsiveness of new CC proposals needs to be evaluated [RFC5166].¶
This section describes mechanisms to detect loss and provide retransmission, and to protect the network in the absence of timely feedback.¶
In determining an appropriate congestion response to incipient congestion, designs could consider the size of the packets that experience congestion [RFC4828].¶
Path information can be cached. In TCP, this was previously called TCP Control Block (TCB) sharing, and is now called TCP Control Block Interdependence, [RFC9040]. A CC can also utilise signals from the network to help determine how to regulate the traffic it sends.¶
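The caching described above can be sketched as follows. The structure and field names are invented for illustration and are not taken from [RFC9040] or any implementation:¶

```python
# Sketch of caching path information across connections, in the spirit of
# TCP Control Block Interdependence (RFC 9040): a new connection to a
# previously used destination is seeded from cached state instead of
# starting from conservative defaults. Names here are hypothetical.

path_cache = {}  # keyed by destination

def remember(dest, srtt, cwnd_estimate):
    """Record path state observed by a completed or ongoing connection."""
    path_cache[dest] = {"srtt": srtt, "cwnd": cwnd_estimate}

def initial_parameters(dest, default_srtt=1.0, default_cwnd=10):
    """Seed a new connection from cached state when available."""
    cached = path_cache.get(dest)
    if cached:
        return cached["srtt"], cached["cwnd"]
    return default_srtt, default_cwnd
```

A real implementation also needs a policy for ageing out stale entries, since the path to a destination can change over time.¶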
All endpoints are required to implement mechanisms that avoid persistent congestion and can demonstrate that they do not induce starvation and congestion collapse (see Section 1.3).¶
Principles include:¶
Principles include:¶
Many designs place the responsibility of rate-adaption for CC at the sender (source) endpoint, utilising feedback information provided by the remote endpoint (receiver). CC can also be implemented by determining an appropriate rate limit at a receiver and using this limit to control the maximum transport rate (e.g., using methods such as [RFC5348] and [RFC4828]).¶
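A receiver-based scheme of the kind cited above can compute an allowed rate from the TFRC throughput equation in Section 3.1 of [RFC5348]. The sketch below uses b=1 and t_RTO = 4*R, as that specification suggests for common cases:¶

```python
import math

# The TFRC throughput equation (RFC 5348, Section 3.1): the allowed
# sending rate as a function of segment size s (bytes), round-trip time
# R (seconds), and loss event rate p, with b acknowledged packets per ACK.

def tfrc_rate(s, R, p, b=1):
    """Allowed transmit rate in bytes per second."""
    t_RTO = 4 * R  # simplification recommended by RFC 5348
    denom = (R * math.sqrt(2 * b * p / 3)
             + t_RTO * (3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denom
```

Because the denominator grows with both the loss event rate and the RTT, the computed rate falls as the path becomes more congested, approximating the long-term throughput of a conforming TCP flow under the same conditions.¶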
Applications at an endpoint can send more than one flow. "The specific issue of a browser opening multiple connections to the same destination has been addressed by [RFC2616]. Section 8.1.4 states that 'Clients that use persistent connections SHOULD limit the number of simultaneous connections that they maintain to a given server. A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.'" [RFC9040].¶
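The per-server limit quoted above can be sketched as a simple admission check. The limit of 2 is the figure from [RFC2616]; the helper functions themselves are invented for illustration:¶

```python
# Illustrative sketch of the per-server connection limit quoted from
# RFC 2616 Section 8.1.4. The function names are hypothetical.

MAX_PER_SERVER = 2
open_connections = {}  # server -> number of currently open connections

def may_open(server):
    """Check the cap before opening another connection to this server."""
    return open_connections.get(server, 0) < MAX_PER_SERVER

def opened(server):
    open_connections[server] = open_connections.get(server, 0) + 1
```

Limiting parallel connections matters for CC because each flow maintains its own cwnd, so many parallel flows can collectively consume far more than a single flow's share of a bottleneck.¶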
This document owes much to the insight offered by Sally Floyd, both at the time of writing of RFC2914 and her help and review in the many years that followed this.¶
Nicholas Kuhn helped develop the first draft of these guidelines. Tom Jones and Ana Custura reviewed the first version of this draft. Many discussions with Michael Welzl and others have provided immeasurable help to get this far. The University of Aberdeen received funding to support this work from the European Space Agency.¶
This memo includes no request to IANA.¶
RFC Editor Note: If there are no requirements for IANA, the section will be removed during conversion into an RFC by the RFC Editor.¶
This document introduces no new security considerations. Each RFC listed in this document discusses the security considerations of the specification it contains. The security considerations for the use of transports are provided in the references section of the cited RFCs. Security guidance for applications using UDP is provided in the UDP Usage Guidelines [RFC8085].¶
Section 3.3 describes general requirements relating to the design of safe protocols and their protection from on and off path attack.¶
Section 4.3.4 follows current best practice to validate ICMP messages prior to use.¶
Note to RFC-Editor: please remove this entire section prior to publication.¶
Previous versions of the document were presented and discussed in TSVWG, and the document evolved through several revisions. This version refocuses the document towards the newly formed CC Working Group, where it is offered as a candidate for progression.¶
Individual draft -00:¶