CONEX                                                         B. Briscoe
Internet-Draft                                                        BT
Intended status: Informational                                 R. Woundy
Expires: January 13, 2011                                        Comcast
                                                       T. Moncaster, Ed.
                                                           Moncaster.com
                                                          J. Leslie, Ed.
                                                                 JLC.net
                                                           July 12, 2010


                      ConEx Concepts and Use Cases
                 draft-moncaster-conex-concepts-uses-01

Abstract

   Internet Service Providers (ISPs) are facing problems where localized
   congestion prevents full utilization of the path between sender and
   receiver at today's "broadband" speeds.  ISPs desire to control this
   congestion, which often appears to be caused by a small number of
   users consuming a large amount of bandwidth.  Building out more
   capacity along all of the path to handle this congestion can be
   expensive and may not result in improvements for all users so network
   operators have sought other ways to manage congestion.  The current
   mechanisms all suffer from difficulty measuring the congestion (as
   distinguished from the total traffic).

   The ConEx Working Group is designing a mechanism to make congestion
   along any path visible at the Internet Layer.  This document
   describes example cases where this mechanism would be useful.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 13, 2011.


Briscoe, et al.         Expires January 13, 2011                [Page 1]

Internet-Draft               ConEx Mechanism                   July 2010

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions
   3.  Existing Approaches to Congestion Management
   4.  Exposing Congestion
     4.1.  ECN - a Step in the Right Direction
   5.  Requirements for ConEx
     5.1.  ConEx Issues
   6.  A Possible Congestion Exposure Mechanism
   7.  ConEx Architectural Elements
     7.1.  ConEx Monitoring
       7.1.1.  Edge Monitoring
       7.1.2.  Border Monitoring
     7.2.  ConEx Policing
       7.2.1.  Egress Policing
       7.2.2.  Ingress Policing
       7.2.3.  Border Policing
   8.  ConEx Use Cases
     8.1.  ConEx as a basis for traffic management
     8.2.  ConEx to incentivise scavenger transports
     8.3.  ConEx to mitigate DDoS
     8.4.  Accounting for Congestion Volume
     8.5.  ConEx as a form of differential QoS
     8.6.  Partial vs. Full Deployment
   9.  Other issues
     9.1.  Congestion as a Commercial Secret
     9.2.  Information Security
   10. Security Considerations
   11. IANA Considerations
   12. Acknowledgments
   13. References
     13.1.  Normative References
     13.2.  Informative References

1.  Introduction

   The growth of "always on" broadband connections, coupled with the
   steady increase in access speeds [OfCom], has caused unforeseen
   problems for network operators and users alike.  Users are
   increasingly seeing congestion at peak times and changes in usage
   patterns (with the growth of real-time streaming) simply serve to
   exacerbate this.  Operators want all their users to see a good
   service but are unable to see where congestion problems originate.
   But congestion results from sharing network capacity with others, not
   merely from using it.  In general, today's "DSL" and cable-internet
   users cannot "cause" congestion in the absence of competing traffic.
   (Wireless ISPs and cellular internet have different tradeoffs which
   we will not discuss here.)

   Congestion generally results from the interaction of traffic from an
   ISP's own subscribers with traffic from other users.  The tools
   currently available don't allow an operator to identify which traffic
   contributes most to the congestion and so they are powerless to
   properly control it.

   While building out more capacity to handle increased traffic is
   always good, the expense and lead-time can be prohibitive, especially
   for network operators that charge flat-rate fees to subscribers and
   are thus unable to charge heavier users more for causing more
   congestion [BB-incentive].  For an operator facing congestion caused
   by other operators' networks, building out its own capacity is
   unlikely to solve the congestion problem.  Operators are thus facing
   increased pressure to find effective solutions to dealing with the
   increasing bandwidth demands of all users.

   The growth of "scavenger" behaviour (e.g. [LEDBAT]) helps to reduce
   congestion, but can actually make the ISP's problem less tractable.
   These users are trying to make good use of the capacity of the path
   while minimising their own costs.  Thus, users of such services may
   show very heavy total traffic up until the moment congestion is
   detected (at the Transport Layer), but then will immediately back
   off.  ISP monitoring (at the Internet Layer) cannot detect this
   congestion avoidance if the congestion in question is in a different
   domain further along the path, and so must treat such users as
   congestion-causing users.

   The ConEx working group proposes that Internet Protocol (IP) packets
   have two "congestion" fields.  The exact protocol details of these
   fields are for another document, but we expect them to provide
   measures of "congestion so far" and "congestion still expected".

   Changes from previous drafts (to be removed by the RFC Editor):

   From -00 to -01:

      Changed end of Abstract to better reflect new title

      Created new section describing the architectural elements of ConEx
      (Section 7).  Added Edge Monitors and Border Monitors (other
      elements are Ingress, Egress and Border Policers).

      Extensive re-write of Section 8 partly in response to suggestions
      from Dirk Kutscher

      Improved layout of Section 2 and added definitions of Whole Path
      Congestion, ConEx-Enabled and ECN-Enabled.  Re-wrote definition of
      Congestion Volume.  Renamed Ingress and Egress Router to Ingress
      and Egress Node as these nodes may not actually be routers.

      Improved document structure.  Merged sections on Exposing
      Congestion and ECN.

      Added new section on ConEx requirements (Section 5) with a ConEx
      Issues subsection (Section 5.1).  Text for these came from the
      start of the old ConEx Use Cases section

      Added a sub-section on Partial vs Full Deployment (Section 8.6)

      Added a discussion on ConEx as a Business Secret (Section 9.1)

   From draft-conex-mechanism-00 to
   draft-moncaster-conex-concepts-uses-00:

      Changed filename to draft-moncaster-conex-concepts-uses.

      Changed title to ConEx Concepts and Use Cases.

      Chose uniform capitalisation of ConEx.

      Moved definition of Congestion Volume to list of definitions.

      Clarified Section 6.  Changed section title.

      Modified text relating to conex-aware policing and policers (which
      are NOT defined terms).

      Re-worded bullet on distinguishing ConEx and non-ConEx traffic in
      Section 8.

2.  Definitions

   ConEx expects to build on Explicit Congestion Notification (ECN)
   [RFC3168] where it is available.  Hence we use the term "congestion"
   in a manner consistent with ECN, namely that congestion occurs before
   any packet is dropped.  In this section we define a number of terms
   that are used throughout the document.

   Congestion:  Congestion is a measure of the probability that a given
      packet will be ECN-marked or dropped as it traverses the network.
      At any given router it is a function of the queue state at that
      router.  Congestion is added in a combinatorial manner, that is,
      routers ignore the congestion a packet has already seen when they
      decide whether to mark it or not.

   Congestion Volume:  Congestion volume is defined as the congestion a
      packet experiences, multiplied by the size of that packet.  It can
      be expressed as the volume of bytes that have been ECN-marked or
      dropped.  By extension, the Congestion Rate would be the
      transmission rate multiplied by the congestion level.

   Upstream Congestion:  The congestion that has already been
      experienced by a packet as it travels along its path.  In other
      words at any point on the path, it is the congestion between the
      source of the packet and that point.

   Downstream Congestion:  The congestion that a packet still has to
      experience on the remainder of its path.  In other words at any
      point it is the congestion still to be experienced as the packet
      travels between that point and its destination.

   Whole Path Congestion:  The total congestion that a packet
      experiences between the ingress to the network and the egress.

   Network Ingress:  The Network Ingress is the first node a packet
      traverses that is outside the source's own network.  In a domestic
      network that will be the first node downstream from the home
      access equipment.  In a business network it may be the first
      router downstream of the firewall.

   Network Egress:  The Network Egress is the last node a packet
      traverses before it enters the destination network.

   ConEx-Enabled:  Any piece of equipment (end-system, router, tunnel
      end-point, firewall, policer, etc) that fully implements the ConEx
      protocol.

   ECN-Enabled:  Any router that fully enables Explicit Congestion
      Notification (ECN) as defined in [RFC3168] and any relevant
      updates to that standard.

3.  Existing Approaches to Congestion Management

   A number of ISPs already use some form of traffic management.
   Generally this is an attempt to control the peak-time congestion
   within their network and to better apportion shared network resources
   between customers.
   Even ISPs that don't impose such traffic management (such as those in
   Germany) may have caps on the capacity they allow for Best Effort
   traffic in their backhaul.

   These attempts to control congestion have usually focused on the peak
   hours and aim to rate limit heavy users during that time.  For
   example, users who have consumed a certain amount of bandwidth during
   the last 24 hours may be selected to have their traffic shaped once
   the total traffic reaches a given level in certain nodes within the
   operator's network.

   The authors have chosen not to exhaustively list current approaches
   to congestion management.  Broadly these approaches can be divided
   into those that happen at Layer 3 of the OSI model and those that use
   information gathered from higher layers.  In general these approaches
   attempt to find a "proxy" measure for congestion.  Layer 3 approaches
   include:

   o  Volume accounting -- the overall volume of traffic a given user or
      network sends is measured.  Users may be subject to an absolute
      volume cap (e.g. 10 Gbytes per month) or the "heaviest" users may
      be sanctioned in some manner.

   o  Rate measurement -- the traffic rate per user or per network can
      be measured.  The absolute rate a given user sends at may be
      limited at peak hours or the average rate may be used as the basis
      for inter-network billing.

   Higher layer approaches include:

   o  Bottleneck rate policing -- bottleneck flow rate policers aim to
      share the available capacity at a given bottleneck between all
      concurrent users.

   o  DPI and application rate policing -- deep packet inspection and
      other techniques can be used to determine what application a given
      traffic flow is associated with.  ISPs may then use this
      information to rate-limit or otherwise sanction certain
      applications at peak hours.

   All of these current approaches suffer from some general limitations.
   First, they introduce performance uncertainty.
   Flat-rate pricing plans are popular because users appreciate the
   certainty of having their monthly bill amount remain the same for
   each billing period, allowing them to plan their costs accordingly.
   But while flat-rate pricing avoids billing uncertainty, it creates
   performance uncertainty: users cannot know whether the performance of
   their connection is being altered or degraded based on how the
   network operator manages congestion.

   Second, none of the approaches is able to make use of what may be the
   most important factor in managing congestion: the amount that a given
   endpoint contributes to congestion on the network.  This information
   simply is not available to network nodes, and neither volume nor rate
   nor application usage is an adequate proxy for congestion volume,
   because none of these metrics measures a user or network's actual
   contribution to congestion on the network.

   Finally, none of these solutions accounts for inter-network
   congestion.  Mechanisms may exist that allow an operator to identify
   and mitigate congestion in their own network, but the design of the
   Internet means that only the end-hosts have full visibility of
   congestion information along the whole path.  ConEx allows this
   information to be visible to everyone on the path and thus allows
   operators to make better-informed decisions about controlling
   traffic.

4.  Exposing Congestion

   We argue that current traffic-control mechanisms seek to control the
   wrong quantity.  What matters in the network is neither the volume of
   traffic nor the rate of traffic: it is the contribution to congestion
   over time -- congestion means that your traffic impacts other users,
   and conversely that their traffic impacts you.  So if there is no
   congestion there need not be any restriction on the amount a user can
   send; restrictions only need to apply when others are sending traffic
   such that there is congestion.

   For example, an application intending to transfer large amounts of
   data could use a congestion control mechanism like [LEDBAT] to reduce
   its transmission rate before any competing TCP flows do, by detecting
   an increase in end-to-end delay (as a measure of impending
   congestion).  However such techniques rely on voluntary, altruistic
   action by end users and their application providers.  ISPs can
   neither enforce their use nor avoid penalizing such users for
   congestion they avoid.

   The Internet was designed so that end-hosts detect and control
   congestion.  We argue that congestion needs to be visible to network
   nodes as well, not just to the end hosts.  More specifically, a
   network needs to be able to measure how much congestion any
   particular traffic expects to cause between the monitoring point in
   the network and the destination ("rest-of-path congestion").  This
   would be a new capability.  Today a network can use Explicit
   Congestion Notification (ECN) [RFC3168] to detect how much congestion
   the traffic has suffered between the source and a monitoring point,
   but not beyond.  This new capability would enable an ISP to give
   incentives for the use of LEDBAT-like applications that seek to
   minimise congestion in the network whilst restricting inappropriate
   uses of traditional TCP and UDP applications.

   So we propose a new approach which we call Congestion Exposure.  We
   propose that congestion information should be made visible at the IP
   layer, so that any network node can measure the contribution to
   congestion of an aggregate of traffic as easily as straight volume
   can be measured today.  Once the information is exposed in this way,
   it is then possible to use it to measure the true impact of any
   traffic on the network.

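   As a rough illustration of the difference between straight volume and
   contribution to congestion, the following Python sketch totals both
   quantities per user.  It relies on the Section 2 definition of
   Congestion Volume (the bytes that were ECN-marked or dropped).  The
   Packet record, its field names and the sample traffic are assumptions
   made for this example only; they are not part of any ConEx protocol
   definition.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical per-packet record seen at a monitoring point."""
    user: str        # aggregate the packet is accounted to (assumed key)
    size_bytes: int  # packet size in bytes
    congested: bool  # True if the packet was ECN-marked or dropped


def account(packets):
    """Return {user: (volume_bytes, congestion_volume_bytes)}."""
    volume = defaultdict(int)
    congestion_volume = defaultdict(int)
    for p in packets:
        volume[p.user] += p.size_bytes
        if p.congested:
            # Congestion Volume counts only the marked or dropped bytes.
            congestion_volume[p.user] += p.size_bytes
    return {u: (volume[u], congestion_volume[u]) for u in volume}


# Made-up traffic: user "a" sends ten times more than user "b",
# but almost entirely while the path is uncongested.
traffic = (
    [Packet("a", 1500, i < 1) for i in range(1000)]
    + [Packet("b", 1500, i < 10) for i in range(100)]
)
totals = account(traffic)
print(totals)
```

   With these made-up numbers, user "a" sends ten times the volume of
   user "b" yet accounts for one tenth of the congestion volume, which
   is the distinction a volume cap cannot capture.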
   In general, congestion exposure gives ISPs a principled way to hold
   their customers accountable for the impact on others of their network
   usage and reward them for choosing congestion-sensitive applications.

4.1.  ECN - a Step in the Right Direction

   Explicit Congestion Notification [RFC3168] allows routers to
   explicitly tell end-hosts that they are approaching the point of
   congestion.  ECN builds on Active Queue Management mechanisms such as
   Random Early Detection (RED) [RFC2309] by allowing the router to mark
   a packet with a Congestion Experienced (CE) codepoint, rather than
   dropping it.  The probability of a packet being marked increases with
   the length of the queue and thus the rate of CE marks is a guide to
   the level of congestion at that queue.  This CE codepoint travels
   forward through the network to the receiver which then informs the
   sender that it has seen congestion.  The sender is then required to
   respond as if it had experienced a packet loss.  Because the CE
   codepoint is visible in the IP layer, this approach reveals the
   upstream congestion level for a packet.

   Alas, this is not enough - ECN only gives downstream nodes an idea of
   the congestion so far for any flow.  This can help hold a receiver
   accountable for the congestion caused by incoming traffic.  But a
   receiver can only indirectly influence incoming congestion, by
   politely asking the sender to control it.  A receiver cannot make a
   sender install an adaptive codec, or install LEDBAT instead of TCP
   congestion-control.  And a receiver cannot cause an attacker to stop
   flooding it with traffic.

   What is needed is knowledge of the downstream congestion level, for
   which you need additional information that is still concealed from
   the network.

5.  Requirements for ConEx

   This document is intended to highlight some of the possible uses for
   a congestion exposure mechanism such as the one being proposed by the
   ConEx working group.
   The actual ConEx mechanism will be defined in another document.  In
   this section we set out some basic requirements for any ConEx
   mechanism.  We are not saying this is an exhaustive list of those
   requirements.  This list is simply to allow readers to make a
   realistic assessment of the feasibility and utility of the use cases
   set out in Section 8.  The three key requirements are:

   1.  Timeliness of information.  The limitations of current network
       design give a minimum delay of 1 round trip time (RTT) for
       congestion information to circulate the network.  It is important
       that the ConEx mechanism operates on similar timescales to ensure
       the congestion information it exposes is as up to date as
       possible.  Stale congestion information is useless since
       congestion levels can fluctuate widely over relatively short
       timescales.

   2.  Accuracy of information.  In order to be useful, congestion
       information has to be sufficiently accurate for the purposes for
       which it is to be used.  In general the main purposes are
       monitoring congestion and controlling congestion.  As a minimum,
       ConEx should equal the accuracy required for current TCP
       implementations.  A unary signal such as that provided by ECN is
       sufficient though a more precise signal may be desirable.

   3.  Visibility of information.  In order to be useful, ConEx
       information should be visible at every point in the network.  In
       today's networks that means it must be visible at the IP layer.

5.1.  ConEx Issues

   If ConEx information is to be useful, it has to be accurate (within
   the limitations of the available feedback).  This raises three issues
   that need to be addressed:

   Distinguishing ConEx traffic from non-ConEx traffic:  An ISP may
      reasonably choose to do nothing different with ConEx traffic.
      Alternatively they might want to incentivise it by giving it
      marginally better service.

   Over-declaring congestion:  ConEx relies on the sender accurately
      declaring the congestion they expect to see.  During TCP
      slow-start a sender is unable to predict the level of congestion
      they will experience and it is advisable to declare that they
      expect to see some congestion on the first packet.  However it is
      important to be cautious when over-declaring congestion lest you
      erode trust in the system.  We do not initially propose any
      mechanism to deal with this issue.

   Under-declaring congestion:  ConEx requires the sender to set the
      downstream congestion field in each packet to their best estimate
      of what they expect the whole path congestion to be.  If this
      expected congestion level is to be used for traffic management
      (see use cases) then it benefits the user to under-declare.
      Mechanisms are needed to prevent this happening.  There are three
      approaches that may work (individually or in combination):

      *  An ingress router can monitor a user's feedback to see what
         their reported congestion level actually is.

      *  If the congestion field carries the actual congestion value
         then a ConEx-Enabled Policer could potentially drop any packet
         with a downstream-congestion value of zero or less.

      *  An egress router can actively monitor some or all flows to
         check that they are complying with the requirement that the
         downstream congestion value should be zero or slightly positive
         when it reaches the egress.

6.  A Possible Congestion Exposure Mechanism

   One possible protocol is based on a concept known as re-feedback
   [Re-Feedback], and builds on existing active queue management
   techniques like RED [RFC2309] and ECN [RFC3168] that network elements
   can already use to measure and expose congestion.  The protocol is
   described in more detail in [Fairer-faster], but we summarise it
   below.

   In this protocol packets have two Congestion fields in their IP
   header:

   o  An Upstream Congestion field to record the congestion already
      experienced along the path.  Routers indicate their current
      congestion level by updating this field in every packet.  As the
      packet traverses the network it builds up a record of the overall
      congestion along its path in this field.  This data is sent back
      to the sender who uses it to determine its transmission rate.
      This can be achieved by using the existing ECN field [RFC3168].

   o  A whole-path congestion field that uses re-feedback to record the
      total congestion expected along the path.  The sender does this by
      re-inserting the current Congestion level for the path into this
      field for every packet it transmits.

   Thus at any node downstream of the sender you can see the Upstream
   Congestion for the packet and the whole path congestion (with a time
   lag of one round-trip-time (RTT)) and can calculate the Downstream
   Congestion by subtracting the Upstream from the Whole Path
   Congestion.

   So congestion exposure can be achieved by coupling congestion
   notification from routers with the re-insertion of this information
   by the sender.  This establishes information symmetry between users
   and network providers.

7.  ConEx Architectural Elements

   ConEx is a simple concept that has revolutionary implications.  It is
   that rare thing -- a truly disruptive technology, and as such it is
   hard to imagine the variety of uses it may be put to.  Before even
   thinking what it might be used for we need to address the issue of
   how it can be used.  This section describes four architectural
   elements that can be placed in the network and which utilise ConEx
   information to monitor or control traffic flows.

   In the following we are assuming the most abstract version of the
   ConEx mechanism, namely that every packet carries two congestion
   fields, one for upstream congestion and one for downstream.
   Section 6 outlines one possible approach for this.

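   The subtraction described in Section 6 (Downstream Congestion equals
   Whole Path Congestion minus Upstream Congestion) can be sketched in
   Python as follows.  This is only an illustration under stated
   assumptions: the function names are hypothetical, congestion is
   expressed as integers in parts per 10,000, and per-hop congestion is
   simply summed, which approximates the combinatorial behaviour
   described in Section 2 only when congestion levels are small.

```python
def downstream_congestion(whole_path, upstream):
    """Downstream = Whole Path - Upstream, as described in Section 6."""
    return whole_path - upstream


def view_along_path(declared_whole_path, hop_congestion):
    """Return the downstream congestion visible after each hop.

    Assumption: values are integers in parts per 10,000 and per-hop
    congestion is summed, a simplification that only holds
    approximately for small congestion levels.
    """
    upstream = 0
    view = []
    for hop in hop_congestion:
        upstream += hop  # each router adds its own congestion to the record
        view.append(downstream_congestion(declared_whole_path, upstream))
    return view


hops = [10, 5, 15]  # made-up per-hop congestion, parts per 10,000
honest = view_along_path(30, hops)          # sender declares the true total
under_declared = view_along_path(20, hops)  # sender declares too little
print(honest, under_declared)
```

   In this sketch an accurate declaration reaches zero at the last hop,
   while an under-declared value goes negative before the egress, which
   is the kind of imbalance the egress check described in Section 5.1 is
   meant to catch.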
7.1.  ConEx Monitoring

   One of the most useful things ConEx provides is the ability to
   monitor (and control) the amount of congestion entering or leaving a
   network.  With ConEx, each packet carries sufficient information to
   work out the Upstream, Downstream and Total Congestion Volume that
   packet is responsible for.  This allows the overall Congestion Volume
   to be calculated at any point in the network.  In effect this gives a
   measure of how much excess traffic has been sent that was above the
   instantaneous transmission capacity of the network.  A 1 Gbps router
   that is 0.1% congested implies that there is 1 Mbps of excess traffic
   at that point in time.

   The figure below shows 2 conceptual pieces of network equipment that
   utilise ConEx information in order to monitor the flow of congestion
   through the network.  The Border Monitor sits at the border between
   two networks, while the Edge Monitor sits at the ingress or egress to
   the Internetwork.

              ,---.              ,---.
   ,-----.   /     \  ,------.  /     \   ,------.  ,-----.
   | Src |--( Net A )-| B.M. |-( Net B )--| E.M. |--| Dst |
   '-----`   \     /  '------`  \     /   '------`  '-----`
              '---`      ^       '---`       ^
                   Border Monitor       Edge Monitor

   NB, the Edge Monitor could also be at the Src end of the network

             Figure 1: Ingress, egress and border monitors

   Note: In the tables below ECN-Enabled and ConEx-Enabled are as
   defined in Section 2.

7.1.1.  Edge Monitoring

   +------------+----------------+----------------+--------------------+
   | Network    | ECN-Enabled?   | ConEx-Enabled? | Notes              |
   | Element    |                |                |                    |
   +------------+----------------+----------------+--------------------+
   | Sender     | Yes, if ECN is | Yes, must be   | Must be receiving  |
   |            | used as basis  | sending ConEx  | congestion         |
   |            | for congestion | information    | feedback           |
   |            | signal         |                |                    |
   | Sender's   | ECN would be   | Should         | NB, it doesn't     |
   | Network    | beneficial     | understand     | have to be fully   |
   |            |                | ConEx markings | ConEx-Enabled      |
   | Core       | ECN would be   | Needn't        | ConEx markings     |
   | Network    | beneficial     | understand     | must get through   |
   |            |                | ConEx          | the network        |
   | Receiver's | ECN would be   | Should         | Doesn't have to be |
   | Network    | beneficial     | understand     | fully              |
   |            |                | ConEx markings | ConEx-Enabled      |
   | Receiver   | Only needed if | Should         | Has to feed back   |
   |            | network is     | understand     | the congestion it  |
   |            | ECN-Enabled    | ConEx          | sees (either ECN   |
   |            |                |                | or drop)           |
   +------------+----------------+----------------+--------------------+

               Table 1: Requirements for Edge Monitoring

   Edge Monitors are ideally positioned to verify the accuracy of ConEx
   markings.  If there is an imbalance between the expected congestion
   and the actual congestion then this will show up at the egress.  Edge
   Monitors can also be used by an operator to measure the service a
   given customer is receiving by monitoring how much congestion their
   traffic is causing.  This may allow them to take pre-emptive action
   if they detect any anomalies.

7.1.2.  Border Monitoring

   +------------+-----------------+-----------------+------------------+
   | Network    | ECN-Enabled?    | ConEx-Enabled?  | Notes            |
   | Element    |                 |                 |                  |
   +------------+-----------------+-----------------+------------------+
   | Sender     | Must be         | Yes, must be    | Must receive     |
   |            | ECN-enabled if  | sending ConEx   | accurate         |
   |            | any of the      | information     | congestion       |
   |            | network is      |                 | feedback         |
   | Sender's   | ECN should be   | Should          | Ideally would be |
   | Network    | enabled         | understand      | ConEx-Enabled    |
   |            |                 | ConEx markings  |                  |
   | Core       | ECN should be   | Should          | Ideally would be |
   | Network    | enabled         | understand      | ConEx-Enabled    |
   |            |                 | ConEx markings  |                  |
   | Receiver's | ECN should be   | Should          | Ideally would be |
   | Network    | enabled         | understand      | ConEx-Enabled    |
   |            |                 | ConEx markings  |                  |
   | Receiver   | Must be         | Must be ConEx   | Receiver has to  |
   |            | ECN-enabled if  | enabled         | feed back the    |
   |            | any of the      |                 | congestion it    |
   |            | network is      |                 | sees             |
   +------------+-----------------+-----------------+------------------+

              Table 2: Requirements for Border Monitoring

   At any border between 2 networks, the operator can see the total
   Congestion Volume that is being forwarded into its network by the
   neighbouring network.  A Border Monitor is able to measure the bulk
   congestion markings and establish the flow of Congestion Volume each
   way across the border.  This could be used as the basis for
   inter-network settlements.  It also provides information to target
   upgrades to where they are actually needed and might help to identify
   network problems.  Border Monitoring really needs the majority of the
   network to be ECN-Enabled in order to provide the necessary Upstream
   Congestion signal.  Clearly the greatest benefit comes when there is
   also ConEx deployment in the network.  However, as long as the sender
   is sending accurate ConEx information and the majority of the network
   is ECN-Enabled, border monitoring will work.

7.2.  ConEx Policing

   As shown above, ConEx gives an easy method of measuring Congestion
   Volume.
This information can be used as a control metric for making traffic management decisions (such as deciding which traffic to prioritise) or to identify and block sources of persistent and damaging congestion. Simple policer mechanisms, such as those Briscoe, et al. Expires January 13, 2011 [Page 15] Internet-Draft ConEx Mechanism July 2010 described in [Policing-freedom] and [re-ecn-motive], can control the overall congestion volume traversing a network. Ingress Policing typically happens at the Ingress Node, Egress Policing typically happens at the Egress Node and Border Policing can happen at any border between two networks. The current charter concentrates on use cases employing Egress Policers. ,---. ,---. +-----+ +------+ / \ +------+ / \ +------+ +-----+ | Src |--| I.P. |--( Net A )-| B.P. |-( Net B )--| E.P. |--| Dst | +-----+ +------+ \ / +------+ \ / +------+ +-----+ ^ '---` ^ '---` ^ Ingress Policer Border Policer Egress Policer Figure 2: Ingress, egress and border policers 7.2.1. Egress Policing +------------+--------------+----------------+----------------------+ | Network | ECN-Enabled? | ConEx-Enabled? | Notes | | Element | | | | +------------+--------------+----------------+----------------------+ | Sender | The sender | Must be | Must be receiving | | | should be | ConEx-Enabled | congestion feedback | | | ECN-enabled | | | | | if any of | | | | | the network | | | | | is | | | | Sender's | ECN is | ConEx is | ConEx would enable | | Network | optional but | optional | them to do Ingress | | | beneficial | | Policing (see later) | | Core | ECN is | Not needed | ConEx marks must | | Network | optional but | | survive crossing the | | | beneficial | | network | | Receiver's | ECN is | Must fully | Each receiver needs | | Network | optional but | understand | an Egress Policer | | | beneficial | ConEx | | | Receiver | Should be | Should | Must feedback the | | | ECN-enabled | understand | congestion it sees. 
|            | if any of    | ConEx          | ConEx may have a     |
|            | the network  |                | compatibility mode   |
|            | is           |                | if the receiver is   |
|            |              |                | not ConEx-Enabled    |
+------------+--------------+----------------+----------------------+

Table 3: Egress Policer Requirements

An Egress Policer allows an ISP to monitor the Congestion Volume a user's traffic has caused throughout the network, and then use this to prioritise the traffic accordingly. By itself, such a policer cannot tell how much of this congestion was caused in the ISP's own network, but it will identify which users are the "heaviest" in terms of the congestion they have caused. Assuming the ConEx information is accurate, the Egress Policer will be able to see how much congestion exists between it and the final destination (what you might call "last-mile" congestion).

There are a number of strategies that could be used to determine how traffic is treated by an Egress Policer. Obviously traffic that is not ConEx enabled needs to receive some form of "default" treatment. Traffic that is ConEx enabled may have under-declared congestion, in which case it would be reasonable to give it a low scheduling priority. Traffic that appears to be over-declaring congestion may simply be a result of especially high "last-mile" congestion, in which case the ISP may want to upgrade the access capacity, or may want to try to reduce the volume of traffic. Where the ISP knows what the "last-mile" congestion is (for instance if it is able to measure several users sharing that same capacity) then any remaining over-declared congestion might be seen as a signal that the sender wishes to prioritise this traffic.

7.2.2. Ingress Policing

+------------+--------------+----------------+----------------------+
| Network    | ECN-Enabled? | ConEx-Enabled? | Notes                |
| Element    |              |                |                      |
+------------+--------------+----------------+----------------------+
| Sender     | Should be    | Must be        | Must be receiving    |
|            | ECN-enabled  | ConEx-enabled  | congestion feedback  |
| Sender's   | ECN is       | Must           |                      |
| Network    | optional but | understand     |                      |
|            | beneficial   | ConEx          |                      |
| Core       | ECN is       | Needn't        | ConEx markings must  |
| Network    | optional but | understand     | survive crossing the |
|            | beneficial   | ConEx          | network              |
| Receiver's | ECN is       | Needn't        | ConEx markings must  |
| Network    | optional but | understand     | survive crossing the |
|            | beneficial   | ConEx          | network              |
| Receiver   | Should be    | Should be      | Must feedback the    |
|            | ECN-enabled  | ConEx-Enabled  | congestion it sees.  |
|            | if any of    |                | ConEx may have a     |
|            | the network  |                | compatibility mode   |
|            | is           |                | if the receiver is   |
|            |              |                | not ConEx-Enabled    |
+------------+--------------+----------------+----------------------+

Table 4: Ingress Policer Requirements

At the Network Ingress, an ISP can police the amount of congestion a user is causing by limiting the congestion volume they send into the network. One system that achieves this is described in [Policing-freedom]. This uses a modified token bucket to limit the congestion rate being sent rather than the overall rate. Such ingress policing is relatively simple as it requires no flow state. Furthermore, unlike many mechanisms, it treats all of a user's packets equally.

7.2.3. Border Policing

+------------+--------------+----------------+----------------------+
| Network    | ECN-Enabled? | ConEx-Enabled? | Notes                |
| Element    |              |                |                      |
+------------+--------------+----------------+----------------------+
| Sender     | ECN should   | Must be        | Must receive         |
|            | be enabled   | ConEx-enabled  | accurate congestion  |
|            |              |                | feedback             |
| Sender's   | ECN is       | Must be        |                      |
| Network    | optional but | ConEx-enabled  |                      |
|            | beneficial   |                |                      |
| Core       | ECN is       | Should be      | Must be              |
| Network    | optional but | ConEx-Enabled  | ConEx-Enabled if it  |
|            | beneficial   |                | is doing the         |
|            |              |                | policing. At a       |
|            |              |                | minimum must pass    |
|            |              |                | ConEx markings       |
|            |              |                | unaltered            |
| Receiver's | ECN is       | Should be      | At a minimum must    |
| Network    | optional but | ConEx-Enabled  | pass ConEx markings  |
|            | beneficial   |                | unaltered            |
| Receiver   | Should be    | Should be      | Must feedback the    |
|            | ECN-Enabled  | ConEx-Enabled  | congestion it sees.  |
|            | if any of    |                | ConEx may have a     |
|            | the network  |                | compatibility mode   |
|            | is           |                | if the receiver is   |
|            |              |                | not ConEx-Enabled    |
+------------+--------------+----------------+----------------------+

Table 5: Border Policer Requirements

A Border Policer will allow an operator to directly control the congestion that it allows into its network. Normally we would expect the controls to be related to some form of contractual obligation between the two parties. However, such Policing could also be used to mitigate some effects of Distributed Denial of Service (see Section 8.3). In effect a Border Policer encourages the network upstream to take responsibility for congestion it will cause downstream and could be seen as an incentive for that network to participate in ConEx (e.g. install Ingress Policers).

8. ConEx Use Cases

This section sets out some of the use cases for ConEx. These use cases rely on some of the conceptual network elements (policers and monitors) described in Section 7 above.
The authors don't claim this is an exhaustive list of use cases, nor that these have equal merit. In most cases ConEx is not the only solution to achieve these. But these use cases represent a consensus among people that have been working on this approach for some years.

8.1. ConEx as a basis for traffic management

Currently many ISPs impose some form of traffic management at peak hours. This is a simple economic necessity -- the only reason the Internet works as a commercial concern is that ISPs are able to rely on statistical multiplexing to share their expensive core network between large numbers of customers. In order to ensure all customers get some chance to access the network, the "heaviest" customers will be subjected to some form of traffic management at peak times (typically a rate cap for certain types of traffic) [Fair-use]. Often this traffic management is done with expensive flow-aware devices such as DPI boxes or flow-aware routers.

ConEx offers a better approach that will actually target the users that are causing the congestion. By using Ingress or Egress Policers, an ISP can identify which users are causing the greatest Congestion Volume throughout the network. This can then be used as the basis for traffic management decisions. The Ingress Policer described in [Policing-freedom] is one interesting approach that gives the user a congestion volume limit. So long as they stay within their limit then their traffic is unaffected. Once they exceed that limit then their traffic will be blocked temporarily.

8.2. ConEx to incentivise scavenger transports

Recent work proposes a new approach for QoS where traffic is provided with a less than best effort or "scavenger" quality of service. The idea is that low priority but high volume traffic such as OS updates, P2P file transfers and view-later TV programs should be allowed to use any spare network capacity, but should rapidly get out of the way if a higher priority or interactive application starts up.
One solution being actively explored is LEDBAT [LEDBAT], which proposes a new congestion control algorithm that is less aggressive in seeking out bandwidth than TCP.

At present most ISPs assume a strong correlation between the volume of a flow and the impact that flow causes in the network. This assumption has been eroded by the growth of interactive streaming which behaves in an inelastic manner and hence can cause high congestion at relatively low data volumes. Currently LEDBAT-like transports get no incentive from the ISP since they still transfer large volumes of data and may reach high transfer speeds if the network is uncongested. Consequently the only current incentive for LEDBAT is that it can reduce self-congestion effects.

If the ISP has deployed a ConEx-aware ingress policer then they are able to incentivise the use of LEDBAT because a user will be policed according to the overall congestion volume their traffic generates, not the rate or data volume. If all background file transfers are only generating a low level of congestion, then the sender has more "congestion budget" to "spend" on their interactive applications. It can be shown [Kelly] that this approach improves social welfare -- in other words if you limit the congestion that all users can generate then everyone benefits from a better service.

8.3. ConEx to mitigate DDoS

DDoS relies on subverting innocent end users and getting them to send flood traffic to a given destination. This is intended to cause a rapid increase in congestion in the immediate vicinity of that destination. If it fails to do this then it can't be called Denial of Service. If the ingress ISP has deployed Ingress Policers, that ISP will effectively limit how much DDoS traffic enters the 'net.
If any ISP along the path has deployed Border Monitors then they will be able to detect a sharp rise in Congestion Volume and if they have Border Policers they will be able to "turn off" this traffic. If the victim of the DDoS attack is behind an Egress Monitor then their ISP will be able to detect which traffic is causing problems. If the compromised user tries to use the 'net during the DDoS attack, they will quickly become aware that something is wrong, and their ISP can show the evidence that their computer has become zombified.

DDoS is a genuine problem and so far there is no perfect solution. ConEx does serve to raise the bar somewhat and can avoid the need for some of the more draconian measures that are currently used to control DDoS. More details of this can be found in [Malice].

8.4. Accounting for Congestion Volume

Accountability was one of the original design goals for the Internet [Design-Philosophy]. At the time it was ranked low because the network was non-commercial and it was assumed users had the best interests of the network at heart. Nowadays users generally treat the network as a commodity and the Internet has become highly commercialised. This causes problems for ISPs and others which they have tried to solve and often leads to a tragedy of the commons where users end up fighting each other for scarce peak capacity.

The most elegant solution would be to introduce an Internet-wide system of accountability where every actor in the network is held to account for the impact they have on others. If Policers are placed at every Network Ingress or Egress and Border Monitors at every border, then you have the basis for a system of congestion accounting. Simply by controlling the overall Congestion Volume each end-system or stub-network can send, you ensure everyone gets a better service.

8.5. ConEx as a form of differential QoS

Most QoS approaches require the active participation of routers to control the delay and loss characteristics for the traffic. For real-time interactive traffic it is clear that low delay (and predictable jitter) are critical, and thus these probably always need different treatment at a router. However if low loss is the issue then ConEx offers an alternative approach.

Assuming the ingress ISP has deployed a ConEx Ingress Policer, then the only control on a user's traffic is dependent on the congestion that user has caused. Likewise, if they are receiving traffic through a ConEx Egress Policer then their ISP will impose traffic controls (prioritisation, rate limiting, etc) based on the congestion they have caused. If an end-user (be they the receiver or sender) wants to prioritise some traffic over other traffic then they can allow that traffic to generate or cause more congestion. The price they will pay will be to reduce the congestion that their other traffic causes.

Streaming video content-delivery is a good candidate for such ConEx-mediated QoS. Such traffic can tolerate moderately high delays, but there are strong economic pressures to maintain a high enough data rate (as that will directly influence the Quality of Experience the end-user receives). This approach removes the need for bandwidth brokers to establish QoS sessions, by removing the need to coordinate requests from multiple sources to pre-allocate bandwidth, as well as to coordinate which allocations to revoke when bandwidth predictions turn out to be wrong. There is also no need to "rate-police" at the boundaries on a per-flow basis, removing the need to keep per-flow state (which in turn makes this approach more scalable).

8.6. Partial vs. Full Deployment

In a fully-deployed ConEx-enabled internet, [QoS-Models] shows that ISP settlements based on congestion volume can allocate money to where upgrades are needed. Fully-deployed implies that ConEx-marked packets which have not exhausted their expected congestion would go through a congested path in preference to non-ConEx packets, with money changing hands to justify that priority.

In a partial deployment, routers that ignore ConEx markings and let them pass unaltered are no problem unless they become congested and drop packets. Since ConEx incentivises the use of lower congestion transports, such congestion drops should anyway become rare events. ConEx-unaware routers that do drop ConEx-marked packets would cause a problem, so to minimise this risk ConEx should be designed such that ConEx packets will appear valid to any node they traverse. Failing that it could be possible to bypass such nodes with a tunnel.

If any network is not ConEx enabled then the sender and receiver have to rely on ECN-marking or packet drops to establish the congestion level. If the receiver isn't ConEx-enabled then there needs to be some form of compatibility mode. Even in such partial deployments the end-users and access networks will benefit from ConEx. This will create incentives for ConEx to be more widely adopted as access networks put pressure on their backhaul providers to use congestion as the basis of their interconnect agreement.

The actual charge per unit of congestion would be specified in an interconnection agreement, with economic pressure driving that charge downward to the cost to upgrade whenever alternative paths are available. That charge would most likely be invisible to the majority of users. Instead such users will have a contractual allowance to cause congestion, and would see packets dropped when that allowance is depleted.
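
The contractual allowance described above could be metered in the same spirit as the modified token bucket mentioned in Section 7.2.2, which limits congestion rather than overall rate. The following is only a minimal sketch under stated assumptions, not the mechanism of [Policing-freedom] or of any ConEx specification: the class name, the parameter names and the boolean `congestion_marked` flag (standing in for whatever marking the eventual protocol defines) are hypothetical, and dropping every packet while the allowance is empty is just one possible policy.

```python
class CongestionAllowance:
    """Token bucket that meters congestion volume, not total volume.

    Hypothetical illustration: names and policy are assumptions.
    """

    def __init__(self, fill_rate: float, depth: float) -> None:
        self.fill_rate = fill_rate  # allowance earned per second (bytes)
        self.depth = depth          # most allowance that can be saved up
        self.tokens = depth         # start with a full allowance
        self.last_seen = 0.0        # time of the previous packet (seconds)

    def admit(self, now: float, size: int, congestion_marked: bool) -> bool:
        """Return True to forward the packet, False to drop it."""
        # Refill for the time elapsed since the previous packet.
        elapsed = max(0.0, now - self.last_seen)
        self.tokens = min(self.depth, self.tokens + elapsed * self.fill_rate)
        self.last_seen = now

        # A depleted allowance means the user sees drops (assumed policy).
        if self.tokens <= 0.0:
            return False

        # Only bytes that declare congestion are charged to the allowance;
        # unmarked bytes pass without cost, whatever their volume.
        if congestion_marked:
            self.tokens = max(0.0, self.tokens - size)
        return True


allowance = CongestionAllowance(fill_rate=1000.0, depth=3000.0)
results = [
    allowance.admit(0.0, 1500, False),  # unmarked: costs nothing
    allowance.admit(0.0, 1500, True),   # marked: charged 1500
    allowance.admit(0.0, 1500, True),   # marked: allowance now empty
    allowance.admit(0.0, 1500, False),  # dropped until the bucket refills
    allowance.admit(1.0, 1500, False),  # one second later: forwarded again
]
```

Because the bucket is charged only for marked bytes, a user whose transports cause little congestion can move large volumes without ever touching the allowance, which is the incentive this section relies on.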

Once an Autonomous System (AS) agrees to pay any congestion charges to any other AS it forwards to, it has an economic incentive to increase congestion-so-far marking for any congestion within its network. Failure to do this quickly becomes a significant cost, giving it an incentive to turn on such marking.

End users (or the writers of the applications they use) will be given an incentive to use a congestion control that backs off more aggressively than TCP for any elastic traffic. Indeed they will actually have an incentive to use fully weighted congestion controls that allow traffic to cause congestion in proportion to its priority. Traffic which backs off more aggressively than TCP will see congestion charges remain the same (or even drop) as congestion increases; traffic which backs off less aggressively will see charges rise, but the user may be prepared to accept this if it is high-priority traffic; traffic which backs off not at all will see charges rise dramatically.

9. Other issues

9.1. Congestion as a Commercial Secret

Network operators have long viewed the congestion levels in their network as a business secret. In some ways this harks back to the days of fixed-line telecommunications where congestion manifested as failed connections or dropped calls. But even in modern data-centric packet networks congestion is viewed as a secret not to be shared with competitors. It can be debated whether this view is sensible, but it may make operators uneasy about deploying ConEx. The following two examples highlight some of the arguments used:

o  An ISP buys backhaul capacity from an operator. Most ISPs want their customers to get a decent service and so they want the backhaul to be relatively uncongested. If there is competition, operators will seek to reassure their customers (the ISPs) that their network is not congested in order to attract their custom.
   Some operators may see ConEx as a threat since it will enable those ISPs to see the actual congestion in their network. On the other hand, operators with low congestion could use ConEx to show how well their network performs, and so might have an incentive to enable it.

o  ISPs would like to be part of the lucrative content provision market. Currently the ISP can gain a competitive edge as it can put its own content in a higher QoS class, whereas traffic from content providers has to use the Best Effort class. The ISP may take the view that if they can conceal the congestion level in their Best Effort class this will make it harder for the content provider to maintain a good level of QoS. But in reality the Content Provider will just use the feedback mechanisms in streaming protocols such as Adobe Flash to monitor the congestion.

Of course some might say that the idea of keeping congestion secret is silly. After all, end-hosts already have knowledge of the congestion throughout the network, albeit only along specific paths, and ISPs can work out that there is persistent congestion as their customers will be suffering degraded network performance.

9.2. Information Security

make a source believe it has seen more congestion than it has

hijack a user's identity and make it appear they are dishonest at an egress policer

clear or otherwise tamper with the ConEx markings

...

{ToDo} Write these up properly...

10. Security Considerations

This document proposes a mechanism tagging onto Explicit Congestion Notification [RFC3168], and inherits the security issues listed therein. The additional issues from ConEx markings relate to the degree of trust each forwarding point places in the ConEx markings it receives, which is a business decision mostly orthogonal to the markings themselves.

One expected use of exposed congestion information is to hold the end-to-end transport and the network accountable to each other. The network cannot be relied on to report information to the receiver against its interest, and the same applies for the information the receiver feeds back to the sender, and that the sender reports back to the network. Looking at each in turn:

The Network
   In general it is not in any network's interest to under-declare congestion since this will have potentially negative consequences for all users of that network. It may be in its interest to over-declare congestion if, for instance, it wishes to force traffic to move away to a different network or simply to reduce the amount of traffic it is carrying. Congestion Exposure itself won't significantly alter the incentives for and against honest declaration of congestion by a network, but we can imagine applications of Congestion Exposure that will change these incentives.

   There is a perception among network operators that their level of congestion is a business secret. Today, congestion is one of the worst-kept secrets a network has, because end-hosts can see congestion better than network operators can. Congestion Exposure will enable network operators to pinpoint whether congestion is on one side or the other of any border. It is conceivable that forwarders with underprovisioned networks may try to obstruct deployment of Congestion Exposure.

The Receiver
   Receivers generally have an incentive to under-declare congestion since they generally wish to receive the data from the sender as rapidly as possible. [Savage] explains how a receiver can significantly improve their throughput by failing to declare congestion. This is a problem with or without Congestion Exposure. [KGao] explains one possible technique to encourage receivers to be honest in their declaration of congestion.

The Sender
   One proposed mechanism for Congestion Exposure deployment adds a requirement for a sender to advise the network how much congestion it has suffered or caused. Although most senders currently respond to congestion they are informed of, one use of exposed congestion information might be to encourage sources of persistent congestion to back off more aggressively. In that case there may clearly be an incentive for the sender to under-declare congestion. This will be a particular problem with sources of flooding attacks. "Policing" mechanisms have been proposed to deal with this.

   In addition there are potential problems from source spoofing. A malicious sender can pretend to be another user by spoofing the source address. Congestion Exposure allows for "Policers" and "Traffic Shapers" so as to be robust against injection of false congestion information into the forward path.

11. IANA Considerations

This document does not require actions by IANA.

12. Acknowledgments

Bob Briscoe is partly funded by Trilogy, a research project (ICT-216372) supported by the European Community under its Seventh Framework Programme. The views expressed here are those of the author only.

The authors would like to thank Contributing Authors Bernard Aboba, Joao Taveira Araujo, Louise Burness, Alissa Cooper, Philip Eardley, Michael Menth, and Hannes Tschofenig for their inputs to this document. Useful feedback was also provided by Dirk Kutscher.

13. References

13.1. Normative References

[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.

13.2. Informative References

[BB-incentive] MIT Communications Futures Program (CFP) and Cambridge University Communications Research Network, "The Broadband Incentive Problem", September 2005.

[Design-Philosophy] Clark, D., "The Design Philosophy of the DARPA Internet Protocols", 1988.

[Fair-use] Broadband Choices, "Truth about 'fair usage' broadband", 2009.

[Fairer-faster] Briscoe, B., "A Fairer Faster Internet Protocol", IEEE Spectrum Dec 2008 pp38-43, December 2008.

[I-D.briscoe-tsvwg-re-ecn-tcp-motivation] Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith, "Re-ECN: A Framework for adding Congestion Accountability to TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-motivation-01 (work in progress), September 2009.

[KGao] Gao, K. and C. Wang, "Incrementally Deployable Prevention to TCP Attack with Misbehaving Receivers", December 2004.

[Kelly] Kelly, F., Maulloo, A., and D. Tan, "Rate control for communication networks: shadow prices, proportional fairness and stability", Journal of the Operational Research Society 49(3) 237--252, 1998.

[LEDBAT] Shalunov, S., "Low Extra Delay Background Transport (LEDBAT)", draft-ietf-ledbat-congestion-01 (work in progress), March 2010.

[Malice] Briscoe, B., "Using Self Interest to Prevent Malice; Fixing the Denial of Service Flaw of the Internet", WESII - Workshop on the Economics of Securing the Information Infrastructure 2006, 2006.

[OfCom] Ofcom: Office of Communications, "UK Broadband Speeds 2008: Research report", January 2009.

[Policing-freedom] Briscoe, B., Jacquet, A., and T. Moncaster, "Policing Freedom to Use the Internet Resource Pool", RE-Arch 2008 hosted at the 2008 CoNEXT conference, December 2008.

[QoS-Models] Briscoe, B. and S. Rudkin, "Commercial Models for IP Quality of Service Interconnect", BTTJ Special Edition on IP Quality of Service vol 23 (2), April 2005.

[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, April 1998.

[Re-Feedback] Briscoe, B., Jacquet, A., Di Cairano-Gilfedder, C., Salvatori, A., Soppera, A., and M. Koyabe, "Policing Congestion Response in an Internetwork Using Re-Feedback", ACM SIGCOMM CCR 35(4)277--288, August 2005.

[Savage] Savage, S., Wetherall, D., and T. Anderson, "TCP Congestion Control with a Misbehaving Receiver", ACM SIGCOMM Computer Communication Review, 1999.

[re-ecn-motive] Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith, "Re-ECN: A Framework for adding Congestion Accountability to TCP/IP", draft-briscoe-tsvwg-re-ecn-tcp-motivation-01 (work in progress), September 2009.

Authors' Addresses

Bob Briscoe
BT
B54/77, Adastral Park
Martlesham Heath
Ipswich IP5 3RE
UK

Phone: +44 1473 645196
EMail: bob.briscoe@bt.com
URI: http://bobbriscoe.net/

Richard Woundy
Comcast
Comcast Cable Communications
27 Industrial Avenue
Chelmsford, MA 01824
US

EMail: richard_woundy@cable.comcast.com
URI: http://www.comcast.com

Toby Moncaster (editor)
Moncaster.com
Dukes
Layer Marney
Colchester CO5 9UZ
UK

EMail: toby@moncaster.com

John Leslie (editor)
JLC.net
10 Souhegan Street
Milford, NH 03055
US

EMail: john@jlc.net