Congestion Exposure (CONEX) BoF
===============================

Tuesday, November 10 from 1520-1810

NOTES by Steven Blake, Mat Ford

CHAIRS: Leslie Daigle
        Philip Eardley

Meeting materials: https://datatracker.ietf.org/meeting/76/materials.html

AGENDA
------

Administrivia [ 5 mins]
- no objections to the agenda

Introduction by chairs [ 5 mins]

Background

The problem [50 mins]

== End-user perspective [Murari Sridharan]

- OS bottlenecks removed over last 6-7 years
- congestion control has become so sophisticated that it can now fill up any residual capacity
- people complain about lack of resources now that pipes regularly get filled

to end-host, network is largely a black box
- characteristics are inferred
- imposes complex usage requirements - volume caps
- example of end user hit with heavy pay-per-use bill for downloading Windows Update, despite the fact that Windows Update runs over a scavenger service.

network view
- end hosts can't be trusted, application performance needs to be inferred, end-hosts typically establish connectivity to well-known servers, but reality is different.

innovation is all in layer-violation - this isn't sustainable!

congestion control purity is no longer reality - TCP tunnelled inside TCP isn't a route to deterministic results!

- no comments at the mic

== ISP Context/motivation [Rich Woundy]

Enable customers to control their own network experience - take ISP out of the loop of being 'application police'

> Linda Dunbar: you discuss making the heavy user lower priority; what does this have to do with network congestion?
> Rich W: Comcast mechanism is not a conex proposal. Our mechanism does not see end2end. It only solves congestion problem at the edge, not for any other part of our network or for third parties. Comcast solution is monitoring capacity and resource use in the network. Measuring end-to-end congestion requires another mechanism.
> Linda Dunbar: congestion is very transient. End user has no control over the network's behavior.
> Leslie: wait until after the other presentations
> Bob Briscoe: mention LEDBAT
> Leslie: non-IETF ISOC meeting today on bandwidth; meeting materials (including audio) at ISOC website

== Technical problem [Mark Handley]

Whenever Mark presents about congestion people say 'we're overprovisioned, don't care' - he doesn't subscribe to that - will talk about why we need to care about congestion

TCP isn't broken, but we can do better

used to be receive-window limited in end systems - no longer the case in popular modern TCP stacks

Goal should be for congested networks, at least at the bottleneck. That doesn't mean persistently congested. But a transport that can't congest a bottleneck is broken.

Packet loss isn't necessarily a problem - for apps where it is a problem we have ECN

> Greg Lebovitz: think about next-gen apps, and where they might fit on the latency/size graph (e.g., Internet-of-things). Existing apps were built a certain way because the network works the way it does.
> Mark H: some things we haven't put on the net because they don't work with today's Internet.
> Geoff Huston: making a good use of the network in either case (re: 'latency, latency, latency' slide).
> Mark H: better from a user's perspective

Discussion after Mark's summary slide:

> Aaron Falk: ISPs aren't charging by the byte
> Jana Iyengar: works at residential education institution - students tend to do a lot of 'stuff' that needs to be separated from academic work - DPI is used for that, maybe in other enterprises too
> Ben Niven-Jenkins: if the prime driver for ISPs to use DPI is to control costs, don't see how it is relevant in the context of controlling congestion
> Mark H: because they need to control costs, because they have congestion - need to maintain good service for higher-paying customers; trying to provide acceptable service to non-P2P apps.
> Ben N-J: DPI being used to prevent going over usage caps
> Mat Ford: That doesn't explain why users on relatively low service tiers have their throughput throttled regardless of whether or not they have exceeded any volume threshold
> Ben N-J: economic issue; how does technology solve it? Don't know how the two are linked.
> Vijay Gill: agree with Mat; if an ISP has chronic congestion, why not keep raising the price until users fall off? Comcast has two tiers of service (residential, business). Residential has 250 GB cap; business service has no cap. Most folks I know go for the business service ($100/month).
> Bob Briscoe: try to answer Ben's question - he was talking about retail ISPs as customers. Wholesaler sells bit rates to the retail ISPs.
> Ben N-J: it is not us causing the bottleneck
> Leslie: not here to talk about how to make things work with DPI; let's move on.
> Stanislav Shalunov: what can you convey with conex that you can't convey with simple drops? Latency ...
> Mark H: only talking about the problem space
> Stanislav S: current loss rates (< 1%) are working fine.
> Mark H: loss rates are very variable
> Aaron Falk: re: IETF Goals - we shouldn't be talking about economics. Disagree that the only cost is congestion cost (overstatement). Infrastructure has cost. Question is incremental cost. Economic expertise would be useful in this discussion.
> Mark H: yes, underutilised network gives you ability to communicate - additional traffic doesn't cost anything more, until you experience congestion

Towards a solution [20 mins]

== Overview of re-ECN [Bob Briscoe]

> Greg L: (re: slide 4: measuring contribution to congestion of scavenger vs bursty user) earlier we saw that sometimes there is a long-duration connection and something very short has to wait; in that case, one of them will show that the network is very bad, and the other that the network is working very well.
> Bob B: small transfer can go fast, because it won't build up a big congestion volume; incentive for large transfer to back off because it could build up large congestion volume.
> Greg L: how do they know about each other?
> Bob B: because of the congestion notifications. I'm able to use up my congestion allowance if I know I won't be transferring long.
> Greg L: you said that is how TCP works. This works by some box in the middle.
> Bob B: imagine a TCP with either a high weight or a low weight.
> Greg L: is it dependent on a box in the middle, or just the endpoints?
> Bob B: box in middle just enforces congestion volume allowance, in one possible deployment.
> Lars Eggert: trying to show that the problem is solvable, not that this is the solution.
> Lars Eggert (at start of slide 9): we believe you Bob that you can't cheat (leap of faith)
> Murari S: Thought I understood conex, now I'm confused. If you only look at the problem of short flows coming in and using lots of BW, problem can be solved if all the bulk users used scavenger service. Problem would automatically solve itself.
> Bob B: ISPs can't see congestion, how do you get a free pass to use scavenger and not have volume counted against you?
> Murari S: ISPs' problem is because customers complain that they're not getting service.
> Philip E: ISPs can't trust that end users are using LEDBAT; don't want them using DPI to figure this out. Trying to get the two to cooperate.
> Murari S: if everyone would start using scavenger, then the main purpose of Conex is to reroute traffic under congestion.
> Bob B: how does the ISP know that the user is using LEDBAT? Purpose is to help operators know which traffic is causing most congestion
> Mark H: reasonable for ISPs to limit traffic that is causing a problem. But ISPs cannot see which traffic is causing the problem.
> Jana Iyengar: incentive issue with LEDBAT; end hosts may choose not to use it.
> Bob B: what about big-big flows and little-big flows?
  Two classes may not be enough.
> Stanislav Shalunov: why send the info from the user? Why not keep it within the ISP, so that you can evolve the algorithm?
> Bob B: don't want to do congestion control within the network.
> Stanislav S: seems like two justifications: solve the LEDBAT problem, ??
> Lars Eggert: bunch of questions, high-level question is whether we all understand what the problem is (pauses for show of hands) -- seems like it. Questions that relate to whether exposing congestion solves the problems. Then questions around how re-ECN exposes congestion.
> Leslie: next discussion is around principles and constraints.
> Lars Eggert: should have questions around whether there is a problem that IETF should solve. Is everyone confused? Then there is a question of whether exposing congestion is a way to solve the problem, then there is the question of re-ECN as a suitable solution - let's tease these apart, and start at the beginning
> Linda Dunbar: want to confirm the problem statement, to expose network congestion to the end user?
> Bob B: the opposite - expose it to the network
> Linda Dunbar: if I'm an end user, how do I benefit by doing something different?
> Bob B: suppose I say that the amount of volume you send is a problem; that's not true. I can't judge whether you are sending during a trough.
> Linda D: one application is so trivial in the whole network.
> Leslie: hypothetical, my husband tells me the road is congested; I can use the toll road or wait in traffic
> Linda D: I don't have a choice
> Bob B: you can choose to slow down and send later
> Linda D: can we have one single problem statement?
> Aaron Falk: minor epiphany - way to identify responsive flows; give them a seal of approval, giving them higher priority. Been discussed in IETF for a long time. However, you are talking about aggregated statistics, idea of distinguishing flows is problematic.
> Bob B: can differentiate between networks that encourage users to behave vs. ones that don't.
> Tim Shephard: answer Lars - I think I understand this, but it has nothing to do with what I have heard in this room today. Seen Bob talk many times and I don't think he has brought people who don't understand this to understand it. Not effective use of time to get these people to understand this in the meeting. You want to expose to your ISP which traffic you are sending is exposing congestion somewhere in the network. ISP has another thing to bill you on. ISP can then stop billing you on how much total traffic you send, so then you can send lots of traffic. It might be possible, bit of genius here. Whether IETF should proceed is the question here. Will waste time if we try to educate everyone today.
> Leslie: line closed after Stanislav
> Luca Martini: talking about congestion-based pricing. What is the guarantee I have as a user that the network will provide enough BW?
> Bob B: ISP wants to try to offer a better service than other ISPs. If it is only limiting the traffic that causes congestion, you can get more out of my network. If I allow my network to get more congested I'd have a crap network and would lose customers. Overall bandwidth availability isn't the issue, it's about good use of the available capacity - doesn't mean net-ops can ignore need to provision 'adequately'.
> Luca Martini: what is the SLA that you are giving me?
> Bob B: my SLA might be that you can send 70 GB a day under typical conditions
> Rich W: not sure I am changing my pricing as a result of this
> Matt Mathis: very worried that there is a bigger problem on the horizon to which conex might be part of the solution. TCP Friendly is an oxymoron, only beginning to feel the symptoms of that.
> Murari S: brilliant extension of ECN, summarizing: causes better cooperation between users so we don't have to do weird things in the network, incentive to use scavenger service, better traffic engineering in the network
> Christian Vogt: presentations show that there is an improvement opportunity.
  ECN has not been deployed. Seems like this is a solution to congestion. Would this be more likely to be deployed than ECN? Why is it easier to deploy than ECN? Think it might be due to additional incentives.
> Bob B: why is this more deployable? Would give network operators much better incremental performance boost than just with ECN. It would give net-ops a stronger incentive to get much greater performance out of their network
> Stanislav S: made the point that ISPs cannot trust users to use the right applications. If this does affect pricing, this is a license to charge whatever you want, because the network is sending you the signal that you are congested.
> Leslie: this is about information available in the network
> Stanislav S: don't see incentives for setting those markings in congestion. Only works with perfect competition and transparency. If there is no pricing impact, then this is an extra throttle (?). Applicable to any mechanism ...
> Lars Eggert: this isn't the network telling the end system that there is congestion - ISPs can already drop packets if they don't like your flow. The question is whether the ISP can trust the end-systems to accurately indicate the congestion that they are going to cause.
> Stanislav S: extra throttle, it already has a throttle because it can drop packets.
> Philip E: several points. ISPs can't lie to their end-users in the presence of competition and negative customer feedback. In a market where you have a termination monopoly, they can charge what they like anyway, so this doesn't make a difference - that problem can only be solved through regulation.
> Jana Iyengar: re-ECN presentation conflated two things: one way to expose congestion, and what the operator could do with the information. That is one candidate way of exposing congestion.
> Leslie: moving discussion to whether there is a problem for the IETF to solve.
> Lars Eggert: knew that this wasn't an easy thing to get your head around.
  Who has read the documents? (several hands -- more than a dozen, around the room). Who thought that the documents were perfectly clear? (didn't check for hands). Expected more discussion on the mailing list. Disappointed. Not sure if we are ready at this point to discuss a charter. Problem statement is good to discuss now. If we are going for a second BoF, shouldn't spend time on tutorials.
> Matt Mathis: how much risk is there associated with starting the work before everything else is clear? Launch with a draft charter.
> Bob B: one way to ask the question is whether those who don't understand it want to stop those that do from solving it.
> Leslie: for those who do understand it, is the problem statement something the IETF should undertake (lots of hands), if you believe the IETF should not undertake (no hands).
> Greg L: process point, reason we bring to standards is so that everyone else can understand and implement. Need to make sure that the rest of the community can understand.
> Matt Mathis: to people who don't understand it; how many of them are afraid of it?
> Stanislav S: believe I understand, but my understanding does not correspond with the intended one.
> Leslie: repeat with hums
  - understand the material, believe the problem statement describes work the IETF should do (loud hum)
  - understand, don't believe this is a good problem statement (silent)
  Believe there is momentum
> Mat Ford: another 30 minutes left, good use of remaining time to explain for those who don't understand
> Jana I: we need a really well written applicability statement
> Aaron Falk: problem statement doesn't say anything about whether this is a standards activity; is that the intention?
> Philip E: deliberately did not answer that in the charter; let IESG decide that.
> Aaron Falk: good candidate for experimental/informational RFCs on possibly multiple mechanisms; premature for standards track.
> Vijay Gill: very interesting work; should proceed; need deployment considerations documentation
> Leslie: two things: deployment considerations, how this would be considered useful in practice
> Alissa Cooper: presenters have tried hard to keep the potential uses apart from what is being exposed, but it makes it more difficult to understand why this is useful. In future, it would be better to be able to talk about potential uses, but carefully state assumptions about ISP business models.
> Leslie: important output would be an applicability statement
> Francois Le Faucheur: responding to Stanislav - if ISPs don't make use of the feedback in unpleasant ways, then it is not useful, may be true. Good properties because it is incrementally deployable. Not obvious that this won't work. There is space for someone to say "in my restaurant, if you eat more, you pay more"
> Ingemar Johansson: don't want to exclude other transports than TCP. Feels that different dynamics in congestion behaviour could cause some issues for a protocol like this.
> Stanislav S: difference between restaurants and data transmission is ratio of cost of billing to cost of goods. Restaurants will itemise, but in data networks models are simpler.
> Bob B: reason I put up congestion volume billing (limit at flat fee), was because it is simple.
> Francois L: agree?
> Stanislav S: ISPs will expose congestion, but sometimes you will get to send more, sometimes less. I expect that ISPs would use flat-pricing but impose different congestion limits.
> Francois L: something better than flat rate will evolve. Like saying unlimited calling for cell phones.
> Stanislav S: that is how things are rapidly evolving
> Leslie: should a WG be formed with this charter + some word-smithing
  - Yes (some hums)
  - No (some hums)
  "Yes" hum was distinctly louder than "no". Seems that there is rough consensus to form a WG. Willing to declare victory.
> Philip E: demo will start in 5 minutes.

Meeting adjourned.
Agenda items not formally addressed:

Discussion of potential IETF work
Constraints [10 mins] [Philip Eardley]
Discussion of viability of congestion exposure [40 mins] [Leslie Daigle]
Draft charter discussion [20 mins]