Congestion Exposure (CONEX) BoF
===============================
Tuesday, November 10 from 1520-1810

NOTES by Steven Blake, Mat Ford

CHAIRS:
     Leslie Daigle   <leslie@thinkingcat.com>
     Philip Eardley  <philip.eardley@bt.com>

Meeting materials: https://datatracker.ietf.org/meeting/76/materials.html

AGENDA
------

Administrivia [ 5 mins]

- no objections to the agenda

Introduction by chairs [ 5 mins]

Background
   The problem [50 mins]

== End-user perspective [Murari Sridharan]

  - OS bottlenecks removed over last 6-7 years
  - congestion control has become so sophisticated that end systems can now fill up any residual capacity
  - people complain about lack of resources now that pipes regularly get filled

To the end host, the network is largely a black box: its characteristics
are inferred. The network imposes complex usage requirements, such as
volume caps. Example: an end user was hit with a heavy pay-per-use bill
for downloading Windows Update, despite the fact that Windows Update
runs over a scavenger service.

network view - end hosts can't be trusted, application performance needs to be inferred, end-hosts typically establish connectivity to well-known servers, but reality is different.

innovation is all in layer-violation - this isn't sustainable!

congestion control purity is no longer reality - TCP tunnelled inside TCP isn't a route to deterministic results!


- no comments at the mic


== ISP Context/motivation [Rich Woundy]

Enable customers to control their own network experience - take ISP out 
of the loop of being 'application police'


 > Linda Dunbar: you discuss making the heavy user lower priority; what 
does this have to do with network congestion?

 > Rich W: the Comcast mechanism is not a conex proposal.  Our mechanism
does not see end-to-end.  It only solves the congestion problem at the
edge, not for any other part of our network or for third parties.  The
Comcast solution monitors capacity and resource use in the network.
Measuring end-to-end congestion requires another mechanism.

 > Linda Dunbar: congestion is very transient.  End user has no control
over the network's behavior.

 > Leslie: wait until after the other presentations

 > Bob Briscoe: mention LEDBAT

 > Leslie: non-IETF ISOC meeting today on bandwidth; meeting materials
   (including audio) at ISOC website


== Technical problem [Mark Handley]

Whenever Mark presents about congestion people say 'we're 
overprovisioned, don't care' - he doesn't subscribe to that - will talk 
about why we need to care about congestion

TCP isn't broken, but we can do better

used to be receive window limited in end systems - no longer the case in 
popular modern TCP stacks

The goal should be congested networks, at least at the bottleneck.  That
doesn't mean persistently congested.  But a transport that can't congest
a bottleneck is broken.

Packet loss isn't necessarily a problem - for apps where it is a problem 
we have ECN


 > Greg Lebovitz: think about next-gen apps, and where they might fit on 
the latency/size graph (e.g., Internet-of-things).  Existing apps were 
built a certain way because the network works the way it does.

 > Mark H: some things we haven't put on the net because they don't work 
with today's Internet.

 > Geoff Huston: making good use of the network in either case (re: the
"latency, latency, latency" slide).

 > Mark H: better from a user's perspective

Discussion after Mark's summary slide:

 > Aaron Falk: ISPs aren't charging by the byte

 > Jana Iyengar: works at a residential educational institution -
students tend to do a lot of 'stuff' that needs to be separated from
academic work - DPI is used for that, maybe in other enterprises too

 > Ben Niven-Jenkins: if the prime driver for ISPs to use DPI is to 
control costs, don't see how it is relevant in the context of 
controlling congestion

 > Mark H:  because they need to control costs, because they have 
congestion - need to maintain good service for higher-paying customers; 
trying to provide acceptable service to non P2P apps.

 > Ben N-J: DPI being used to prevent going over usage caps

 > Mat Ford: That doesn't explain why users on relatively low service 
tiers have their throughput throttled regardless of whether or not they 
have exceeded any volume threshold

 > Ben N-J: economic issue; how does technology solve it?  Don't know
how the two are linked.

 > Vijay Gill: agree with Mat; if an ISP has chronic congestion, why
not keep raising the price until users fall off?  Comcast has two
tiers of service (residential, business).  Residential has a 250 GB
cap; business service has no cap.  Most folks I know go for the
business service ($100/month).

 > Bob Briscoe: try to answer Ben's question - he was talking about 
retail ISPs as customers.  Wholesaler sells bit rates to the retail ISPs.

 > Ben N-J: it is not us causing the bottleneck

 > Leslie: not here to talk about how to make things work with DPI; 
let's move on.

 > Stanislav Shalunov: what can you convey with conex that you can't 
convey with simple drops?  Latency ...

 > Mark H: only talking about the problem space

 > Stanislav S: current loss rates (< 1%) are working fine.

 > Mark H: loss rates are very variable

 > Aaron Falk: re: IETF Goals - we shouldn't be talking about economics.
Disagree that the only cost is congestion cost (overstatement).
Infrastructure has cost.  Question is incremental cost.  Economic
expertise would be useful in this discussion.

 > Mark H: yes, underutilised network gives you ability to communicate - 
additional traffic doesn't cost anything more, until you experience 
congestion

   Towards a solution [20 mins]

== Overview of re-ECN   [Bob Briscoe]

 > Greg L: (re: slide 4: measuring contribution to congestion of
scavenger vs bursty user) earlier we saw that sometimes there is a
long-duration connection and something very short has to wait.  In that
case, one of them will show that the network is very bad, and the other
will show that the network is working very well.

 > Bob B: small transfer can go fast, because it won't build up a big 
congestion volume; incentive for large transfer to back off because it 
could build up large congestion volume.
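
[Editorial illustration, not from the presentation.] One way to read
"congestion volume" is bytes sent weighted by the congestion level
(fraction of packets marked or dropped) at the time of sending.  The
sketch below assumes that reading; the function name, the interval
representation and all the numbers are hypothetical.

```python
def congestion_volume(intervals):
    """Sum bytes sent weighted by the congestion level seen while sending.

    intervals: iterable of (bytes_sent, congestion_fraction) pairs, where
    congestion_fraction is the share of packets marked or dropped (0..1).
    """
    return sum(bytes_sent * fraction for bytes_sent, fraction in intervals)


# Hypothetical numbers: a short transfer sent at full speed through 1% congestion.
short_burst = [(2_000_000, 0.01)]
# A long transfer that ignores congestion and keeps sending through it.
greedy_bulk = [(500_000_000, 0.01), (500_000_000, 0.01)]
# The same long transfer backing off while congested, catching up in a trough.
polite_bulk = [(100_000_000, 0.01), (900_000_000, 0.0)]

short_cv = congestion_volume(short_burst)
greedy_cv = congestion_volume(greedy_bulk)
polite_cv = congestion_volume(polite_bulk)
```

Under these assumed numbers the short transfer accumulates far less
congestion volume than either bulk transfer, and the bulk transfer that
backs off accumulates less than the one that does not.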

 > Greg L: how do they know about each other?

 > Bob B: because of the congestion notifications.  I'm able to use up
my congestion allowance if I know I won't be transferring long.

 > Greg L: you said that is how TCP works.  This works by some box in
the middle.

 > Bob B: imagine a TCP with either a high weight or a low weight.

 > Greg L: is it dependent on a box in the middle, or just the endpoints?

 > Bob B: box in middle just enforces congestion volume allowance, in
one possible deployment.
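
[Editorial illustration, not from the presentation.] A box that
"enforces a congestion volume allowance" could be sketched as a token
bucket that is charged only for congestion-marked bytes.  This is a
generic token bucket, not the re-ECN mechanism itself; the class name,
parameters and units are assumptions.

```python
class CongestionAllowance:
    """Token bucket counted in congestion-marked bytes, not total bytes."""

    def __init__(self, fill_rate, depth):
        self.fill_rate = fill_rate  # allowance earned per second (bytes)
        self.depth = depth          # maximum allowance that can be saved up
        self.tokens = depth
        self.last = 0.0

    def allow(self, now, packet_bytes, congestion_marked):
        """Return True if the packet is within the user's allowance."""
        # Refill for the time elapsed since the previous packet.
        self.tokens = min(self.depth,
                          self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if not congestion_marked:
            return True  # unmarked traffic never costs allowance
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True
        return False  # allowance exhausted: policer may drop or delay


policer = CongestionAllowance(fill_rate=1000, depth=3000)
decisions = [
    policer.allow(0.0, 1500, False),  # unmarked: always allowed
    policer.allow(0.0, 1500, True),   # marked: spends allowance
    policer.allow(0.0, 1500, True),   # marked: spends the rest
    policer.allow(0.0, 1500, True),   # marked: allowance exhausted
    policer.allow(2.0, 1500, True),   # two seconds later: refilled enough
]
```

The design choice in this sketch is that traffic sent when there is no
congestion is free, so only sending into congestion uses up allowance.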

 > Lars Eggert: trying to show that the problem is solvable, not that 
this is the solution.

 > Lars Eggert (at start of slide 9): we believe you Bob that you can't 
cheat (leap of faith)

 > Murari S: Thought I understood conex, now I'm confused.  If you only
look at the problem of short flows coming in and using lots of BW, the
problem can be solved if all the bulk users used scavenger service.
Problem would automatically solve itself.

 > Bob B: ISPs can't see congestion, so how do you get a free pass to
use scavenger and not have volume counted against you?

 > Murari S: ISPs' problem is because customers complain that they're
not getting service.

 > Philip E: ISPs can't trust that end users are using LEDBAT; don't 
want them using DPI to figure this out.  Trying to get the two to cooperate.

 > Murari S: if everyone would start using scavenger, then the main 
purpose of Conex is to reroute traffic under congestion.

 > Bob B: how does the ISP know that the user is using LEDBAT?  Purpose
is to help operators know which traffic is causing most congestion

 > Mark H: reasonable for ISPs to limit traffic that is causing a 
problem.  But ISPs cannot see which traffic is causing the problem.

 > Jana Iyengar: incentive issue with LEDBAT; end hosts may choose not
to use it.

 > Bob B: what about big-big flows and little-big flows?  Two classes 
may not be enough.

 > Stanislav Shalunov: why send the info from the user?  Why not keep it 
within the ISP, so that you can evolve the algorithm.

 > Bob B: don't want to do congestion control within the network.

 > Stanislav S: seems like two justifications: solve the LEDBAT problem, ??

 > Lars Eggert: bunch of questions, high-level question is whether we
all understand what the problem is (pauses for show of hands) -- seems
like it.  Questions that relate to whether exposing congestion solves
the problems.  Then questions around how re-ECN exposes congestion.


 > Leslie: next discussion is around principles and constraints.

 > Lars Eggert: should have questions along the lines of whether there
is a problem that the IETF should solve.  Is everyone confused?  Then
there is a question of whether exposing congestion is a way to solve
the problem, then there is the question of re-ECN as a suitable
solution - let's tease these apart, and start at the beginning

 > Linda Dunbar: want to confirm the problem statement: is it to expose
network congestion to the end user?

 > Bob B: the opposite - expose it to the network

 > Linda Dunbar: if I'm an end user, how do I benefit from doing
something different?

 > Bob B: suppose I say that the amount of volume you send is a problem;
that's not true.  I can't judge whether you are sending during a trough.

 > Linda D: one application is so trivial in the whole network.

 > Leslie: hypothetical, my husband tells me the road is congested; I
can use the toll road or wait in traffic

 > Linda D: I don't have a choice

 > Bob B: you can choose to slow down and send later

 > Linda D: can we have one single problem statement?

 > Aaron Falk: minor epiphany - way to identify responsive flows; give 
them a seal of approval, giving them higher priority.  Been discussed in 
IETF for a long time.  However, you are talking about aggregated 
statistics, idea of distinguishing flows is problematic.

 > Bob B: can differentiate between networks that encourage users to 
behave vs. ones that don't.

 > Tim Shephard: answer Lars - I think I understand this, but it has
nothing to do with what I have heard in this room today.  Seen Bob talk
many times and I don't think he has brought people who don't understand
this to the point of understanding it.  Not an effective use of time to
get these people to understand this in the meeting.  You want to expose
to your ISP which of the traffic you are sending is causing congestion
somewhere in the network.  ISP has another thing to bill you on.  ISP
can then stop billing you on how much total traffic you send, so then
you can send lots of traffic.  It might be possible, bit of genius
here.  Whether IETF should proceed is the question here.  We will waste
time if we try to educate everyone today.

 > Leslie: line closed after Stanislav

 > Luca Martini: talking about congestion-based pricing.  What is the 
guarantee I have as a user that the network will provide enough BW?

 > Bob B: ISP wants to try to offer a better service than other ISPs. If 
it is only limiting the traffic that causes congestion, you can get more 
out of my network.  If I allow my network to get more congested I'd have 
a crap network and would lose customers. Overall bandwidth availability 
isn't the issue, it's about good use of the available capacity - doesn't 
mean net-ops can ignore need to provision 'adequately'.

 > Luca Martini: what is the SLA that you are giving me?

 > Bob B: my SLA might be that you can send 70 GB a day under typical 
conditions

 > Rich W: not sure I am changing my pricing as a result of this

 > Matt Mathis: very worried that there is a bigger problem on the
horizon, and that conex might be part of the solution.  TCP Friendly is
an oxymoron; we are only beginning to feel the symptoms of that.

 > Murari S: brilliant extension of ECN, summarizing: causes better
cooperation between users so we don't have to do weird things in the
network, incentive to use scavenger service, better traffic engineering
in the network

 > Christian Vogt: presentations show that there is an improvement
opportunity.  ECN has not been deployed.  Seems like this is a solution
to congestion.  Would this be more likely to be deployed than ECN?  Why
is it easier to deploy than ECN?  Think it might be due to additional
incentives.

 > Bob B: why is this more deployable?  Would give network operators a
much better incremental performance boost than just with ECN.  It
would give net-ops a stronger incentive to get much greater performance
out of their network

 > Stanislav S: made the point that ISPs cannot trust users to use the
right applications.  If this does affect pricing, this is a license to
charge whatever you want, because the network is sending you the
signal that you are congested.

 > Leslie: this is about information available in the network

 > Stanislav S: don't see incentives to setting those markings in 
congestion.  Only works with perfect competition and transparency.  If 
there is no pricing impact, then this is an extra throttle (?). 
Applicable to any mechanism ...

 > Lars Eggert: this isn't the network telling the end system that there 
is congestion - ISPs can already drop packets if they don't like your 
flow. The question is can the ISP correctly trust the end-systems to 
accurately indicate the congestion that they are going to cause.


 > Stanislav S: extra throttle, it already has a throttle because it can 
drop packets.

 > Philip E: several points. ISPs can't lie to their end-users in the 
presence of competition and negative customer feedback. In a market 
where you have a termination monopoly, they can charge what they like 
anyway, so this doesn't make a difference - that problem can only be 
solved through regulation.

 > Jana Iyengar: re-ECN presentation conflated two things: one way to
expose congestion, and what the operator could do with the information.
That is one candidate way of exposing congestion.

 > Leslie: moving discussion to whether there is a problem for the IETF 
to solve.

 > Lars Eggert: knew that this wasn't an easy thing to get your head
around.  Who has read the documents? (several hands -- more than a
dozen, around the room).  Who thought that the documents were
perfectly clear? (didn't check for hands).  Expected more discussion on
the mailing list.  Disappointed.  Not sure if we are ready at this point
to discuss a charter.  Problem statement is good to discuss now.  If
we are going for a second BoF, shouldn't spend time on tutorials.

 > Matt Mathis: how much risk is there associated with starting the work
before everything else is clear?  Launch with a draft charter.

 > Bob B: one way to ask the question is whether those who don't 
understand it want to stop those that do from solving it.

 > Leslie: for those who do understand it, is the problem statement 
something the IETF should undertake (lots of hands), if you believe the 
IETF should not undertake (no hands).

 > Greg L: process point, reason we bring to standards is so that
everyone else can understand and implement.  Need to make sure that the
rest of the community can understand.

 > Matt Mathis: to people who don't understand it: how many of them are
afraid of it?

 > Stanislav S: believe I understand, but my understanding does not 
correspond with the intended one.

 > Leslie: repeat with hums
   - understand the material, believe the problem statement describes
work the IETF should do (loud hum)
   - understand, don't believe this is a good problem statement (silent)

Believe there is momentum

 > Mat Ford: another 30 minutes left, good use of remaining time to
explain for those who don't understand

 > Jana I: we need a really well written applicability statement

 > Aaron Falk: problem statement doesn't say anything about whether this
is a standards activity; is that the intention?

 > Philip E: deliberately did not answer that in the charter; let IESG 
decide that.

 > Aaron Falk: good candidate for experimental/informational RFCs on 
possibly multiple mechanisms; premature for standards track.

 > Vijay Gill: very interesting work; should proceed; need deployment 
considerations documentation

 > Leslie: two things: deployment considerations, how this would be 
considered useful in practice

 > Alissa Cooper: presenters have tried hard to keep the potential uses
apart from what is being exposed, but it makes it more difficult to
understand why this is useful. In future, it would be better to be able 
to talk about potential uses, but carefully state assumptions about ISP 
business models.

 > Leslie: important output would be an applicability statement

 > Francois Le Faucheur: responding to Stanislav - if ISPs don't make
use of the feedback in unpleasant ways, then it is not useful, may be
true.  Good properties because it is incrementally deployable.  Not
obvious that this won't work.  There is space for someone to say "in my
restaurant, if you eat more, you pay more"

 > Ingemar Johansson: don't want to exclude other transports than TCP. 
Feels that different dynamics in congestion behaviour could cause some 
issues for a protocol like this.

 > Stanislav S: difference between restaurants and data transmission is 
ratio of cost of billing to cost of goods. Restaurants will itemise, but 
in data networks models are simpler.

 > Bob B: reason I put up congestion volume billing (limit at flat fee), 
was because it is simple.

 > Francois L: agree?

 > Stanislav S: ISPs will expose congestion, but sometimes you will get 
to send more, sometimes less. I expect that ISPs would use flat-pricing 
but impose different congestion limits.

 > Francois L: something better than flat rate will evolve.  Like saying
unlimited calling for cell phones.

 > Stanislav S: that is how things are rapidly evolving

 > Leslie: should a WG be formed with this charter + some word-smithing?
   - Yes (some hums)
   - No  (some hums)
   "Yes" hum was distinctly louder than "no".  Seems that there is rough 
consensus to form a WG.  Willing to declare victory.

 > Philip E: demo will start in 5 minutes.


Meeting adjourned.


Agenda items not formally addressed:

Discussion of potential IETF work
   Constraints [10 mins] [Philip Eardley]
   Discussion of viability of congestion exposure [40 mins] [Leslie Daigle]
   Draft charter discussion [20 mins]