IETF-74: LEDBAT minutes

Agenda:
- Low volume on list was due to drafts not being out. But now they're out and we have some comments/reviews as well.

draft-penno-ledbat-app-practices-recommendation-00
- Presentation overview: incorporated comments from list.
  + Advantages/disadvantages/recommendations presented
  + Reinaldo did a simple test connecting to a bunch of servers and he still sees 'connection close => method completion' semantics.
  + Looked at extant research, but they seem to be more observational - 'this is what is there today'. They don't say how many we need or how many are enough etc.
- Comments:
  + Gorry: We talked about some connections being control and others data. Is that within scope?
    + This draft is more about bulk data transfers
  + Gorry: Are there more connections due to control plane transactions?
    + Good qn. In observing iTunes, HTTP etc., there was this dynamic where a bunch of control connections come alive now and then, but nevertheless exist and consume resources.
  + Lars:
    - Related to previous point, Jabber opens a bunch of control and data connections and all of them consume resources. This is true of many situations.
    - How do the recommendations follow from what was said earlier in the draft?
    - You should mention LEDBAT as one of the ways of reducing impact.
    - Concerned whether diffserv really fits
      + If it is just bulk data and not interactivity, then you could mark packets. You could mark with lower priority and still achieve your goals.
    - It would be good to have a rule of thumb to say how many connections are ok or good
    - Going with the charter, we can do less-than-best-effort by using congestion control and this is more relevant in LEDBAT.
  + Question from jabber: advantages seem to be end-host based and disadvantages are mostly network-based. So is it ok if multiple connections traverse different bottlenecks?
    + From b/w perspective, yes.
      But from state in a NAT middlebox, multiple paths do not make much difference
  + Gorry: there is at least one case where the network would benefit - if end-hosts use multi-homed access and use disjoint paths.
  + Murari: we should talk about disadv due to reassembly on top of TCP at the app layer.
  + Murari:
    - some things are easier said than done. Give more guidance on multi-core scenarios.
    - Also, just because window scaling is available, not enough information is there (no good way to say how much socket buffer is available).
    - Instead of saying how many connections we need, should be talking about using some co-ordinated congestion control
    - Should look at bottleneck detection techniques and if there are disjoint paths, there should be no problem with multiple conxns.
  + Jabber qns:
    - not just bulk data - could create new problems by spreading among multiple paths.
  + lars:
    - Comment on diffserv
    - Reassembly at app layer: some apps use multiple conxns to send different objects, but some may be splitting one big object among multiple conxns. So these need to be differentiated.

---------------

draft-shalunov
- Presentation overview
  + Proposed congestion algorithm has been deployed and tested.
  + On loss, halve cwnd. Saverio's alternative is to use base * measured b/w if that is lower than cwnd/2. This may not make much of a difference if the target delay is set appropriately.
- Questions:
  + Illitch: slow-start - if TCP is in slow start, LEDBAT will cause a spike in delay and ramp down faster
  + Qn: Liked the analysis. Why not use TCP Vegas?
    - There have been many suggestions. This is a flavor of delay based cong ctrl. If it helps this can be considered a Vegas flavor
    - 1-way vs r-trip delay change from Vegas is one among the differences.
    - Murari: survey draft also has related work.
    - usually delay is used as a source of extra information, esp important at high speeds.
    - other non-delay based cong ctrl algos have not set the goal as minimizing delay.
  + Qn: so you took a fresh start?
    - explaining based on first principles so we don't rely on assumptions
  + Mark Handley: slow start, seems unlikely you can get out of the way. But it does not seem to be significant.
    - yes, there is a delay spike. But if there is ssthresh it is not bad.
  + Comment: oddly enough it might help smooth slow start
  + Qn: how often does it take longer than several minutes for base delay to be estimated? Esp if other protocols are keeping the queue full.
    - the more flows in the bottleneck, the higher the chance of making a false inference
    - typical home conxn, this is unlikely
  + Qn: there are streaming protocols that try to keep the pipe full by transmitting at slightly higher than the available b/w.
    - but usually if there is a high level of stat muxing with 10s of thousands of bulk flows, behavior is aggregated and smoother.
  + Illitch: in practice you always get something close to min with even a few dozen pkts
  + Murari: experiments also support this.
  + Are you going to mimic the timeout behavior of TCP, since timeout dominates AIMD?
    - setting cwnd to 1 in catastrophic case has safety benefits and should be ok to add.
  + Lars: in worst case you still want to be less aggressive than TCP
    - you can tweak the parameters to do that.
  + Bob Briscoe: if LEDBAT competes against TCP with AQM and if its delay goal is lower what happens?
    - should not be a problem.
  + Bob Briscoe: if there are 2 LEDBATs, one with a larger delay goal, will one lock out the other?
    - there are interesting considerations about how they share a bottleneck. Fairness section needs more inputs.
    - if there are 10 LEDBATs in a bottleneck they are trying to get at one base target.
    - but question is how they divide up the bandwidth among each other. This mechanism is already in the network and that is noise. Randomness causes reshuffle.
    - but if there is substantial difference, there will be starvation
  + Bob Briscoe: so we need a goal for inter-op/fairness to deal with starvation.
  + Qn: on very slow link, q-ing delay of single packet can be 10s of ms. Can that impact performance?
    - if we sent external data probes, then we would not be able to send anything.
    - but we are doing the delay measurement with the same data packet. So this is not a problem.
  + Qn: Impact of one or a few large delay estimates
    - till the sample ages out we will be starved
    - but this is ok for background transport needs
  + Qn: Why will apps use this?
    - being a good network citizen
    - and not harming itself by being too aggressive
  + Bob Briscoe: if AQM threshold is way below LEDBAT target, it does not do anything
    - yes it will be same as TCP
    - but it will not impact LEDBAT any more than it does TCP
    - Murari: if underlying protocol is responding to ECN, then LEDBAT will not be less aggressive than TCP.
  + Illitch - can you back off more than TCP?
    - one reason not to back off excessively is to avoid over-reacting to non-congestive loss.
  + Illitch - how do you determine the target?
    - target needs to be as low as we need and as high as we need for proper measurement (e.g. scheduling delays on end host may have impact).
    - 25 ms is the draft recommendation.
    - Murari: you will need to adapt it if it is not working well.
  + LANs:
    - sometimes high buffering can cause a lot of delay and it does not do us any good.
    - LEDBAT can help eliminate/reduce this.
  + Qn: do we necessarily need to saturate the bottleneck?
    - without this it is hard to impossible to hit the other goals. The min problem becomes unconstrained.
  + Richard Bennett: do not agree that users won't use it unless perf is better. There can be other monetary benefits (e.g. Bob Briscoe's suggestion that marking can have economic incentives to make it less than best effort).
  + Richard Bennett: design goal needs to be to avoid saturation.
- Skipping the survey draft
- Need to decide whether to adopt these drafts as WG documents:
  - very few read the first draft
  - Lars: charter is clear that there is a work item.
    Encourage work on the draft to make it better so that people take notice.
  - Schulzrinne: unless someone is willing to review the draft and take interest why do it?
  - Briscoe: willing to review
  - second draft: a few hands go up for adoption
  - Schulzrinne: unless this is informational, it seems early to adopt this doc at this time. Give until say Jul 1st and give a chance for competing drafts.
  - Qn: where is this change going to be done, TCP stacks?
  - Stas: no mandate to clarify implementation
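
Editor's illustration (not part of the meeting record): the draft-shalunov discussion above refers to a base delay taken as a minimum of one-way delay samples, a 25 ms target, halving cwnd on loss, and dropping cwnd to 1 in the catastrophic (timeout) case. The sketch below shows one way those pieces could fit together. It is a minimal sketch under stated assumptions, not the draft's algorithm: the linear controller, the GAIN value, the class and method names, and the omission of base-delay sample aging are all hypothetical simplifications.

```python
from dataclasses import dataclass

TARGET_MS = 25.0  # target queuing delay quoted in the discussion
GAIN = 1.0        # assumed controller gain (hypothetical value)
MIN_CWND = 1.0    # floor for the window, in packets


@dataclass
class DelayBasedSender:
    """Toy delay-based sender; names and structure are illustrative only."""
    cwnd: float = 2.0                     # congestion window, in packets
    base_delay_ms: float = float("inf")   # lowest one-way delay seen so far

    def on_ack(self, one_way_delay_ms: float) -> None:
        # Base delay is the running minimum of one-way delay samples.
        # Assumption: no aging of old samples, unlike a real implementation.
        self.base_delay_ms = min(self.base_delay_ms, one_way_delay_ms)
        queuing_ms = one_way_delay_ms - self.base_delay_ms
        # Positive when below target (grow), negative when above (shrink).
        off_target = (TARGET_MS - queuing_ms) / TARGET_MS
        self.cwnd = max(MIN_CWND, self.cwnd + GAIN * off_target / self.cwnd)

    def on_loss(self) -> None:
        # "On loss, halve cwnd."
        self.cwnd = max(MIN_CWND, self.cwnd / 2.0)

    def on_timeout(self) -> None:
        # Catastrophic case: fall back to a window of 1.
        self.cwnd = MIN_CWND


if __name__ == "__main__":
    sender = DelayBasedSender()
    for delay_ms in (50.0, 75.0, 100.0):
        sender.on_ack(delay_ms)
        print(f"delay={delay_ms} ms cwnd={sender.cwnd:.2f}")
```

With at most one packet's worth of growth per window when the queue is empty, this toy controller ramps no faster than TCP's additive increase, which matches the "less aggressive than TCP in the worst case" goal raised in the discussion; whether the draft's parameters achieve that is for the draft itself to state.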