ICCRG meeting minutes, IETF 92

==Carles Gomez, "CoAP Simple Congestion Control/Advanced (CoCoA)"==

Congestion control for Constrained Application Environments (CoAP [RFC7252]), more advanced than the default CC, but still targeted at constrained devices.

CoAP background:
* Runs over UDP.
* Each message is either reliable (CONfirmable = ACK'd) or unreliable (NON-confirmable).

Intro to CoCoA:
* For CON only: 2 RTT estimators, weak & strong. The weak estimator is used when retransmissions have been required.
* The overall RTO is evolved from the EWMA of whichever weak/strong RTO was most recently updated.
* The EWMA of the strong estimator weights recent values 4x more than that of the weak one.
* The overall RTO is dithered.

Evaluation: See slides. In summary, "CoCoA performs similarly to or better than default CoAP".

Questions:
Q1. Michael Welzl: Variable back-off factor (VBF): What is backing off? Does it use a sliding window of more than 1 packet?
A1. The config includes a factor (n*) = max window, which is usually 1, so there is no window. There could be if n* > 1, but that has not been used so far.
Michael: Given that not very much harm comes from a false RTO, did you try to make it more aggressive?
Carles: If we are more aggressive, other performance characteristics are degraded (e.g. CoCoA-S).
Q2. Richard Scheffenegger: Thanked the presenter for addressing all his concerns as raised in CORE.

==Koen de Schepper, "Data Center to the Home"==

Goals:
* Unmanaged low latency, high throughput service.
* Migration path for DCTCP to run fairly alongside Reno & Cubic.

Work in progress:
* Only steady-state measurements so far. No dynamic measurements (short flows work fine subjectively).

Test setup:
* Real broadband testbed, with breakout to the new AQM from the Broadband Network Gateway.

#8-10: PDFs comparing RED & DCTCP (separately), giving background.
#11-15: Can't just mix DCTCP/ECN with Reno/drop: DCTCP pushes up to the RED step and Reno starves itself.
#16: Use ECN as a low latency service identifier (potentially for all traffic).
#17-20: Derive the relation between marks and drops. It's a square relation, which is easy to implement by comparing with two random variables, as opposed to one (a sketch follows the conclusions below).
#20-21: Fairness sorted, but the latency is that of RED, not DCTCP.
#22-24: Dual queue so DCTCP has low latency; then a scheduler is needed.
#25: Use a strict priority scheduler. It's important to measure the non-ECN queue in time (it doesn't have to be RED, though). Then the flows automatically schedule themselves through the strict priority scheduler.
#26: Results: Fairness between any combination of flows, with a slight drop in Reno throughput for large numbers of flows. And hardly any queue for DCTCP.
#30: Interactive video app: panning and zooming an HD window within a larger video, with DCTCP through the coupled AQM.
#31: Future work & Conclusions:
* Dynamics still to be measured.
* Very important for DCTCP to respond to drop as Reno does. Linux 3.18 lacks this feature.
* RTT fairness decisions have to be made.
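To make the marks:drops coupling of slides #17-20 concrete, here is a minimal sketch. This is not the authors' code: the function name, the coupling factor k, and where the internal probability p comes from are all assumptions. The one point taken from the talk is the implementation trick: dropping only when two independent random comparisons both succeed gives a drop probability of p squared, while marking uses a single comparison.

```python
import random

def coupled_aqm_decision(p, k=1.0):
    """Illustrative per-packet decision for a coupled AQM (assumed form).

    p : internal congestion signal in [0, 1], maintained elsewhere
        (how it is derived is not specified in the minutes).
    k : assumed coupling factor between the two services.

    ECN-capable (DCTCP) packets are marked with probability min(k*p, 1);
    classic (Reno/Cubic) packets are dropped only if two independent
    uniform draws both fall below p, i.e. with probability p**2, which
    realises the square relation between marks and drops.
    """
    mark_ecn = random.random() < min(k * p, 1.0)
    drop_classic = random.random() < p and random.random() < p
    return mark_ecn, drop_classic

if __name__ == "__main__":
    # Rough sanity check of the square relation with p = 0.2 and k = 1.
    n = 100_000
    marks = drops = 0
    for _ in range(n):
        m, d = coupled_aqm_decision(0.2)
        marks += m
        drops += d
    print(f"mark rate ~{marks / n:.3f}, drop rate ~{drops / n:.3f} (expect ~0.2 and ~0.04)")
```

Comparing against two random variables in this way avoids ever computing p squared explicitly, which is the implementation convenience noted in the talk.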
Questions:
Q1. Jana Iyengar: Where did the factor of 8 come from, to compensate for RTTs?
A1. It compensates for the O(BDP) queue on the Reno side, then is rounded to a power of 2.
Q2. Jana Iyengar: Have you done mixed-RTT experiments?
A2. No, but calculations have been done (see spare slide #33).
Q3. David Black: Have you thought about how to adapt the factor of 8 for the measured RTT?
A3. It is hard to measure RTT in the network, but we would like to do this if possible.
Q4. David Black: Thanks for the backwards compatibility work. But how are you classifying flows as Reno or DCTCP?
A4. On ECN. So if a Reno host turns on ECN, it will starve itself within the DCTCP queue.
Q5. Stuart Cheshire: I've had ECN turned on for 15 years; we don't want an incentive against ECN with Reno.
A5. Bob Briscoe: The main assumption is that after 14 years, operators still aren't going to turn on ECN at bottlenecks unless there is a major performance (i.e. latency) benefit to their customers, not just the lack of loss that ECN currently offers.
Q6. Stuart Cheshire: Surely you don't want a ramp for the DCTCP side.
A6. We're using that, because that's what DCTCP was tested against. For the Internet, we probably want a ramp, or at least something less cliff-like.
Q7. Caitlin Bestler: Have you done any studies with a client in the data centre?
A7. Koen: Only changing it for the in-the-home case; in the DC we leave DCTCP as it is.
Q8. Andrew McGregor: CDFs, not PDFs please.
A8. OK.
Q9. Andrew McGregor: You need to be able to deal with the fact that the RTT can change. Suggest you look at what happens when caches get drained.

==Brian Trammell, "Enabling Internet-Wide Deployment of Explicit Congestion Notification"==

#2: ECN failed to deploy initially, but it is relevant again.
#3: Hockey stick for ECN server support, but client and network support is still effectively zero.
#4-7: Slides self-explanatory, except clarification questions:
#5: The title "Endpoint-dependent" means server-dependent.
#8: Classification by TTL remains to be done.
#10: We have seen 2 CEs very probably set in the network.
Q1. Tim Shepard: Might see more if uploads were significant.
A1a. Correct: it is much more difficult to send heavy traffic as part of a test, so this was avoided to reduce the risk of the test AS being black-listed.
A1b. Stuart Cheshire, looking at stats on his own Yosemite laptop: 12558 connections negotiated ECN since last reboot; 1983 CWRs in response to CE. The chair asked Stuart to copy his stats to the list.
Q2. Richard Scheffenegger (co-author): Are you going to expand on the bleaching behaviours?
A2. No, but will say that dropping CE or ECE is not problematic, whereas bleaching CE is, because the former replaces one congestion signal with another while the latter removes a congestion signal.
#7: See slide for stats. Brian gave an aside: lots of weird stuff seen on the IPv6 Internet.
Q3. Trevor Chandler: Please expand on the weird IPv6 stuff.
A3. 9.36% of IPv6 sites cannot be connected to, but they're still in the Alexa top 1M because Happy Eyeballs ensures they fall back to v4 connectivity. I.e. if you work around a problem, the problem will stay.

==You Jianjie, "Increasing TCP's CWND based on Throughput"==

#2-6: Propose that an app that has a target throughput sets a sockopt to alter TCP's window increase factor (alpha), which TCP adapts dependent on RTT (a sketch of the arithmetic follows these notes).
#7: Only 1 RTT is needed to reach the proposed throughput.
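As a back-of-the-envelope illustration of the proposal (not the authors' code; the function and the example numbers are hypothetical), the window the stack would have to reach is simply target rate times RTT expressed in segments, and the idea in the draft is to grow to it within a single RTT via the altered increase factor:

```python
def cwnd_for_target_rate(target_bps, rtt_s, mss_bytes=1460):
    """Segments of congestion window needed to sustain target_bps over rtt_s.

    Purely illustrative arithmetic for the proposal discussed above:
    cwnd [segments] = target_rate [bit/s] * RTT [s] / (8 * MSS [bytes]).
    """
    return target_bps * rtt_s / (8 * mss_bytes)

if __name__ == "__main__":
    # Hypothetical example: a 25 Mbit/s 4K video stream over a 50 ms path
    # needs roughly 107 segments of window; the proposal would have the
    # stack grow to this within one RTT via the enlarged increase factor.
    print(cwnd_for_target_rate(25e6, 0.050))
```

Whether the path can actually absorb such a jump before any feedback arrives is exactly what the questions below dispute.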
Questions:
Q1. Marie-Jose Montpetit: Why is this working? Why will this not lead to congestion collapse in the Internet?
A1. Jianjie: Issues transmitting 4K video.
Marie-Jose: This is not a valid solution for transmitting 4K video over the Internet. This is going to break the Internet.
Q2. Bob Briscoe: One of the main purposes of TCP is to find what rate the path can support, because from one moment to the next other flows can arrive and depart. The goal of reaching the rate for 4K video in one RTT means that, if there are other flows using the capacity, and the available capacity is less than the video takes, it will break the other apps before it is possible to have received any feedback that it has done so. Therefore, this approach is highly unsafe.
Q3. Stuart Cheshire: Thank you for coming to IETF and giving the presentation. Use on the Internet is dangerous, but OK in your own data centre. You start by showing a problem with packet loss in slow start, then at the end assume no packet loss.
Q4. Yuchung Cheng: Why is slow start dropping out before reaching the target rate?
A4. (Stuart Cheshire): If there's no packet loss, slow start will reach the desired capacity quickly, although not in one RTT. The case here is where an unexpected loss has caused the video to leave slow start too early, after which traditional TCP slowly senses capacity using congestion avoidance. So the problem here is the spurious loss. The way to solve this problem is first to try to fix the cause of that spurious loss.
Q5. Georgios Karagiannis: How can we support 4K video getting up to speed quickly, given there is demand for a solution?
Q6. Mirja Kuehlewind: The main problem is the assumption on slide 4: the assumption that the network can take the target rate is simply not true, and that is why we have congestion control.

==Jinzhu Wang, "Combining TCP with coding in wireless networks"==

#2-3: Problem statement: TCP mistakes bit-error loss for congestion packet loss.
#4: Use e2e coding to erase packet loss; then don't do RTO (an illustrative sketch of such coding appears at the end of these minutes).
Q1a. Gorry Fairhurst: You say you're not doing RTO, but do you do a congestion response?
A1a. No.
Q1b. Gorry: So I assert that this is no longer TCP.
#9: Interworking with standard TCP.
Q2a. Brian Trammell: Which bit do you need for identifying that you are using this coding?
A2a. A TCP reserved bit is the current idea, but undecided.
Q2b. Brian: Depending on which bit, it will be either very hard or impossible to deploy.
Q3. Brian Trammell: Why wedge this into TCP? Suggest you try to build this over UDP. Shameless advertisement for the SPUD BoF.
Q4. Stuart Cheshire: Maybe the loss that you are assuming ought to be the thing that gets fixed here. If WiFi is unreliable, that should be fixed in the WiFi Alliance etc., not e2e. Individual links have to be 'largely' reliable.
Q5. Andrew McGregor: Only 1% loss on WiFi? That's really good. It's probably because the underlying link is already repairing a lot of losses. Agree with Stuart that this coding should not be munged into TCP, which has run out of extension space. Perhaps QUIC instead?
Q6. David Black: Agree with Stuart. You're going to break the Internet with this. Stop it.
Michael Welzl as chair: Will have to wrap up with one slide:
#15: Results.
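As an aside on the end-to-end coding idea in this session, the sketch below shows the simplest possible erasure code: one XOR repair packet per block, letting the receiver reconstruct a single lost packet without a retransmission. This is a generic illustration under assumed parameters (block size, equal-length packets), not the scheme in the draft, and it says nothing about the congestion-response concerns raised in the questions above.

```python
from functools import reduce

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_repair(block):
    """Repair packet for a block: XOR of all data packets (equal length assumed)."""
    return reduce(xor_bytes, block)

def recover(received, repair):
    """Reconstruct a single missing packet from the survivors and the repair packet.

    received : list of packets with exactly one entry set to None (the lost packet).
    """
    survivors = [p for p in received if p is not None]
    missing = reduce(xor_bytes, survivors, repair)
    return [p if p is not None else missing for p in received]

if __name__ == "__main__":
    block = [b"pkt0....", b"pkt1....", b"pkt2....", b"pkt3...."]
    repair = make_repair(block)
    lossy = [block[0], None, block[2], block[3]]  # packet 1 lost on the radio link
    assert recover(lossy, repair) == block
    print("recovered the lost packet without retransmission")
```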