ICCRG session notes
IETF-97, Seoul
November 15, 2016, 350p-620p
---------------------------

Notetaker: David Black
Jabber Scribe: Stuart Cheshire
ICCRG chairs: Michael Welzl & Jana Iyengar

In general, see slides - these notes attempt to capture discussion in addition to slides.

-- Load Transient Awareness and AQM Algorithms, Ilpo Jaervinen

[discussion after presentation]

Paper describing this new Predict AQM is not yet publicly available.
Traffic used for testing: short flows, starting from empty link unless a previous flow is still active.
Have not tried turning off PIE's burst allowance for comparison - Bob Briscoe suggests doing that.
Suggestion (Randell Jesup): Explore interaction with delay-based congestion control schemes under development for rmcat.
Assuming idle link at flow start results in different behavior than partially occupied link - big overshoot on startup may be required to establish fairness with existing flow(s).
Koen De Schepper notes: L4S has reverse situation, wants to allow slow start with little or no marking.
RTT predictor is based on history of RTTs seen. More details are in paper.
Chair request: Send paper to mailing list when available.
This Predict AQM is looking for exponential increase behavior - not designed for sudden drop in available bandwidth, this work hasn't considered that case.

-- BBR congestion control, Yuchung Cheng & Neal Cardwell

Have been working on this for several years at Google.
BBR = Bottleneck Bandwidth and Round-trip propagation time

[discussion during presentation]

Comment (Stuart Cheshire): This is about not getting large RTTs when buffers are large by comparison to link bandwidth - at 3 minute RTT, human times out. Speaker (Neal) agrees - congestion control is a problem when a 3 minute RTT happens.
Comment (Tim Shepard): CUBIC and other loss-based congestion control algorithms have problems when buffering is smaller than BDP (Bandwidth Delay Product) even when not competing with BBR flow.
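
Editorial illustration of the two comments above (not presented in the session): a minimal sketch of how buffer size relates to BDP and to worst-case queueing delay. The function names and the example link parameters are assumptions chosen for illustration, not values from the talk.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes: link rate times round-trip time."""
    return bandwidth_bps * rtt_s / 8.0


def queue_delay_s(buffer_bytes: float, bandwidth_bps: float) -> float:
    """Time needed to drain a completely full buffer at the link rate."""
    return buffer_bytes * 8.0 / bandwidth_bps


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    link_bps = 10e6            # 10 Mbit/s bottleneck
    base_rtt_s = 0.040         # 40 ms propagation RTT
    buffer_bytes = 1_000_000   # 1 MB bottleneck buffer

    bdp = bdp_bytes(link_bps, base_rtt_s)
    extra = queue_delay_s(buffer_bytes, link_bps)
    print(f"BDP: {bdp:.0f} bytes")
    print(f"Buffer is {buffer_bytes / bdp:.1f}x BDP")
    print(f"Full buffer adds {extra:.2f} s on top of the base RTT")
```

A buffer many times the BDP lets a loss-based sender build a standing queue that inflates RTT (the first comment); a buffer smaller than the BDP is the shallow-buffer case raised in the second comment.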
ACM Queue paper on BBR will be publicly available in a few days.

[discussion after presentation]

Short flow latency improvement is both for BBR short flows and short flows that compete with BBR due to BBR's bottleneck queue length reduction.
Google's B4 network (connects their data centers) has shallow buffers, BBR runs much faster than CUBIC in that network because CUBIC behaves poorly with shallow buffers - comparison is on empty link, but has led Google to use fewer TCP sessions with BBR.
Thousands or 10s of thousands of flows share each bottleneck.
Request (Pat Thaler): Info on BBR behavior on mix of bursty and constant rate flows.
Suggestion (Randell Jesup): Explore interaction with delay-based congestion control schemes under development for rmcat. Speaker (Neal) agrees that additional input signals (e.g., delay) would be useful.
Flow synchronization behavior - they do attempt to probe Min RTT at about same time, coordination develops among flows so they do this at about same time, see paper.
BBR + ECN: Tricky issue - BBR currently ignores ECN CE marks (bad). Need to do something smarter, could pay attention to ECN marks.
BBR does not build really large queues - when seen by Google, they tend to be due to CUBIC or Reno.
BBR estimate is its fair share of path, not total bandwidth.
2% of time bandwidth tax to probe RTT - this could drive up latency of something like an RPC when it occurs during RTT probe.
Tom Herbert: 2% is an external constant - can this somehow be automatically determined? Answer: Yes, follow-up online.
RACK and TLP always used with BBR.

-- (Single-path) TCP congestion control coupling, Safiqul Islam

Intent is to publish this work via ICCRG - feedback wanted on mailing list.

[discussion after presentation]

ACK clocking is optional - if done, requires additional changes to TCP implementation.
Discussion about whether flows w/same IP addresses share the same path - they don't always due to ECMP and CGN, proposal addresses this with tunneling or by configuration (use when known).

-- Update on LISA, Michael Welzl

Abbreviated presentation due to lack of time. Hoping to see reviews on list.
LISA seems to help when flows share a bottleneck, minor impacts when they don't - seems better overall.

-- Malicious Oversubscribing in Multicast: Problem and proposed solution, Jake Holland

Goal: Multicast service with 1000s of video channels
Please read draft and provide feedback. Measurement work is in progress.

[discussion after presentation]

This is about what to do when multicast congestion control doesn't solve problem - Jake is assuming a compromised receiver and looking at circuit breaker to protect network.
Alternative: Police rate down? - Jake indicates that multicast ought to be doing this, and this mechanism aims to protect against link oversubscription when that doesn't work.
Q: What if receiver provides no feedback - can that be used to not send traffic? Leads to a discussion of WEBRC multicast congestion control. Off-line follow-up.

-- Skipped last agenda item on L3QCN due to extensive discussion in preceding TSVWG session, so presenter has plenty of feedback.