TCP Maintenance and Minor Extensions Working Group

Meeting : IETF 81, Tuesday 26 July 2011, 9:00-11:30
Chairs  : David Borman <david.borman@windriver.com>
          Michael Scharf <michael.scharf@alcatel-lucent.com>
Minutes : Andrew McGregor <andrewmcgr@gmail.com>
==============================================================

Note: David Borman could not travel to Quebec City. Therefore, Gorry
Fairhurst <gorry@erg.abdn.ac.uk> agreed to help with chairing the
session during IETF 81.



==============================================================
AGENDA

A. Working Group Status
B. Current WG Items
C. Non WG Items
==============================================================



A. Working Group Status
==============================================================

WG Status
Chairs
15 minutes



B. Current WG Items
==============================================================

Initial window increase discussion
http://tools.ietf.org/html/draft-ietf-tcpm-initcwnd-01
http://tools.ietf.org/html/draft-touch-tcpm-automatic-iw-01
Chairs
60 minutes

First question:
* Option A: Fixed upper limit to Proposed Standard (PS)
* Option B: Fixed upper limit to Experimental (EXP)
* Option C: Something else

Mirja Kuehlewind: Did the Google experiment pace the IW packets over
one RTT?

Jerry Chu: No, we didn't go for that complexity. Packet pacing is a
way to mitigate bursts. It is probably necessary if one goes much
beyond IW 10.
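
Illustration (not from the meeting): a minimal sketch of what pacing an
initial window over one RTT could look like, assuming the segments are
spread evenly across the RTT. The function name and parameters are
hypothetical and do not come from the Google experiment or any stack.

```python
def paced_send_times(iw_segments: int, rtt_s: float) -> list[float]:
    """Return send offsets in seconds that spread iw_segments evenly over one RTT."""
    if iw_segments <= 0:
        return []
    interval = rtt_s / iw_segments  # gap between consecutive segments
    return [i * interval for i in range(iw_segments)]


# Example: 10 segments over an assumed 100 ms RTT, one segment every 10 ms,
# instead of all 10 segments leaving back to back as a single burst.
times = paced_send_times(10, 0.100)
```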

Colin Perkins: My concern is the impact on latency-sensitive non-TCP
applications. The measurements seem to concentrate on TCP impact,
which appears to work well. The analysis doesn't seem to study VoIP
traffic, which is latency sensitive.

Matt Mathis: TCP's design goal is to cause queues, and the network has
to fix the queueing problems. The bigger problem is receive window
autotuning, and queues are already out there.  This is a drop in the
bucket compared to other issues.

Jerry: We have done some testbed studies, and we do find there is a
bit of queueing delay change. But it is only really visible in extreme
cases at high load.

Colin: This does not help VoIP.

Jana Iyengar: My concern over doing anything static is that IW 10
assumes there are deep buffers around that can absorb the burst. If
those buffers go away then IW 10 may start to lose. Jim Gettys is
looking at buffer bloat; this needs to be fixed. There is a goal to
reduce the buffering, and a fixed IW 10 may make that harder.

Lars Eggert: I don't think option A, i.e., fixed value to PS, is the
right goal, and option C, well, there are no other serious proposals.
So option B seems to be the right thing to allow the experience of
deployment.  A static value is well understood. If we want fixed or
adaptive... Adaptive would be really nice, but I don't see it being
implemented.  10 seems to be a reasonable value, and the potential for
harm doesn't seem that high if there are not many flows. I'd lean
toward 10 as EXP, and I would love to see an adaptive scheme
implemented as a long-term solution.

Kathleen Nichols: I would go for option A. I remember writing a draft
on this ~12 years ago, and would go to PS for IW 10. We can't solve
the problems of the universe by keeping IW small.

Scott Bradner: One of the things Jim Gettys is worried about is the
number of flows caused by a modern browser.

Wolfgang Beck: Serialisation delay on various kinds of lines may give
VoIP problems.

Colin: We don't want to have a discussion in a few years that we can't
reduce buffer bloat without breaking TCP that now requires IW
10. There are too many concerns for PS.

Jana: Option A or B make little difference if they're deployed. If it
goes to EXP, let's define what is needed to make it PS.

Andrew McGregor: An experiment report will be needed. Question: How
deep were the NIC queues in the Google experiment?

Jerry: Linux default, that is 1k packets.

Jerry: What is the meaning of the experimental status? What will
vendors do with EXP? Can they turn it on by default?

Colin: IETF can't control what implementers implement. We need to say
what the concerns are and ask vendors to evaluate them and provide
guidance.

Andrew: Each stack vendor has their own policy on experimental
features. Linux already deployed it upstream, so many devices and
distributions will be picking it up. Someone needs to do the checking
to see if this raises issues.

Michael Scharf: F-RTO was implemented when it was EXP. When it moved
to PS, some issues were fixed.

Jerry: Google has analyzed the results of 2 years. IW 10 is an upper
bound. The draft does not require using IW 10.
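
Illustration (not from the meeting): a small sketch of the upper-bound
idea, assuming the bound takes the form min(10*MSS, max(2*MSS, 14600))
bytes as I understand it from draft-ietf-tcpm-initcwnd, with the RFC
3390 bound shown for comparison. The function names are hypothetical,
and a sender remains free to start with less.

```python
def initcwnd_upper_bound(mss: int) -> int:
    """Assumed form of the proposed bound, in bytes."""
    return min(10 * mss, max(2 * mss, 14600))


def rfc3390_upper_bound(mss: int) -> int:
    """RFC 3390 bound, in bytes, for comparison."""
    return min(4 * mss, max(2 * mss, 4380))


# With a 1460-byte MSS the proposed bound is 10 segments (14600 bytes),
# while RFC 3390 allows 3 segments (4380 bytes).
proposed = initcwnd_upper_bound(1460)
current = rfc3390_upper_bound(1460)
```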

Matt: I observe all of these answers are wrong. In 10 years IW 10 will
be too small. The right answer is an adaptive algorithm and the values
are registered by IANA via the IETF. RFCs were initially seeking
comments. Vendors pick their risk profile. I don't believe transport
protocols are really standardised, the real standards are in the
stack.

Lars: PS means the IETF thinks that this is ready for global
deployment. I do not think we are there yet for IW 10. But I think
this is on the path towards being a standard very soon. EXP means we
think it's a bit risky, we want to verify that it doesn't cause harm,
and we may revisit this in the future.

Jana: Does the IETF have relevance here? Even if we go to EXP, given
that IW 10 may hurt other flows, can we then get stacks to reduce it?

Jerry: We have studied the friendliness issue and our studies show
that IW 10 does not impact other traffic.

Randall: ECN was experimental. It's not the end of the world. Vendors
don't seem to look at the document status. I think EXP would let us
experiment, at least via a sysctl, and do the experiment for 2 years.

Matt: Do you think it will go faster as PS or EXP?

Randall: I think this makes NO difference, people will implement it.

Michael Scharf: There are a bunch of experimental specs in tcpm.

Nandita Dukkipati: What are we still looking for to make IW 10 go to
Standards Track?

Lars: What about requiring ECN if IW is over 3? I think Google has
done the homework for their types of traffic - and this is really
good, I wish others would do this as well. So, I'd like to suggest if
you are turning this on for large volume systems, the IETF recommends
monitoring what happens for your users. I wonder if that sort of
statement would capture the concern.

Jana: I agree with Randall, EXP vs PS doesn't affect much. If we knew
the issues, this would be fixed in the draft. The Internet may reveal
new issues, and we need to know how to measure the impact. If it's
EXP, we need a way to look at the results, or else it's just silly.

Mirja: There seem to be too many concerns for PS, whereas EXP means we
encourage monitoring. I would like to see more experience of other
uses on the Internet.

Colin: I think Lars made a good statement. Turn it on, then
measure. Try and measure that it is not doing harm to other
applications. For example, see if it affects Google Voice.

Matt: The challenge is finding collateral damage. Page completion time
might be very sensitive to that. And we don't see collateral damage.

Jana: Those are still web pages. What about VoIP or video streaming?

Matt: There's a second-order control system, which is content
providers sharding their content to raise the initial congestion
window. They have somewhat usurped the IETF's ability to control this
because IW assumes unsharded applications. There are certainly
applications out there that fail on certain links. Delay sensitive
apps can't share a queue with a working TCP, and that can't continue
to block TCP improvements.

Nandita: The closest approximation to VoIP we had was one packet
responses, and we saw no impact on those in traces.

Lars: I'm not saying Google people should do more, you came along with
a record amount of data, I'm saying OTHER large sources of traffic
should do the same when and if they turn this on.

Gorry: I concur with Lars, and the experiment is that other sources
should do the same work.

Mirja: We should allow a socket option to let the app choose.

Matt: Increase or decrease?

Mirja: That depends on what the rest of the document says.

Show of hands:
* PS: 3
* Exp: ~20 + 1 on jabber
* Something else: 0

Second question:
* Option A: 10 MSS
* Option B: Other value or don't know

No discussion

Show of hands:
* 10 MSS: ~7
* Other value or don't know: 1

Third question:
* Option A: draft-touch-tcpm-automatic-iw-01
* Option B: No adoption

Andrew: I don't know that this is anywhere near adaptive enough for cell
phones and laptops. The mechanism might be too slow.

Gorry: As an individual, I agree with Andrew. The scheme doesn't seem to
be adaptive enough. What measures do we have to know what damage we
are causing? We may need to be more conservative if things go wrong.

Michael Tuexen: This should be protocol independent. All transports
should do it, not just TCP.

Colin: The chairs should ask first in principle whether there is
interest in this topic.

Matt: This might be a topic for the IRTF.

Lars: Is there any implementer backing? I'm going to abstain, because
I don't see that, but we can hope.

Show of hands:
* Interest in principle in an adaptive solution: ~8 + 1 on jabber
* Disagree: 0

Show of hands:
* WG adoption of draft-touch-tcpm-automatic-iw-01: ~1.5
* Need for a different draft: ~6
* Who is willing to contribute: ~3

Chair: The consensus in the room is that EXP is the best approach,
that a maximum initial window of 10 MSS is a good proposal, but there
is no consensus on picking up draft-touch-tcpm-automatic-iw-01 as a WG
item at this stage. This last question needs more discussion and will
be taken to the list.



C. Non WG Items
==============================================================

Proportional Rate Reduction for TCP
Matt Mathis
http://tools.ietf.org/html/draft-mathis-tcpm-proportional-rate-reduction-01
30 minutes

Michael Scharf: Are the mentioned results published somewhere?

Matt: The camera ready version will be published on our web site.

Randall: Regarding the 40 packets being retransmitted: A long time ago
Sun implemented maxburst. SCTP implemented this. Have you thought
about that?

Matt: There is equivalent code in Linux, which we turned on, but the
implementation effectively reduces the congestion window, which
exchanges the burst for long term congestion control consequences. My
favourite answer is to avoid this.
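
Illustration (not from the meeting): a rough sketch of the proportional
part of Proportional Rate Reduction, assuming the form I understand from
the later RFC 6937, where the amount sent per ACK is
ceil(prr_delivered * ssthresh / RecoverFS) - prr_out. The -01 draft
discussed here may differ in detail. The class name is hypothetical,
units are whole segments, and the reduction-bound branch is omitted.

```python
class PrrSketch:
    """Proportional part of PRR only; all quantities are in segments."""

    def __init__(self, recover_fs: int, ssthresh: int) -> None:
        if recover_fs <= 0:
            raise ValueError("recover_fs must be positive")
        self.recover_fs = recover_fs  # data in flight when recovery started
        self.ssthresh = ssthresh      # target window at the end of recovery
        self.prr_delivered = 0        # data delivered to the receiver in recovery
        self.prr_out = 0              # data sent during recovery

    def on_ack(self, newly_delivered: int) -> int:
        """Return how many segments may be sent in response to this ACK."""
        self.prr_delivered += newly_delivered
        numerator = self.prr_delivered * self.ssthresh
        target = (numerator + self.recover_fs - 1) // self.recover_fs  # integer ceil
        sndcnt = max(target - self.prr_out, 0)
        self.prr_out += sndcnt  # simplification: assume all allowed data is sent
        return sndcnt


# Example: 20 segments in flight at loss, window to be halved to 10.
# Each ACK delivers one segment, so roughly one segment is sent per two ACKs,
# rather than stalling and then releasing a burst.
prr = PrrSketch(recover_fs=20, ssthresh=10)
sent = [prr.on_ack(1) for _ in range(10)]
```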

Michael Tuexen: Counting packets vs. reducing the congestion
window... If the burst is from the application, counting fails,
whereas the congestion window also handles that.

Gorry: Would this make the draft unuseful? If Michael's comment is
important, can we handle that?

Matt: That is a bigger change that affects more things and is harder
to evaluate.

Andrew: I don't see this draft being anything but good.

Jana: Mark Allman published a paper on a few such microburst
elimination techniques.

Nandita: I believe those techniques were all outside of loss recovery,
so they apply only in other situations.

Show of hands:
* Adoption as WG item: ~8 + 1 on jabber
* No adoption: 0

Chair: The consensus is that there is support for adoption as
experimental WG document. This will be confirmed on the mailing list.


TCP Fast Open Proposal
Jerry Chu
http://tools.ietf.org/html/draft-cheng-tcpm-fastopen-00
30 minutes

Mirja: Do you send the cookie for every connection, without knowing
whether the server supports this?

Jerry: Yes. By default for every connection, subject to the system
configuration.

Randall: In the example, how big is the page? Were you doing IW 10 for
these examples?

Jerry: Yes.

Gorry: In a mobile network, caching an RTT to seed the RTO is
actually quite problematic. In this case the RTT for a new SYN could
be quite different. Did you look at this?

Jerry: I agree. This data set does not look at this effect. We saw the
same problem with the minimum RTO. We would like to keep the RTO
low. 3 seconds for the RTO is not a good option.

Wes Eddy: TCP-CT Rapid Restart has been submitted for publication in
the independent stream. It is being reviewed right now.

Jerry: Has it passed the review? Because I don't understand it.

Lars: The TCP-CT document was by Bill Simpson. As far as I remember,
it is published as experimental, and uses the experimental option
numbers. IANA requires IESG approval or standards action to get a TCP
option number. The IESG policy says it needs to go to the working
group, and the working group can ask for option codepoints even for
experimental documents.

Wes: When the IESG reviews TCP-CT Rapid Restart, we need to know of
conflicts, and if this is to be IETF work that changes the answer.

Jerry: Does this need to be a PS to get a TCP option number?

Lars: This needs to be PS or it needs to be IESG approved. The WG
could ask for this for an experimental WG document.

Jerry: We want to ask for WG adoption. We've implemented it and it
seems to work well. We've addressed some of the attacks.

Michael Scharf: Do you have evidence of intrusion detection system
issues?

Jerry: No. We don't produce FIN+SYN packets.

Michael Scharf: Do you have data on middlebox reactions to SYN+Data,
i.e., how many SYN packets get dropped in real networks due to
middleboxes?

Jerry: About 10% in our study.

Michio Honda: I have performed similar experiments. We sent SYN+Data
to the Alexa top 500 list. We found one middlebox that replied with
RST.

Michael Scharf: How does this interact with IW 10? If you do the
SYN+ACK, the connection setup delay provides some protection against a
congested path, whereas SYN+Data will immediately drop an entire
initial window on the path.

Jerry: We haven't studied that yet.

Michael Scharf: It would be worth understanding if this hurts.

Jerry: OK, if it's adopted.

Michael Scharf: If you're doing happy eyeballs, i.e., using the SYN to
probe for IPv6 versus IPv4, to some extent you're bypassing this.

Jerry: There has to have been a previous connection that delivered the
cookie, so that's fine.
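
Illustration (not from the meeting): a minimal sketch of the cookie idea,
in which a server hands out a cookie on an earlier connection and later
accepts data in the SYN only if the presented cookie validates. The
minutes do not describe the draft's cookie construction, so a truncated
HMAC over the client address is used here purely as a stand-in, and all
names are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional

SERVER_SECRET = os.urandom(16)  # hypothetical per-server secret key


def make_cookie(client_ip: str, secret: bytes = SERVER_SECRET) -> bytes:
    """Derive a cookie bound to the client address (stand-in construction)."""
    return hmac.new(secret, client_ip.encode(), hashlib.sha256).digest()[:8]


def accept_syn_data(client_ip: str, cookie: Optional[bytes],
                    secret: bytes = SERVER_SECRET) -> bool:
    """Accept data carried in a SYN only if a valid cookie is presented."""
    if cookie is None:
        return False  # no cookie yet: fall back to the regular handshake
    return hmac.compare_digest(cookie, make_cookie(client_ip, secret))


# First connection: regular handshake, the server returns a cookie.
cookie = make_cookie("192.0.2.1")
# Later connection from the same address: SYN+Data can be accepted.
ok = accept_syn_data("192.0.2.1", cookie)
```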

Gorry: What is the impact of RTT variability?

Jerry: We believe the impact is small, one or two additional retransmits.

Gorry: What is the minimum RTO?

Jerry: 1s, standard.
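
Illustration (not from the meeting): a sketch of the standard RTO
computation from RFC 6298, to show how a minimum RTO interacts with a
measured or cached RTT. The class name and default parameters are
assumptions; the 1 s minimum mirrors the value mentioned above.

```python
class RtoEstimator:
    """RTO estimation following RFC 6298 (alpha = 1/8, beta = 1/4, K = 4)."""

    def __init__(self, min_rto: float = 1.0, initial_rto: float = 1.0,
                 granularity: float = 0.001) -> None:
        self.min_rto = min_rto
        self.granularity = granularity  # clock granularity G, in seconds
        self.srtt = None
        self.rttvar = None
        self.rto = initial_rto  # RFC 6298 uses 1 s; RFC 2988 used 3 s

    def on_sample(self, r: float) -> float:
        """Feed one RTT measurement in seconds and return the updated RTO."""
        if self.srtt is None:
            self.srtt = r
            self.rttvar = r / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - r)
            self.srtt = 0.875 * self.srtt + 0.125 * r
        rto = self.srtt + max(self.granularity, 4 * self.rttvar)
        self.rto = max(self.min_rto, rto)
        return self.rto


# With a 1 s minimum, a 100 ms RTT sample still yields an RTO of 1 s.
# With a hypothetical 200 ms minimum, the same sample yields about 300 ms.
standard = RtoEstimator().on_sample(0.100)
lowered = RtoEstimator(min_rto=0.2).on_sample(0.100)
```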

Show of hands:
* Read the document: ~7
* Adoption as WG item: ~4
* Who is willing to contribute: ~4-5

Chair: The consensus is that we will take this to the list. But we
note that there is some interest in the room.