TCPM meeting minutes
IETF-90, Toronto, Canada
Monday, July 21, 2014, 0900-1130

Chairs:
  Yoshifumi Nishida
  Michael Scharf (remote)
  Pasi Sarolahti (remote)

Note takers:
  Michael Welzl
  Stuart Cheshire

============================

* TCPM status
  chairs
  10 mins

Wes Eddy: Made updates to draft-eddy-rfc793bis: going to XML format, putting in errata; third version: putting in urgent pointer changes. I was hoping to bring it in as a WG doc at some point, sometime soon.

Working Group Items
-------------------

* draft-ietf-tcpm-rtorestart
  Anna Brunstrom
  15 mins

Stuart Cheshire: I think this is good work and useful. Do you have measurements of how many unnecessary retransmissions it causes? In some sense it's unsurprising that if you retransmit sooner, you complete sooner. You could get the retransmission time even lower and you'd have more spurious retransmissions. Do you have data? This allows implementers to make an informed cost/benefit decision.
Anna: I don't have the numbers, but the issue is really in the second case, because in the first and third scenario where you have a controlled RTT in the network you don't get spurious retransmissions. In the second, you get some but you can't see any negative effect of them because it doesn't hamper your performance.
Stuart: The negative effect is that it's bad for the network, getting in the way of everybody else. There is a balance there.
Anna: The other problem is that it could hurt your own flow.
Stuart: Anytime you're wasting capacity you hurt yourself or someone else.
Anna: I meant because you reduce your cwnd. This has been a discussion also before on the list.
Stuart: My suggestion is that when you present this data you should also mention what the risk of spurious retransmissions was. Speaking as a vendor, if we were to put this into the iPhone, what is the consequence in terms of the customer's bill going up because of retransmitting unnecessarily? Both sides must be known to make an adoption decision.
Jana Iyengar: What happens when you do TLP and RTOR at the same time? It seems that maybe the benefits should be additive, but I don't think they are.
Anna: They work together but the benefits are not fully additive. Sometimes TLP would trigger when RTOR would be better to recover. There's no problem in having them enabled together.
Jana: Have you tried them together?
Anna: Yes. There is no problem.
Yuchung Cheng: Linux has a different RTO than the standard, like a minimum RTO of 200 ms instead of 1000 ms, and repairing incorrect reduction of cwnd. It would be good to show how RTOR works with other TCP stacks without these Linux refinements. The data should include some data on a stack that follows the IETF standard. You considered a TCP stack that can detect a spurious retransmit so the damage would be minimized, but for a stack that doesn't support this feature, it would be good to know the harm.
Anna: Yes, but for the type of send patterns that we have here, when we talk about the losses at the tail, we don't have this problem. You're talking about a burst of data, then after a period of silence we start again.
Yuchung: Just saying it's good to include additional data.
Anna: So a completely different send pattern.
Yuchung: Yes. I'm not arguing that the data you presented is flawed, just saying this would be good extra data to see. Retransmitting spuriously can reset cwnd to 1 with stacks that don't do spurious retransmit detection.
Anna: There is some discussion in the draft. That's why it's experimental.
Yuchung: You're presenting the good side of the data. It would be good to see the downside. Based on Google measurements 30% of retransmissions are spurious, so this could be quite harmful.
Chair: Wait for submission of new draft.

* draft-ietf-tcpm-newcwv
  Gorry Fairhurst
  10 mins

Yuchung: Linux uses a different algorithm to detect cwnd limited. We recently submitted a patch. What are your comments about the difference between the two? How does this interact with TSO?
Gorry: I don't know it.
Yuchung: Let's take it offline.
Gorry: Is it likely to change the spec?
Yuchung: We believe our idea is probably simpler. Yours is more difficult to implement. Linux uses TSO, which creates a different way to see if cwnd is limited. These are interesting issues to consider which are not addressed in this draft.
Gorry: How ready is the WG to take this forward? Obviously this discussion has to happen first.
Chair: What about comments from Mark Allman?
Gorry: He made the draft much more readable, the comments were very helpful, but they are included in this version; there are no outstanding comments from Mark. He has agreed to read the current version, so we should wait for his feedback.
Chair: How many people read the new draft? (approx 3 hands) How many people think it's ready for WG LC? (none? or few..)
Lars Eggert: It had a WG LC, and we're doing a second one?
Chair, Gorry: Yes.
[chairs' offline correction note: No WGLC has been called for draft-ietf-tcpm-newcwv so far]

Individual Drafts
-----------------

* draft-touch-tcpm-tcp-edo
  Wesley Eddy
  10 mins

Brandon Williams: If a middlebox supports this option, it can't know if the end destination supports it before it sees that the data flow has been established.
Wes: Good point, it's a different case than what we've been thinking about.
Yuchung Cheng: A receiver can respond with more than 4 SACK blocks; what happens if we use this with, say, 20 SACK blocks? Is there anything that would conflict with an earlier spec?
Wes: SACK would come after this option, so you're only limited by the size of the length field; you've got plenty of room for way more than 20 SACK blocks.
William Adamson: When a middlebox splits a segment, should this option be replicated across all segments?
Richard Scheffenegger: The current version has no specific wording on what a middlebox that understands this option should do.
Wes: Focus has been on detecting middleboxes that cause trouble and not on middleboxes that are trying to be helpful. But great point.
Chair: Adopt as PS?
Speak up if you disagree.
Bob Briscoe: Not exactly disagree. But asking whether going straight to Proposed Standard makes sense; the whole thing of middleboxes etc. sounded a bit experimental to me.
Chair: After some discussion we can change the status of the draft.
Lars: Would be happier with Experimental for now, it's new. Since we have another proposal in the SYN option space, good to know if they can work together.
Wes: Could be coupled.
Lars: Some coordination required between the two.
Wes: I wouldn't say required.
Bob: SYN should say “You must support EDO and you must understand it”.
Lars relaying from Jabber, Toby Moncaster: It would seem interactions are one reason that this should be Experimental.
Lars relaying from Jabber, Michael Scharf: Basically the same constraints as in the last meeting apply: running code. But this can be resolved before last call.
Chair: I think it's almost ready; if you agree with adoption raise your hand. Consensus is strong, almost 20 hands. Opposite: none.

* draft-zimmermann-tcpm-reordering-detection
* draft-zimmermann-tcpm-reordering-reaction
  Alexander Zimmermann
  10 mins

Yuchung: Data presented on slide 9 is really interesting. Reordering patterns look alike. At Google we see two styles of reordering: very fine grained reordering, e.g., 2,1,4,3,6,5, and coarse reordering, e.g., a whole chunk of segments are all delayed by 40 ms. Do you see that pattern in your data?
Alex: I see more of the second type you mentioned in my data, larger timescale.
Yuchung: Sharing more data on that would be great. Also not just sender side relevant, but receiver: how to handle super-high reordering? The receiver usually has a simple FIFO queue, but with really high reordering the receiver stack won't be able to handle this with high performance. Could be interesting issues to discuss in your draft as well.
Karen Nielsen: Last slide said there is clear consensus that we need to handle reordering. Where was that consensus established?
Alex: Comment based on meeting notes from the last TCPM.
Karen: I don't think there was a clear consensus there.
Chair: Your draft is just one solution. There was some agreement that we need to solve reordering.
Jana: Do you have the same data with time on the x-axis, not bytes? Bytes can be a function of cwnd; it would be interesting to see as a fraction of RTT how much reordering is introduced.
Alex: This is only reordering that didn't last longer than an RTT. Explained in the draft. If you wait longer it's highly possible that you run into an RTO. We only delay fast retransmit until 1 cwnd, which is 1 RTT. Based on the data that we have I see it's enough to wait less than an RTT. We don't see reordering that's much longer than one RTT.
Jana: Would be good to see that chart if you have it.
Alex: Not here now, but no problem to generate it.
Karen: Clear consensus that TCP should be fixed to take care of reordering?
Chair: We didn't measure consensus yet. Show of hands on “should we solve reordering” [roughly, not exact formulation] showed many hands.
Brian Trammell: What do you mean by “solve reordering”? The space for solving it is much larger than this proposal.
Gorry: We should start solving the reordering problem now. Guidelines for tunnels and lower layers not to reorder should remain, but we should improve TCP to relax this constraint over time.
Michael Scharf from Jabber: I disagree that the WG is ready to work on reordering. Need a workable solution draft first.
Lars: Do you believe that Alex's draft is not a workable solution?
Michael Tuexen: Are you only looking at TCP-specific solutions or are you considering other transports too?
Lars: The question I would ask is: do we want to make TCP more robust against reordering? The second question depends on answering the first one positively. In Jabber John Leslie said he's against working on reordering.
Chair: We didn't reach consensus yet.
Matt Mathis: The only way to do it at really high speed is to do sharding at the low layer. Consequences of keeping routers from reordering are bad. It is a delusion that the network doesn't reorder. We need to make all protocols more tolerant of reordering, end of story.
Stuart: Question should be “should we work on the reordering problem, should we try to fix it”.
Karen: I wonder if this is the right group to make such a call.
Gorry: +1 to Karen's comment. If we fix it in TCP we have to fix it for everything else in the long term. But this is the right place to ask about TCP.
Lars: For this WG it's TCP that matters. I think we should work on this. If we can make TCP more robust against some kinds of reordering that's better than nothing. Whether all other protocols should be made robust against reordering is a different discussion; in this WG we don't have to wait for that broader discussion.
Brian: +1 to all Lars said. We can solve the problem for stream based transports. We can solve it for other protocols. TCP is by far the most used protocol, but if the question is whether we can look into general approaches for reordering tolerance, then that's a good way to frame this; I support doing that work in this WG.
Jim Roskind: Fixing is wrong but improving is a good idea. I work with a TCP-like protocol, and 6% of the time we get reordering in excess of the RTT, and 0.5% reordered beyond 100 ms. Interesting to realize how severe this is.
Dave Oran: I have an idea of how many gates are needed in switches to work around reordering. A major part of the gate cost in EtherChannel is to prevent reordering. Reducing this burden would be good.
Chair: Initiate discussion on mailing list.
Lars: I heard consensus that we need to start working on making TCP more robust against reordering.
Martin: Do the consensus call right now.
Q: How many people think we should make TCP more robust against reordering: ~30. Disagree: none.
Lars: 3 positives, 2 negatives in Jabber room.
Karen: In order to find out what the solution should be we should understand what the problem is: what kind of reordering do we want to make TCP robust against?

* draft-zimmermann-tcpm-undeployed
  Alexander Zimmermann
  5 mins

Yoshifumi: What happens if there is a normative reference to a historic draft?
Lars: We should check which standards track documents cite these RFCs. Cookie Transactions 6013 is independent stream. Can the WG make an independent stream document historic?
Wes: 896 has Nagle. We may want to be referencing that one. We should fix … in terms of Nagle, we could try to fix it to what Linux implemented, Greg Minshall's version of Nagle, an ancient draft that can be quickly revived. Either obsolete that by another document, or make it historic here.
Gorry: RFC 816: half of it is already regarded as overruled by the IETF.

* draft-nishida-tcpm-apaws
  Yoshifumi Nishida
  15 mins

(Wes Eddy filling in as chair)
Lars: Many of our connections at NetApp transfer more than 4 GB. If I need the extra option space for a security option, what do I do once a connection hits 4 GB?
Tim Shepherd: If I open a connection somewhere and send some data, I need the timestamps on every packet from the very beginning, else I have no way of knowing if this was an old packet.
(next slide)
Tim: Nothing in the Internet forces MSL. I unplugged a cable and went to lunch, then plugged it in again and all packets made it.
Karen: In a lot of big servers today we do need to use SO_REUSEADDR. Need to support other connections, need to not use too many IP addresses.
Matt: Should have TCP state to see wrapping of TCP state.
(slides continue)
Richard: Minor clarification. In 7323 SHOULD is replaced with MUST; the check must now be done if the timestamps check IS enabled. Now a much stronger check. 1323 didn't have strong wording, so stacks didn't check for invalid timestamps; in my opinion that's a flaw in these stacks, and we've cleared up the language.
Lars: Same feedback as before. Nice for short connections.
Doesn't actually help for long ones, doesn't actually free up option space.
Yoshifumi: To some extent.

* draft-kuehlewind-tcpm-accurate-ecn
  Bob Briscoe
  15 mins

Bob: Does anyone care about delayed ACKs?
Matt: There are a lot of devices, esp. in the low bandwidth areas, that delay ACKs for you and do that for a reason.
Bob: It would just be simpler if we didn't have delayed ACKs.
Matt: For a variety of reasons ACKs can be not transmitted.
Andrew McGregor: There are places on the edge of the net where the uplink is not provisioned for all the ACKs.
Yuchung: Stretch ACKs need to be kept, useful in high bandwidth cases.
Richard: Should ECN markings be returned only for data packets, or should control packets be included? In my opinion, going for the latter allows for future extensibility. Should be another point to ask the WG to discuss.
Bob: Want to do a generic receiver.
Mirja via Jabber: Would it be possible to adapt the ACK rate dynamically in those situations?
Bob: This probably meant the delack question.
Michael Welzl: In favor of using the urgent pointer, not just because of complexity but also reliability. There are constraints for the counter based solution; this can create problems with future congestion controls.
Bob: OK. Would like to ask for adoption.
Chair: Need more feedback on the mailing list. Should initiate discussion on list.
Bob: Will also involve DCLC IRTF group.

* draft-moncaster-tcpm-rcv-cheat
  Bob Briscoe
  10 mins

Dave Oran: When would you run this test?
Bob Briscoe: All the time, or randomly, or just when under stress. Could be done by sender, or by a middlebox.
Jana: Last slide says minimal problems. Does that mean no problems or just minimal?
Bob: The only problem is you get a slight delay.
Jana: Broader comment. This could also ossify the receiver - the receiver's mechanisms are then stuck to this test.
Bob: Good way to think about it.
Dave Taht: Do you have source code?
Bob: No, just checking if there is interest here.
* draft-touch-tcpm-tcp-syn-ext (to be submitted on July 21st)
  Bob Briscoe
  10 mins

Lars: Dual-SYN is going to double the NAT bindings you get.
Wes: David Borman is doing a draft that's really simple on extended SYN, turns the 3-way handshake into a 4, seems to be nice. We don't actually agree that the last two bullets on slide 7 are accurate.

--- end of meeting ---