TCPM WG Meeting - IETF 68 - Prague
Tuesday, March 20, 2007, 17:40 -- 18:40

Note takers: Pasi Sarolahti and Gorry Fairhurst
Acting chair: David Borman (for Mark and Ted)
Jabber watcher: Lars Eggert

(My thanks to Pasi and Gorry for taking good notes, these minutes are
mainly just a merger of their notes. -David)

Agenda bashing -- no comments

WG status

* anti-spoof: Passed WGLC, currently in IETF last call
* syn-flooding: in seemingly good shape, intent to have WGLC
* soft errors: seemingly good shape, intent to have WGLC
* rfc2581bis: authors believe issues have been addressed
  - for Draft Standard we need implementation reports
  - Mark to survey implementors -- implementors are asked to contact
    Mark
* ecn-syn: Sally working on simulations to get clarity on what is the
  right response to an ECN-marked SYN-ACK. Sally and Mark have some
  disagreements on what the response should be.
* tcp-auth: There is WG consensus on developing a TCP-MD5 replacement.
  TCPM to do the transport protocol work. A design team, chaired by
  Steve Bellovin, has been established to produce a -00 version from
  two competing drafts. After that the usual WG process will take over.

User Timeout -- Lars Eggert
draft-ietf-tcpm-tcp-uto-05.txt

* Lars and chairs did not remember if this is going for experimental
  or proposed. Tim Shepard has preference for experimental.
  - Tim Shepard: in San Diego asked this explicitly from chairs during
    agenda bashing, they said the plan is for Experimental.
  - Lars: ok
* Status
  - TCP connection dies if there is a long period of disconnection
    without ACKs coming in, relates to mobility, etc.
  - Draft proposes one way to solve it, a TCP option to signal the
    appropriate timeout
  - ver -05 has clarifications, Ted Faber commented earlier that the
    language needs to be consistent, lot of overload on UTO value.
    There was also one other issue during author discussion.
* Changing to make it explicit what we talk about -- three pieces of
  connection state:
  - "enabled": whether UTO is enabled or not
  - "local_uto": local UTO in use, system-wide default, or optionally
    applications can tell the TCP stack what to use
  - "changeable": controls whether local UTO may be changed based on
    incoming options. Default is true unless application sets
    local_uto, false if application has set the local_uto.
* Last issue
  - Distinction between the UTO I am currently using vs. last sent UTO
    information. Ability to shorten the local UTO depends on
    maintaining last_sent_uto information. If we don't need to shorten
    the local UTO, the implementation becomes simpler. Does the WG
    consider this important?
  - There will be a revision, then WGLC
  - Gorry Fairhurst: Is there a real use-case for reducing the UTO? In
    the last meeting, we discussed the issues of having too small a
    value. So, things are simpler if we can only make this bigger.
  - Lars: That is one option. A TCP may wish to shorten this to release
    resources. Use case for short UTO could be a busy web server, for
    long UTO it is periods of disconnectivity. Fernando would like to
    keep short UTO, Lars would like to remove. No one has requested
    short UTO, no one has opposed.
  - David Borman: if we don't have the ability to shorten it, does it
    affect the web server? If the server uses a shorter timeout and
    closes the connection, the client would know about it anyway.
  - Lars: it wouldn't change anything on the web server. Would give the
    other guy ability to shorten timeout, if a disconnection happens.
  - Lars: server can locally timeout while client might keep trying for
    a longer time vs. client would stop immediately.
* Next version needs to choose which way to go, then we will need
  committed reviewers.
  - Do the new variables clarify things?
  - Do we need to be able to reduce the UTO?

  Two volunteers:
    Anantha Ramaiah
    Arjuna Sathiaseelan

TCP-secure -- Anantha Ramaiah
draft-ietf-tcpm-tcpsecure-07.txt

Improving TCP robustness.
Has been around as a WG document for two years, would like to go for WG
last call. Provides mitigations to known security issues. Has been
thoroughly discussed on list. Has been running in the Internet for two
years. No known serious issues.

Changes:

* Mitigation recommendations changed from MUST to SHOULD, after a
  comment by Ted Faber, would have made existing TCP implementations
  non-compliant with this.
* Security considerations text rewritten.
* Got many comments: some public, some private.

Data injection mitigation: SHOULD or MAY?

* explaining data injection mitigation specification
* implementations can choose to hard code the value of max.snd.wnd
  mitigation. Increases robustness to FIN attack.

Comments? Ready for WGLC?

* Joe Touch: All documents need to be careful with the use of SHOULD -
  it means most implementations will do this, but there could be cases
  where a specific implementor will not do this. We SHOULD say when
  (under what conditions) this is NOT OK to implement. Perhaps we could
  say a SHOULD (or even a MUST) for all routers.
* Joe Touch: Raising a comment from the list -- this is a general
  statement for all documents: What SHOULD means, and what it allows?
  There seems to be consensus about how it is documented in RFC 2119. I
  would be ok with SHOULD or even a MUST with regard to TCP
  implementations on routers. Requiring SHOULD without caveats to all
  hosts in the world, it is not compliant to update. On general host it
  is MAY.
* Scott Bradner: This interpretation is correct. The idea of SHOULD is
  saying this is to be done. You may not imagine there is a good reason
  for not doing this, (but you may not know all the possibilities at
  the time the RFC is published) - the aim is to do something to make
  the system work.
* Tim Shepard: clarifying question to Joe -- why did you say router?
  There could be end hosts and all sorts of places where you run
  long-lived connections that find this useful to improve robustness.
  Router is a wrong distinction.
  If you said BGP that might make more sense.
* Joe: because in routers TCP attack would be reasonable and likely.
* Tim: There could be applications on end hosts that may need this. Not
  just "router", it could be that we really mean a BGP session.
* Anantha: Distinction to router is gone. We have routers that can use
  TCP for all kinds of purposes, HTTP, voice-over-IP, all kinds of
  stuff.
* Joe: It may be useful for end hosts that have long-lived sessions
  where both ends are likely to be known.
* Tim: It is needed on any system that may in the future need
  robustness.
* Joe: ... yes, but in cases where the host fails to implement "proper"
  security.
* Lars Eggert: Note that previous discussions considered the document
  also has an IPR statement. The WG needs to include consideration of
  this in the decisions.
* Anantha: why mention it now?
* Lars: WG needs to include the existence of IPR in its decision. Draft
  has a long history, and we shouldn't forget about the IPR.
* Mark Allman - via jabber: The IPR point is that the IPR could have an
  impact on MUST v. SHOULD v. whatever, i.e., I would personally be
  against saying a TCP MUST implement tcpsecure, because it has an IPR
  statement.

F-RTO update to proposed standard -- Markku Kojo
(RFC 4138)

Updating F-RTO to proposed standard. No revised protocol specification
available, just an evaluation report. Small modification in TCP sender
algorithm, allows detecting a spurious RTO.

* Experimental RFC since Aug 2005.
* A number of known implementations.
* Experimentations with all major implementations show encouraging
  results.
* Interest to promote has been expressed already earlier.
* Last IETF we were asked to write a document to evaluate & show it is
  not harmful. Material is available on a web page, but not yet in the
  repository.

First question: is F-RTO useful?
* showing time-sequence graphs of normal TCP behavior on delay spike
* full window of segments unnecessarily retransmitted, wastes requests,
  breaks the packet conservation principle.
* Problem in a mobility case of moving from WLAN to GPRS environment.
* Problem is not about causing congestion, but about performance of an
  individual TCP flow
* Presenting time-seq diagram of case using F-RTO. If two segments
  acknowledge new data, can declare timeout spurious and continue
  sending new data. Avoids unnecessary retransmissions, and
  additionally can take RTT samples from the delayed segments.

Can F-RTO be harmful? No

* If the RTO is not spurious, or F-RTO cannot detect that it is
  spurious, the sender reverts back to the traditional RTO recovery.
  Exactly the same number of segments are transmitted as with normal
  RTO recovery.
* There are a few corner cases where F-RTO can declare RTO spurious
  even if there are packet losses. It would be harmful if congestion
  control response was aggressive. If congestion window was not halved
  in response to spurious RTO, it should be ok.
* Few known scenarios
  - 1: loss of unnecessary RTO retransmission. Quite rare situation to
    happen
  - 2: severe reordering
    (Mark Allman, via jabber: You have only ten minutes, get to the
    point)
  - 3: malicious receiver
  - Might be harmful if congestion control response is reverted, but
    proposing that congestion window is reduced in response to spurious
    RTO, in which case false positives are not harmful.

Next steps

* Revise RFC 4138, targeting Proposed Standard. Specify basic algorithm
  only, and TCP only, leave out SCTP because there is no implementation
  experience. Leave out SACK-enhanced variant, because it has only
  limited benefit.
* What to do with response? The draft does not specify any response in
  the original RFC.
  Options are
  1) do not specify response
  2) specify conservative default response, or
  3) specify a conservative response in the new draft
* Recommend implementing conservative response
* Anantha Ramaiah: how does it cooperate with other related
  enhancements: Eifel, DSACK?
  - e.g., if all three coexist.
* Markku: Does not require additional information. If timestamps have
  been enabled, Eifel can be used without problems. DSACK does not
  prevent unnecessary retransmissions.
* David: Continue discussion on the mailing list. Please indicate if
  you want or don't want this to be promoted from experimental to
  proposed.

Non-WG Drafts

Identifying Cheating Receivers -- Toby Moncaster
draft-moncaster-tcpm-rcv-cheat-00.txt

First presentation of a new draft

* TCP senders rely on accurate feedback from receivers
* Dishonest receiver can do optimistic ACKs, causing sender to transmit
  at higher rate, conceal lost data, harmful for congestion control
* Some existing proposed solutions:
  - Randomly skipped segments
  - ECN nonce
  - Transport layer nonce in TCP headers

Listing 7 key requirements for solution

* Joe Touch: Proposes re-phrasing "Test should not harm innocent
  receiver": Anything that network could have done cannot be
  interpreted as malicious. Sender should not be allowed to do anything
  that the network couldn't have done anyway. That allows to develop
  the solution rather than just mess up the receiver.
* Joe: The test should not harm an innocent receiver, i.e. anything the
  network may have done accidentally should not be seen as malicious.
* Joe: It also should only do what the network would normally allow a
  sender to do.

Assessing proposed solutions

* Table evaluating the earlier solutions
* Joe: should have a separate column in the draft, or a separate row,
  for "works only with certain options". Otherwise I like the table, it
  should also be in the draft.
* Toby: it already is

Our proposed solution

* Based on Rob Sherwood's randomly skipped segments solution
  1.
     delay a segment by small amount
  2. delay segment until duplicate ack is received.

Graph of stage 1 test

Assessing stage 1 test

* Meets all requirements set, but does not strictly prove dishonesty.
* Joe: Should also say it works only with certain options.
* Tim Shepard: if receiver is using SACK, the test gives receiver
  chance to prove it works nicely with SACK. The table did not mention
  delayed ACKs.
* Toby: main gain of optimistic acking is to reduce RTT
* Joe: any indications that things mentioned in tcpsecure have
  relationship with this? For example if anyone wanting to cheat by
  bursting.
* Toby: need to look at that in detail

Stage 2 test

* Meets all requirements set, except "doesn't harm innocent receiver"

Conclusion

* Cheating receiver gets more resources than it should get. Could
  possibly cause congestion collapse

(Gorry needs to leave the room, other minute taker is gone)

* Anantha: what about senders using byte counting?
* Joe: should verify that there are no interactions with Nagle in cases
  when there are segments dropped, should check interactions with
  partial segments

TCP Response to Lower-Layer Connectivity-Change Indications -- Simon Schütz
draft-schuetz-tcpm-tcp-rlci-01.txt

Problem: TCP is unaware of what happens on lower layers. RLCI uses
generic indications from lower layers, avoids long idle time due to
repetitive RTOs. Connection stays idle after regaining connectivity
after a disconnection, because RTO has backed off.

Why RLCI?

* Provides generic approach to overcome problems
  - hand-overs
  - connectivity disruptions
* Bob Briscoe: The draft was not clear: is this about getting
  information from the local link or from the remote link?
* Simon: Defining the source of indications is out of scope, the draft
  defines the response to indications.
* Joe Touch: has been discussed before, problem with earlier approaches
  is that you are trying to get indications from the link layer,
  transitive to issue on putting this on the API layer.
  Tickling the end point deliberately, unreasonable thing to do.

What's new

Next steps

* Would like to start discussion on the mailing list
* Candidate for experimental.
* Soliciting discussion on the mailing list

Concluding the TCPM meeting

* Explicit reviewers sought for tcpsecure. Send mail to Ted and Mark if
  you are volunteering.
* Looking for people to read tcp-persist and comment on the mailing
  list