Multipath TCP (MPTCP)

Meeting : IETF77, Tuesday March 23, 2010, 15:20-17:20
Location : Anaheim, California C
Chairs : Philip Eardley, Yoshifumi Nishida
AD : Lars Eggert
URL : http://tools.ietf.org/wg/mptcp/
Note Takers: Gorry Fairhurst, Andrew McGregor

------------------------------------------------------------------------------------
1: Announcement from chairs

There will be an implementers' workshop at Maastricht (July 24th). There was sufficient interest from OS vendors to organise this workshop. There is a call for participants, especially from the applications community.

------------------------------------------------------------------------------------
2: Multipath TCP Architecture by Alan Ford
http://www.ietf.org/id/draft-ford-mptcp-architecture-01.txt

Scott: Will you document issues in the addressing?
Alan: Yes.
Alan: I would like detailed reviews of this.
Tim Shepherd: What does a "middlebox split" mean?
Alan: Where it sits in the path.
Yuri: Does it allow data ACKs to come on different paths?
Alan: It comes on one channel, but there are no specific requirements. The data ACK can work on any path.
Mark: My assumption is that data ACKs occur on all paths.
???: How do IPv6 and IPv4 work?
Alan: We can signal IPv4 and IPv6, and can't see why there could not be subflows using both network layers.
Tim: Now you have the ACK inside headers versus a single ID last time; it's a good thing that opens possibilities.
Alan: We're trying to find a balance between having enough information to operate through as much of the network as possible and going too far.
Mark: The charter says this should be submitted in August; I think it should be kept open until the protocol document.
Jana: I support this, but suggest it would be useful if we get someone to review.
Philip: OK, we could keep this open and explore whether we could get a review.
------------------------------------------------------------------------------------
3: Multipath TCP Protocol by Alan Ford
http://tools.ietf.org/id/draft-ford-mptcp-multiaddressed-03.txt

Gorry: Speculating that we can do away with timestamp options; then we can use the data sequence number.
Mark: One of the thoughts is that this helps in fast networks and could be a way of reducing the need for the timestamp option, as an alternative method to do PAWS, but we have not worked through all the cases.
Marcelo: Can you expand on policy?
Alan: For example, what if the receiver has an expensive link and would much prefer the other to be used; how do you signal that?
Marcelo: What is the policy problem?
Alan: A sender can change the path; how does the receiver do this?
Marcelo: How do you work out how many ports, etc. to use?
Alan: This will hopefully be discussed on Friday.
Mark: Receiver policy requires wire changes, not just implementation. Is this also true for the sender case?
Marcelo: I don't know.
Eric Nordmark: What if two parties communicate, there is a private/public address conflict, and you try to contact the wrong host (e.g., the wrong private address)?
Alan: We have tokens at each end, and the "wrong" end will respond with a reset, or the token returned will not match.
Mark: There is another draft that proposes a solution.
Eric: There can be cases that are not covered and are very difficult to fix.
Mark: There are cases where you do not advertise MP capabilities because you do not know what's on the other side.
Eric: It depends on which addresses you check reachability for. I wouldn't use a combination of one inside and one outside pair.
Lars: Has anyone (besides the Linux work) started implementation?
Costin: UCL Belgium have an implementation.
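
A minimal sketch of the token check Alan describes in his answer to Eric above, assuming a toy model in which each host keeps a table of tokens for its own connections. The `Host` class, its method names, the 32-bit token size, and the "RST"/"ACCEPT" return values are hypothetical illustrations, not taken from the draft.

```python
import secrets


class Host:
    """Toy endpoint holding tokens for its own connections (hypothetical model)."""

    def __init__(self):
        # Maps a locally generated token to a connection label.
        self.connections = {}

    def open_connection(self, label):
        """Create a connection and return the token that identifies it."""
        token = secrets.randbits(32)
        while token in self.connections:  # avoid reusing a live token
            token = secrets.randbits(32)
        self.connections[token] = label
        return token

    def handle_join(self, token):
        """Decide what to do with a request to add a subflow.

        A host that does not know the token (for example, the wrong host
        behind a reused private address) answers with a reset.
        """
        if token not in self.connections:
            return "RST"
        return "ACCEPT"


intended = Host()
wrong = Host()  # a different host reached by mistake
token = intended.open_connection("conn-1")
print(intended.handle_join(token))  # ACCEPT
print(wrong.handle_join(token))     # RST
```

In this model the mistaken contact is harmless: the new subflow is refused and the original connection carries on over its existing subflows.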
------------------------------------------------------------------------------------
4: Multipath TCP Signalling options vs payload by Costin Raiciu
http://www.ietf.org/proceedings/10mar/slides/mptcp-4.pdf

???: If a host sends a SYN with an unknown option, how do you know whether the other end got the option or the unknown option was dropped? (i.e. did the middlebox discard the option?)
Mark: It's OK if the SYN gets there, even if it does not contain the unknown option.
Eric: The useful data is to what extent options are allowed on the SYN but not later on data packets.
Costin: I'm assuming that if they allow the option, they'll allow the data later with the same option.
Eric: People could also implement a split stack that doesn't forward the option?
Mark: It seems logical to always allow data options if the SYN option was allowed, but we do not know how to test this.
Scott: It does not help to know if options pass, when we are hoping to use MPTCP.
Carsten Borman: There may also be a midterm solution to allow this. However, it is impossible to let options through on data packets, because that is not how they work internally.
Anantha: I don't want to label segments; these can be combined. Touching data is somewhat dangerous. You are assuming the data is also TLV.
Costin: Yes, but you can get an important gain.
Anantha: People can have multiple data chunks, and this depends on the Push usage. Middleboxes have issues, but they are probably not a big deal.
Nishida: You cannot optimize MPTCP if middleboxes do not negotiate between them.
Costin: Local retransmits on a wireless link may be OK; not all PEPs are difficult.
Jana: TCP gives us a byte stream. Separate segments are not preserved across some middleboxes. Sequence number options are going to be hard to preserve in this case.
Lars: There could be a third option using a control connection (a subflow that only carries signalling).
Jana: Does the control go as options or payload?
Lars: Data.
Jana: Why not with data?
Lars: Because the stream operation is preserved.
Marcelo: What if that one goes down?
Jana: Signalling as payload of that flow.
Costin: That is very tricky.
Jana: Which ACKs drive the CC?
Costin: The subflow ACKs are clocking data out.
Jana: The CC is driven by the TCP ACKs on the subflows. If buffers are large, this behaves as New-Reno.
Costin: Yes. But you also need ACKs at the next level.
Mark: There are strange interactions, with timeouts in one direction impacting the other. These may not be common.
Costin: If there is a timeout you can't send new data ACKs.
???: Can you eliminate data ACKs completely?
Costin: That was the initial design. There are middleboxes that do proactive ACKs at the subflow level.
???: If the connection fails, you have to do something. I can see this offers more robustness.
Mark: Proactive ACKs can be hard to change, and break the ability to move on (e.g. after a mobility change), because it is hard to know how much the flow has sent.
Jabber: Why not send data ACKs as urgent data?
Costin: Well, they still get congestion controlled...
Costin: My own view is that we should use options, since this is the right long-term decision, although it has a problem in the short term. The flip side of using data is that this looks like TCP, but then if we do happen to still have middleboxes in the long term they will have to do some difficult parsing to do anything useful.
Phil: On the summary table, are there any missing issues?
Eric: I thought we were dealing with middleboxes that can resegment and strip options. Does this mean that for each subflow we need to decide whether to do TLV encoding?
Costin: The subflow will die if a middlebox strips options. Even if you receive garbage at the other end, you drop it because it does not respect the expected format.
Eric: Can you always tell there was a problem and data was lost? Can we show that all combinations can be robustly handled?
Costin: The draft has a section on this.
Tim: Data can be modified in weird ways by some types of middlebox.
Mark: We are not trying to fix sessions that would break with single TCP.
???: There is a class of middleboxes that would not work with MPTCP: if a web browser starts a flow on port 80 and then uses MPTCP, they cannot then intercept it. There can also be middleboxes that do traffic characterisation, which may adversely impact a flow that doesn't look like something they know. I have worries about adding options to ACKs.
Costin: SACK does this.
???: Yes, but SACK is advisory. It does not stop things working.
Mark: Intrusion detection systems may be impacted more by payload than by options. But in the long run, if we are successful, which would be best? It looks like options are going to be better for the network, allowing future middleboxes to work better; forcing middleboxes towards a DPI-like approach seems a step in the wrong direction. I think options are better for the long run.
Marcelo: The security negotiation may need the payload.
Costin: Yes, at the start of a packet (e.g. for public keys).
Marcelo: There is a possibility that security methods could fit in the options; we have not made this decision yet.
Jana: There is an assumption that TCP does not run end to end, when we see the need for data ACKs. I would like to be able to predict my end state.
Costin: We always need to allow transport over transport.
Mark: Are you talking about putting TLS halfway up the TCP stack?
Scott: The option mode seems theoretically better. I am missing data on how the Internet really works. We have one experiment. We need more data.
Mark: How do we get that data?
David Borman: Timestamps are used in TCP for getting accurate timing. The options method is extending the option space. Why not actually extend the TCP option space to do this? There is a question of how this gets through middleboxes.
Eric: I am puzzled by the middlebox discussion. If a middlebox inserts bytes, then what do we do? This can mess up the internal synchronisation of MPTCP.
For example, a NAT rewriting an FTP PORT command will likely change the length, and then the data-ACK sequence numbers don't add up.
Phil: Could Costin organise a further discussion on this?

This is a non-binding poll:
People in favour of the options approach?
People in favour of the payload approach?
People who do not know which to choose?
- There seem to be more people in favour of options, although this needs more discussion.

------------------------------------------------------------------------------------

Meeting : IETF77, Friday March 26, 2010, 9:00-11:30
Location : Anaheim, California B
Chairs : Philip Eardley, Yoshifumi Nishida
AD : Lars Eggert
URL : http://tools.ietf.org/wg/mptcp/
Note Takers: Costin Raiciu, Bachir Chihani

------------------------------------------------------------------------------------
1: Summary of options vs payload analysis by Mark Handley

Mark Handley: At the end of the WG discussion: options 2/3 in favour, payload 1/3; some people said it's not clear yet what to do. After the WG session there was another discussion involving 10 people, which was 2.5 hours long. It found a problem with boxes that modify content and change length; the current version does not cope with it. There were no answers at the end.
Plans:
- fix the technical issue with middleboxes that change content and length
- try and get real data from people running; we will run a web proxy for multipath at UCL
- ask middlebox vendors a set of questions on how they deal with different types of packets
- work both options through for now; make payload into a spec
Lars Eggert: As AD I am worried this may take too long; we need to come to a conclusion.
Mark: Costin and I thought we had decided on options.
Lars: I don't want to make this process too long.
Another draft, tcpct, proposes to extend the option space; we should look at that.

------------------------------------------------------------------------------------
2: Threat Analysis by Marcelo Bagnulo Braun

Marcelo: In the current solution, we have a cookie-based solution that basically takes place during the initial setup. We exchange a cookie, and each time we have a new address we show that cookie, which proves that the one trying to add the new subflow is the same party as in the initial connection. We could improve on this and provide another solution that would require the attacker to change some other crypto material that is exchanged in order to mount the attack, which is hard for the attacker. If you intend to be compatible with NATs, protecting against integrity attacks is challenging. The problem with integrity is due to compatibility with NATs. The question is how much we care, and whether the threats are relevant enough that we accept the additional complexity. One option is to say that what is good for SCTP is good for us.
Yuri Ismailov: If MPTCP falls back to single TCP, we will not be able to do anything at all.
Marcelo: You don't need to do anything at all in that case.
Yuri: Just follow SCTP then.
Marcelo: Then we go for SCTP.
Mark: That seems reasonable, but we should leave the option open for something better later.
Marcelo: In IPv6 you can use CGAs to get better security. So do this as the default baseline and make sure we have a couple of different security approaches for the future.
Lars: Baseline security against the most obvious threats would be enough.
Phil: People should have time to take this in, and after consensus go with that.
Michael Tuxen: Look at the shared key stuff from SCTP.
Mark: Great idea, but it may be different for MPTCP because we don't have a new socket API for now.
Lars: Set a deadline and call for consensus.
Stuart Randall: Issue with MD5.

------------------------------------------------------------------------------------
3: Congestion control for mptcp by Mark Handley

- problems with existing theory: it flaps.
Ken Culvert: What's the RTT?
Mark: It's just for illustrative purposes; it doesn't matter what the timescale is, it still flaps.
- equipoise: balance traffic more
- family of algorithms to explore the spectrum between fully coupled and uncoupled
- for static congestion, coupling is better for resource pooling
- for dynamic congestion, uncoupling is better
Tim Shepherd: Clarify the experiment setup.
Tim: These are long lived. How about short-lived flows?
Mark: You probably do not want to do multipath for really short flows. If you open too many subflows to keep an ACK clock going, that will be worse than TCP. We haven't done these heuristics.
Tim: What about initcwnd for multiple subflows?
Costin: Multipath opens subflows sequentially; no problems should appear.
Lars: There is more noise in the graphs.
Mark: Yes, but that's normal. Multipath works on both links, and the average is always constant at the beginning when multiple TCP flows share a link.
Mark: The algorithm is robust enough and deployable.
Ken Culver: Is there any insight in terms of variance of out-of-order packets?
Mark: Not the CC, but multipath produces a ton of out-of-order packets between subflows. This does not affect CC though.
Lars: You can use a large send buffer to minimise e2e jitter.
Phil: What about slow start?
Mark: Slow start on each subflow, but they are out of phase. The worst case is slow start at the same time. When should additional subflows be started? We shouldn't specify.
Mark: I think we can adopt this as a WG doc.
Tim: Is this the same protocol since Stockholm?
Mark: Yes. We just understand it more.
Phil: How many have read the CC doc? Who thinks this should be adopted? Consensus for yes.
Phil: Protocol doc?
Lars: We should adopt now.
Scott Brim: We should go for it.
Phil: Poll for adoption of the congestion draft. Good consensus, none against.
Phil: Poll for adoption of the protocol draft. Good consensus, none against, a few to postpone.
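
The two ends of the "fully coupled versus uncoupled" spectrum mentioned in Mark's slides can be sketched as follows. This is a minimal illustration under stated assumptions: the update rules are commonly cited textbook forms (uncoupled means each subflow runs its own AIMD; fully coupled means increase and decrease are scaled by the total window across subflows), not the algorithm from the congestion control draft. The function names, the window list, and the floor of one packet are hypothetical.

```python
def on_ack(windows, r, coupled):
    """Grow the window of subflow r after one ACK.

    Uncoupled: increase by 1/w_r, as a standalone TCP flow would.
    Fully coupled (assumed form): increase by 1/w_total.
    """
    total = sum(windows)
    windows[r] += 1.0 / (total if coupled else windows[r])


def on_loss(windows, r, coupled):
    """Shrink the window of subflow r after one loss.

    Uncoupled: halve w_r.
    Fully coupled (assumed form): subtract w_total/2, floored at 1 packet.
    """
    total = sum(windows)
    cut = (total if coupled else windows[r]) / 2.0
    windows[r] = max(1.0, windows[r] - cut)


# One loss on subflow 0, starting from equal windows of 10 packets.
uncoupled = [10.0, 10.0]
on_loss(uncoupled, 0, coupled=False)
print(uncoupled)  # [5.0, 10.0]

coupled = [10.0, 10.0]
on_loss(coupled, 0, coupled=True)
print(coupled)    # [1.0, 10.0]
```

In this toy model a single loss pushes the coupled flow almost entirely off the lossy path, which helps resource pooling under static congestion but is consistent with the flapping described for dynamic congestion.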
------------------------------------------------------------------------------------
4: mptcp API by Michael Scharf

Ken Culver: Automating the turn-on "process" when it starts, etc.
Michael: That is orthogonal to whether we negotiate multipath.
Mark: If the buffer is small, MPTCP may still be used, but with only one subflow at a time. It still gives robustness.
Michael: Yes, the slide is not entirely correct.
Eric Nordmark: It would make sense if the path API stuff is similar to SCTP, to help implementers. Also, we should keep state in the PCB to avoid rebinding the same port even after a subflow closes.
Michael: Interaction of MPTCP and shim6/HIP is out of scope.
Yuri: MPTCP and Mobile IP interaction?

--- Application profile
Scott Brim: It would be nice, but this is scary (i.e. application profiles).
Mark: You'd have to be careful with the overflow option: good potential for being unstable. Hot-standby is fine. We don't know how to do overflow.
Lars: This is not what the API is doing. I would like the API doc to be minimal; this is too much.
Alan Ford: What profiles do to the traffic, and the ability of the API to allow apps to do this, are different things. Another thing: duplicating traffic.
Lars: We should not go there now.
Eric: One can infer the app profile by looking at traffic, e.g. when there is not enough traffic to fill the cwnd.

------------------------------------------------------------------------------------
5: AF_MULTIPATH by Pasi Sarolahti

--- in favour of keeping the API minimal.
Lars: Do not derive multiple addresses from DNS lookup.
Alan: The structure should be populated during the protocol.
Mark: DNS shouldn't be used; today DNS gives you addresses for different hosts for the same name. You would need a new DNS record.
Pasi: This would help the transition to v6.
Lars: The protocol can help you do this.
Eric: This is maybe useful for robustness, passing down multiple addresses at the same time. You should parallelise everything, lookup + setup: we need connectbyname.
Mark: The protocol copes fine when you accidentally connect to the wrong host. If we want connectbyname we should change DNS rather than overloading A record lists.
Eric: It's not clear what "same" means. You'd like to switch quickly to a new address if you can.
Mark: connectbyname is the right way to go, if we want to make a change.
M: What about ports? The spec says you should connect to the same port, but doesn't say you must. Specifying which may be an API matter.
Yuri: Concerned about the general use of DNS in some cases, with mobile clients.
Mark: getaddrinfo comment.

------------------------------------------------------------------------------------
6: name based sockets by Javier Ubillos

Tim: I was looking at all the rest of the slides. In the vast majority of cases hosts don't have names. What do you do?
Javier: That is true. There are solutions to address this.