
Re: [MMUSIC] comedia-fix-00 comments



At 09:58 PM 1/14/2003, Jonathan Rosenberg wrote:
David,

Ben, Paul and I discussed your comments extensively over the last week or so. Our conclusion was that we will not use comedia for simple messaging sessions. At a high level, the conclusion was that we were looking at "message sessions over TCP" and that comedia was perhaps optimized for "streaming media sessions over TCP". As such, many of our requirements that were critical, or assumptions we could make (such as demux within a single TCP connection) don't necessarily apply to streaming media over TCP.

Now, I still think that much of what we proposed is valid. However, it is no longer critical to get it changed. So, keep that in mind as you read my responses below.
The meta-question is whether comedia-fix will continue to block comedia's progress. On one hand you imply above that it won't, but then again we have the rest of this email that argues that comedia is severely flawed. So, where are we going from here?


lem at all, IMO), and the draft should be updated to reflect this.
I conceded this point. You are correct that it's not 1:1.

I still believe that this will fundamentally make tcp-based streaming media complicated for firewalls. It would be REALLY nice to allow an admin to open the "streaming FOO port" and that would allow for all media traffic of that type through. You are still going to need dynamic ports in your proposal, which perpetuates the firewall problems. The use of dynamic ports was (arguably) needed for UDP because there is no other protocol-independent demux. However, that is NOT true for TCP, which has the notion of multiple connections on a port. Thus, to me, any usage of TCP should make use of this natural and well-understood property, rather than perpetuating a (IMHO ultimately wrong) design choice for UDP [ranting for a moment, given the huge problems we have encountered in deploying voip, a large number of which are based on the usage of dynamic ports for session demux, I think ultimately we should have chosen a well-known port for RTP, and then used something like SSRC or CNAME to demux. But that is another conversation.]
I agree that fixed ports are desirable. Changing your endpoint ID from its current form to be something keyed from within the media protocol itself would go a long way towards this while alleviating my other concerns.


3.2 No Connection Re-Use
This references 3.1 and restates the incorrect scaling assumption.
This scenario seems to be the only problem for which connection/session lifetime decoupling is presented as a solution. I cover this in more detail below, but the short answer is that any economy of design derived from this solution is offset by burdening other, more common scenarios with undesirable restrictions.
Can you explain what you mean?
Well, like I said, I explain what I mean below. :-)


3.3 Security
The Man-in-the-Middle attack can be mitigated by simply disallowing multiple connections to the advertised IP/port during the offer period.
If more than one arrives, then the endpoint assumes an attack and does a session teardown. (yes, this requires updating the language in comedia to reflect this new scenario)
That helps, but doesn't eliminate the problem. I can DoS the other party, preventing them from connecting to you. This came up during the development of STUN, which documents a similar attack.

I admit this is not a crucial point, since if it truly bothers you, using real media level security will fix it.
A DoS is a very different animal than a security breach, and arguably an improvement in terms of vulnerability introduced by the protocol, since protecting against all forms of DoS is nearly impossible. Besides, comedia-fix didn't fix the security breach either, so I'm not sure what it is you're after.
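
For illustration only, the mitigation proposed above (treat a second connection to the advertised IP/port during the offer period as an attack and tear the session down) might be sketched as follows. The class and method names are hypothetical and appear in neither draft.

```python
class PendingOffer:
    """Tracks inbound connections for one advertised IP/port during the offer period."""

    def __init__(self):
        self.peer = None         # address of the first connection, if any
        self.torn_down = False   # set once an attack is assumed

    def on_connect(self, remote_addr):
        """Return 'accept' for the first connection, 'teardown' for anything after it."""
        if self.torn_down:
            return "teardown"
        if self.peer is None:
            self.peer = remote_addr
            return "accept"
        # A second connection arrived during the offer period: assume an attack
        # and tear the whole session down, including the first connection.
        self.torn_down = True
        self.peer = None
        return "teardown"


offer = PendingOffer()
first = offer.on_connect(("192.0.2.1", 5000))      # the legitimate peer
second = offer.on_connect(("198.51.100.7", 6000))  # anyone else
```

Note that any second connection, from whatever source, forces the teardown. That is the DoS raised in the reply: the breach is closed at the cost of letting a third party end the session.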


The firewall trust issue can be addressed by a chain of trust, can it not? I.e., presumably the firewall authenticates/trusts a SIP proxy which in turn authenticates/trusts each of the UA's.
That would be the midcom approach, yes.
Ok, so you agree that the original assertion in comedia-fix is not valid.

The instruction to open/close pinholes is made by the SIP proxy to the firewall as a result of the message traffic the proxy is relaying, not by any sort of stateful packet inspection by the firewall.
As for the "there's no ALG yet" argument for residential gateways: I submit yet again that without including the source address/port, comedia will preclude an ALG from ever being written.
I think that's a good thing.

ALGs are bad. That's why midcom got started. We need to keep application layer intelligence out of them, and either get protocols to work through them without adding more to the mess (STUN/TURN), or even better, specify well-defined behaviors for them that can be controlled through something like NSIS. But, all of that is far off.
I think we understand each others' opposing points of view. Not clear to me that we need to discuss it further, and I think the topic at hand is best addressed via other points (like the other email thread we're having).


The suggestion in comedia-fix (to use a header at the start of the connection) jumbles the protocol stack, and as such may cause some unintended consequences.
Arguably, yes. Not the first time though. It is common to do this in many protocols that are strongly recommended in IETF.

At the heart of the issue is "what is it that we are signaling here?". If we are signaling RTP/AVP over TCP, then we violate the spec with the initial ID transmission.
Think of it like a windowing operation in an operating system. During the times at which the RTP/AVP stack has the connection, it's compliant. It's just that it doesn't get the connection right away.
This doesn't convince me. I still haven't seen precedent in SDP where the protocol, as described in SDP, was augmented by some other protocol not listed within the SDP.


If we are signaling RTP/AVP over TLS, where does the ID transmission go, as part of the TLS stream or prior to the TLS handshake? In either case we end up with a stream that is neither (a) 100% RTP/AVP and/or (b) 100% TLS.
Funny you should say this, since TLS is a good example of where common practice is to mix protocol layers.

The recommended practice these days is for protocols to NOT run their apps over TLS on a separate port. The protocol itself needs to be able to demux the two, and to transition from "raw protocol" to TLS. An example of this is the STARTTLS mechanism used in HTTP and other protocols. First it's HTTP, and then it's TLS, all on the same TCP connection.
But what you are really talking about is SMTP, not TLS, so this doesn't hold either. The difference here is that the STARTTLS command IS IN THE SMTP SPEC. The session from start-to-finish is 100% legal SMTP, and any security middlebox is going to recognize it as such. Not so with the Endpoint ID approach in comedia-fix.


Also, this is the only way for SASL to work. SASL, by design, inserts a security layer in the middle of the protocol stream. SASL-compliant protocols need to specify how they will demarcate the point at which you "hand the connection" over to the SASL security layer, so it can decrypt the messages and hand them over to the application.

So, while I agree with your concerns, there is a long-established and recommended practice of such a thing in a good number of widely deployed IETF protocols.
To repeat: these are embedded in the protocol specs and designs themselves, so they don't go out of scope the way that Endpoint ID does.


This problem can manifest itself in implementation complications as well as middlebox problems.
I don't see the middlebox issue. Please elaborate.
Stateful packet inspection.  See below.


Take TLS for example: let's say we put the endpoint ID ahead of the TLS handshake. The OpenSSL folks were kind enough to provide C developers a simple API where you prepend "ssl_" in front of the normal BSD socket calls, and for the most part you can write code for SSL/TLS in the same manner as raw TCP sockets. But with an endpoint ID, suddenly I the developer have to do a more complicated scheme where I (a) connect a TCP socket, (b) send/receive the endpoint ID, then (c) construct an SSL session based on the connected socket. This of course presumes that the language-specific API to SSL is flexible enough for the developer to juggle the order of the protocol stack in this manner. Other API's (Java, Perl, VB, etc.) may not be so forgiving.
Indeed. This is a weakness. Not a new one. In fact, the need to support STARTTLS makes me think this is a very common feature of TLS stacks. But, I have not looked into it.
Implementation difficulties are rarely met with sympathy at the IETF it seems. :-) I'll agree to differ on this and leave it for now.
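
As a rough sketch of the (a)/(b)/(c) ordering described above, here is what it might look like with Python's standard socket and ssl modules. The 32-byte length is taken from the "32 byte ID" mentioned elsewhere in this thread; the function names and the fixed-length framing are assumptions for illustration, not anything comedia-fix specifies.

```python
import socket
import ssl

EID_LEN = 32  # assumed fixed length of the endpoint ID


def send_endpoint_id(sock, eid):
    """Step (b), active side: write the raw ID before any TLS bytes."""
    if len(eid) != EID_LEN:
        raise ValueError("endpoint ID must be exactly %d bytes" % EID_LEN)
    sock.sendall(eid)


def recv_endpoint_id(sock):
    """Step (b), passive side: read exactly EID_LEN bytes and no more,
    so the TLS handshake that follows is left untouched in the stream."""
    buf = b""
    while len(buf) < EID_LEN:
        chunk = sock.recv(EID_LEN - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before sending a full endpoint ID")
        buf += chunk
    return buf


def connect_with_eid(host, port, eid):
    """Steps (a) to (c) in order: TCP connect, send the ID, then start TLS."""
    sock = socket.create_connection((host, port))        # (a)
    send_endpoint_id(sock, eid)                           # (b)
    ctx = ssl.create_default_context()
    return ctx.wrap_socket(sock, server_hostname=host)    # (c)
```

This works only because the API lets the caller wrap an already-connected socket after writing to it. An API that only offers "connect and handshake" as a single call has no place to put step (b), which is the portability concern raised above.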

Then there are the considerations for a decomposed gateway topology. By multiplexing signalling information into the media stream, suddenly you force a tight coupling between the signalling and media transport infrastructures by intermixing signalling tokens with the media stream.
How is this different than having the signaling thing ask the media thing for an IP address, and then putting that IP address into the SDP? Then, when the media thing receives media on that IP address, it knows the "control" connection associated with it. Here, we do the same, but instead of IP address, it's IP+eid. Same difference.
The difference is that the media thing needs to know about windowing the protocol, which means that it's now tightly bound to SDP-based signalling rather than being a generic media thing.


For middlebox problems, consider the stateful packet inspection firewall. The idea of course is to enforce a policy by protocol rather than by connectivity. So SMTP, HTTP, FTP, and DNS are allowed, but nothing else, and it doesn't matter what ports or addresses are used.
In theory the sysadmin should also be able to specify SIP, TLS, RTP/AVP, and IM protocols in this list and allow those forward-looking end-users to utilize this fancy new SIP stuff. With comedia-fix, every protocol that is signaled will not, in fact, *be* the protocol described in the SDP. So when the firewall watches the media stream, it doesn't see a stream that fits any of the protocol structures it recognizes, and therefore disallows the stream.
Now I am confused. The way that firewall admins enforce policy by protocol rather than by connectivity is through ports. In the current comedia spec, there will almost never be just a single port for the newfangled media stream. So, you can't enforce policy. The mechanism in comedia-fix allows there to be one, and only one, port allocated for any specific media-over-tcp type. This will enable firewall admins to do things, not disable it.
You're talking about port-number based enforcement. I'm talking about stateful packet inspection, where the validation (yes, often combined with port filtering) is done by verifying that the content of network traffic itself---not just the endpoint information---conforms to the protocols that are allowed by the policy. By adding the initial data, the traffic no longer conforms to the protocols as understood by the inspection engine.

I do not understand your point about the protocol not "being" the one that is advertised. It most certainly is, with the addition of a 32 byte ID up front, but who cares?
Who cares?  The firewall does.  See above.
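
To make the inspection argument concrete, here is a toy sketch of a content-based check. The rule shown (a stream signaled as TLS must open with a TLS handshake record, whose first byte is content type 0x16 followed by major version 0x03) is a deliberately simplified stand-in for a real inspection engine. The function names and the all-zero placeholder ID are assumptions.

```python
TLS_HANDSHAKE = 0x16  # TLS record content type for handshake messages
TLS_MAJOR = 0x03      # major version byte used by SSL 3.0 and TLS 1.x records


def looks_like_tls(first_bytes):
    """Simplified matcher: does the stream open with a TLS handshake record?"""
    return (len(first_bytes) >= 3
            and first_bytes[0] == TLS_HANDSHAKE
            and first_bytes[1] == TLS_MAJOR)


def inspect(first_bytes, allowed=(looks_like_tls,)):
    """Allow the stream only if some permitted protocol matcher recognizes it."""
    return "allow" if any(match(first_bytes) for match in allowed) else "deny"


client_hello_prefix = bytes([0x16, 0x03, 0x01, 0x00, 0x2F])
endpoint_id = b"\x00" * 32  # placeholder 32-byte ID

plain_tls = inspect(client_hello_prefix)
with_eid = inspect(endpoint_id + client_hello_prefix)
```

The same handshake bytes are accepted on their own and rejected once the ID is prepended, because the matcher never reaches them. An engine could of course be taught to skip the ID, but that requires it to know about the prefix, which is the out-of-scope knowledge objected to above.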


Another case is where TLS is provided by an external accelerator box to offload the compute load of the server---same issue.

4.2 Decoupled lifetimes
This has a number of side-effects. First, it essentially forces a single-port architecture on both of the endpoints, as the only other option is the "this-doesn't-scale" 1:1 client-to-port-number mapping.
With comedia as it stands today, the lifetime of the listener port is decoupled from the lifetime of the session that is begat from that listener. No longer the case with comedia-fix.
You can use as many ports as you like, ranging from 1 to one for each session. Because I don't need the port to correlate, I am free to use any algorithm I like. Comedia has to use an algorithm which has at least one-per-pending-offer, because you have no way to correlate. Therefore, I do not believe that comedia-fix introduces a binary "1 port" or "1 port per session" restriction.
Speaking of decoupling... :-) I was decoupling the lifetime issue from endpoint-ID. So if you assume that the endpoint-ID idea in comedia-fix does not become spec, but decoupled lifetimes *does* become spec, then the issue I stated above becomes a factor. So we are both correct, but are working from different assumptions.
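
For readers following the correlation argument, a minimal sketch of what "I don't need the port to correlate" could mean in practice: the listener looks up the session by endpoint ID alone, so any number of pending offers can share one port. The class and its methods are hypothetical and not taken from comedia-fix.

```python
class EidDemux:
    """Maps endpoint IDs from offers to sessions, independent of the listener port."""

    def __init__(self):
        self._sessions = {}

    def offer(self, eid, session):
        """Record the ID that was advertised for this session."""
        self._sessions[eid] = session

    def on_connection(self, local_port, eid):
        """Correlate an inbound connection by ID. local_port is deliberately
        ignored; the mapping is kept so a later reconnect still resolves."""
        return self._sessions.get(eid)

    def end(self, eid):
        """Forget the ID once the session itself is over."""
        self._sessions.pop(eid, None)


demux = EidDemux()
demux.offer(b"A" * 32, "session-1")
demux.offer(b"B" * 32, "session-2")
hit = demux.on_connection(5000, b"B" * 32)  # both offers share port 5000
```

Without the ID, the only key available at accept time is the local port, which is why the reply says plain comedia needs at least one port per pending offer.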

At any rate, at the moment this is eclipsed by more important points, discussed below.

Second, it forces endpoints to use "both" where one could have opted for just "active" or "passive". The problem here is that if one endpoint specified "passive", and the media connection drops, it has no way to legally reinstate the connection if it needs to send data to the other endpoint. In essence, comedia-fix "raises the stakes" as to what is an appropriate connection mode to negotiate. In comedia, the only thing at stake was the ability to bring up the initial connection. In comedia-fix, we add the issue of recovering from the scenario that the remote endpoint will drop the connection.
You need to handle connection failures. In the current comedia, if the passive side loses the connection, it can bring it up by doing a re-INVITE that includes a reconnect parameter. What you CANNOT do is have the active side reconnect without doing a re-INVITE. In comedia-fix, you can have the active side reconnect without doing a re-INVITE. There is no reason we could not also allow the passive side to do a re-INVITE to force a reconnect from the active side. I did propose to remove that, but perhaps it is useful after all.
See below for my rationale on the re-INVITE.


Third, as described in comedia-fix, NAT's can cause a similar outage to #2 even if each endpoint specifies "both".
The decoupled lifetime, to my knowledge, is unprecedented.
Really? See RFC3261. SIP can run over TCP. When it does, the lifetime of the connection is not strongly coupled to the lifetime of the transaction or dialog. It used to be, in RFC2543. After years of trying to make it work, we found it was a nightmare to maintain. Thus, it got discarded. The result was a nice clean separation of the SIP layer from the TCP connectivity layer. This change bought us many properties, including improvements in robustness (the ability to recover a SIP transaction even if the TCP connection the request went over had closed).

It's critical in any protocol where the intermediary elements provide a subset of the endpoint functionality.
"Unprecedented" in terms of the rules governing media endpoint behavior when using SDP, not in the entire IETF protocol suite. Sheesh... :-)

The rationale for requiring a re-INVITE was that it was safest and simplest to treat the loss of a connection in the same way as a failure of a media stream. So if the connection goes away, but you didn't get a BYE, and you didn't think you were done yet, then the way to attempt to recover the connection was via a re-INVITE. It was assumed that this mapped closest to how connectionless media endpoints would behave.

So now that the context is clearer: is there any precedent for connectionless media endpoints to just start sending media again even after the stream has been deemed to have failed for some reason? See next comment for an example.


In developing comedia I tried to apply the litmus test of "how is this scenario handled in connectionless media, and how do connections differ?". I don't believe that decoupled lifetimes fares very well here. Remember that while "traditional" SDP media transports may be connectionless, they aren't stateless. So the corollary to decoupling connection lifetime from session lifetime is akin to allowing RTP/AVP streams to be stopped, reset, then restarted, all without any additional activity on the signalling channel. Again, I'm not an expert here, but I imagine that this would be considered outside the spec today.
There is no notion of starting/stopping RTP streams, so it's hard to say what this would mean.
Well, consider the failure case, where the data sent to that port either (a) drops below some performance threshold (90% packet loss, 25,000ms latency, pick your favorite poison), (b) doesn't relate at all to what came before it, or (c) is just undecipherable garbage. In whatever case, the endpoint is able to discern that something is sufficiently amiss that the stream can be considered failed and therefore ended.


I also don't understand the rationale behind the statement: "If one side wishes to force the other to reconnect, it merely drops the connection. When the other side has data to send, it will establish a new connection." The "reconnect" attribute was never intended to "force" reconnects. Rather, it was intended as a way to recover from an *unintended* disconnection in such a way as to preserve the existing session state. Is there ever a reason that an endpoint would tear down the connection solely to inconvenience the other endpoint with the task of re-establishing it? (insert half-smiley here, since that's the impression one gets from the language in comedia-fix)
No, of course not.

I must admit I am not sure what I meant here. Per above, I do see how the reconnect parameter would help.
Ok, sounds like that is resolved, reconnect is useful therefore stays.


I guess I'm having trouble understanding why it is so important that the lifetime be decoupled. The only argument put forth is the multiplexing-bridge scenario. If we really are using a multiplexed tunnel to bridge two networks, what's the harm in making the tunneling connection "sticky"?
The intermediaries in this case would need to "reference count". That is, they would need to keep track of each session that gets set up, and then torn down, which had media over that connection. This is impossible to do when the media intermediary is decoupled from the SIP proxy. Doing it would require what we call a B2BUA in SIP, along with a colocated media intermediary or a remote one under tight control.
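
A minimal sketch of the "reference count" bookkeeping described here, purely for illustration. The class is hypothetical; the point is what has to call it.

```python
class TunnelRefCount:
    """Keeps a shared relay-to-relay connection up while any session uses it."""

    def __init__(self):
        self.sessions = set()
        self.open = False

    def session_started(self, session_id):
        """Must be driven by session setup (e.g. seeing the INVITE succeed)."""
        self.sessions.add(session_id)
        self.open = True

    def session_ended(self, session_id):
        """Must be driven by session teardown (e.g. seeing the BYE)."""
        self.sessions.discard(session_id)
        if not self.sessions:
            self.open = False  # last user gone: the tunnel may be closed


tunnel = TunnelRefCount()
tunnel.session_started("s1")
tunnel.session_started("s2")
tunnel.session_ended("s1")
still_open = tunnel.open
```

The bookkeeping itself is trivial. The difficulty is that both methods are triggered by signaling events, so a media relay that never sees the SIP traffic has nothing to call them with.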

Another problem is disconnects due to some kind of network outage. Let's say users are sending their TCP-media through two relays. The connection between the relays gets lost, but the connections from each user to their relay remain. Neither user will try to reconnect, since as far as they know, there is no TCP connection problem. The relays know there is a problem, and they could reconnect, but comedia prohibits them from doing so. They would have to be a B2BUA once more, and issue a new INVITE on their own, just to re-establish it. In other words, the relays need to be the same (in terms of functionality) as the endpoints. I want to drive such things out of intermediaries, not INTO them.


So, to sum up, with comedia today, building these relays requires a very tight coupling between the SIP proxy (which has to be a B2BUA) and the TCP media relay. With the comedia-fix, they are decoupled. The SIP intermediary can remain a proxy, and the media-relay can worry about its own connectivity.


On the flip side, since there is a multiplexing function occurring in an intermediary, shouldn't there be a mechanism for determining the lifetime of the feeds into the mux that just makes this problem go away? How today, do the endpoints in this scenario know when to do an initial INVITE, then do the final BYE?
Sure, it can be computed. It just introduces a coupling which is not otherwise there.

Sounds like you aren't going to use comedia for this anyway, so I won't pursue the topic.

-Jonathan R.

--
Jonathan D. Rosenberg, Ph.D.                72 Eagle Rock Ave.
Chief Scientist                             First Floor
dynamicsoft                                 East Hanover, NJ 07936
jdrosen@dynamicsoft.com                     FAX:   (973) 952-5050
http://www.jdrosen.net                   PHONE: (973) 952-5000
http://www.dynamicsoft.com


_______________________________________________
mmusic mailing list
mmusic@ietf.org
https://www1.ietf.org/mailman/listinfo/mmusic