[MMUSIC] comedia-fix-00 comments
This is a rather long and rambling commentary, so let me summarize the
major points up front:
- The scaling issue is misstated.
- The security issues raised are not actually solved by
comedia-fix, but can be mitigated more simply.
- The proposed mechanism for session correlation has
serious side-effects that are not addressed.
- The connection/session lifetime decoupling is a
major paradigm shift, and also has some serious
side-effects.
Details below, arranged by comedia-fix sections:

3.1 Port Multiplicity on Servers

This section is incorrect. Even without a means to correlate incoming
connections to sessions, the mapping of allocated port numbers to clients
is not 1:1. I'm surprised to see this statement in the draft since I
thought I had explained that in detail at earlier meetings.
The size of the port space need only be as large as the NUMBER OF PENDING
OFFERS made to potential clients. This is not the scaling problem
described in comedia-fix (and not really a scaling problem at all, IMO),
and the draft should be updated to reflect this.
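
A rough sketch of the sizing argument, in Python. The class and method
names are made up for illustration and the offer timeout is left out; the
only point is that a listener port is held while an offer is pending and
goes back to the pool as soon as the connection arrives.

```python
class OfferPortPool:
    """Hypothetical pool: a port is reserved only while an offer is pending."""

    def __init__(self, ports):
        self._free = list(ports)
        self._pending = {}  # port -> offer id

    def make_offer(self, offer_id):
        # Reserve a listener port for the duration of the offer period only.
        if not self._free:
            raise RuntimeError("no free ports: too many pending offers")
        port = self._free.pop()
        self._pending[port] = offer_id
        return port

    def connection_arrived(self, port):
        # Once accepted, the connection is identified by its own 4-tuple,
        # so the listener port can be handed to the next offer.
        offer_id = self._pending.pop(port)
        self._free.append(port)
        return offer_id

    def pending_count(self):
        return len(self._pending)
```

With this shape, a pool of two ports can serve any number of clients as
long as no more than two offers are outstanding at once.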

3.2 No Connection Re-Use

This references 3.1 and restates the incorrect scaling assumption.
This scenario seems to be the only problem for which connection/session
lifetime decoupling is presented as a solution. I cover this in more
detail below, but the short answer is that any economy of design derived
from this solution is offset by burdening other, more common scenarios with
undesirable restrictions.

3.3 Security

The Man-in-the-Middle attack can be mitigated by simply disallowing
multiple connections to the advertised IP/port during the offer period. If
more than one arrives, then the endpoint assumes an attack and does a
session teardown. (yes, this requires updating the language in comedia to
reflect this new scenario)
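
The rule can be sketched as a small state machine, in Python. This is
only an illustration under my own assumptions: the state names, the
return values, and the `offer_period_ended` hook are hypothetical and
appear in neither comedia nor comedia-fix.

```python
class OfferListener:
    """Hypothetical listener that treats a second connection as an attack."""

    def __init__(self):
        self.state = "offering"  # offering -> connected | torn_down
        self.peer = None

    def on_connection(self, peer):
        if self.state != "offering":
            # Outside the offer period nothing new is accepted.
            return "reject"
        if self.peer is None:
            # First connection during the offer period: provisionally accept.
            self.peer = peer
            return "accept"
        # A second connection during the offer period: assume an attack
        # and tear the whole session down.
        self.state = "torn_down"
        self.peer = None
        return "teardown"

    def offer_period_ended(self):
        # Exactly one connection arrived, so the session goes ahead.
        if self.state == "offering" and self.peer is not None:
            self.state = "connected"
```
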
The firewall trust issue can be addressed by a chain of trust, can it
not? I.e., presumably the firewall authenticates/trusts a SIP proxy which
in turn authenticates/trusts each of the UA's. The instruction to
open/close pinholes is made by the SIP proxy to the firewall as a result of
the message traffic the proxy is relaying, not by any sort of stateful
packet inspection by the firewall.
As for the "there's no ALG yet" argument for residential gateways: I submit
yet again that without including the source address/port, comedia will
preclude an ALG from ever being written.

4.1 Endpoint-ID

This concerns me greatly. During the extended period when comedia was in
development, I longed for just a "hook" that would allow for reliable
connection correlation. But alas, since we are signaling media streams
that use an open-ended set of protocols, such a hook did not exist. My
working assumption was that shoe-horning additional information into an
arbitrary protocol was outside the bounds of what SDP could impose. Guess
I was wrong. :-)
The suggestion in comedia-fix (to use a header at the start of the
connection) jumbles the protocol stack, and as such may cause some
unintended consequences. At the heart of the issue is "what is it that we
are signaling here?". If we are signaling RTP/AVP over TCP, then we
violate the spec with the initial ID transmission. If we are signaling
RTP/AVP over TLS, where does the ID transmission go, as part of the TLS
stream or prior to the TLS handshake? In either case we end up with a
stream that is neither (a) 100% RTP/AVP and/or (b) 100% TLS.
This problem can manifest itself in implementation complications as well as
middlebox problems.
Take TLS for example: let's say we put the endpoint ID ahead of the TLS
handshake. The OpenSSL folks were kind enough to provide C developers a
simple API where you prepend "SSL_" in front of the normal BSD socket
calls, and for the most part you can write code for SSL/TLS in the same
manner as raw TCP sockets. But with an endpoint ID, suddenly I, the
developer have to do a more complicated scheme where I (a) connect a TCP
socket, (b) send/receive the endpoint ID, then (c) construct an SSL session
based on the connected socket. This of course presumes that the
language-specific API to SSL is flexible enough for the developer to juggle
the order of the protocol stack in this manner. Other API's (Java, Perl,
VB, etc.) may not be so forgiving.
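
For illustration, the (a)/(b)/(c) ordering might look like this with
Python's standard ssl module, which does allow wrapping an
already-connected socket. The two-byte length prefix on the endpoint ID
and both function names are my own assumptions; comedia-fix does not
define this framing.

```python
import socket
import ssl


def send_endpoint_id(sock, endpoint_id: bytes) -> None:
    # Hypothetical framing: 2-byte big-endian length, then the ID bytes.
    sock.sendall(len(endpoint_id).to_bytes(2, "big") + endpoint_id)


def connect_with_endpoint_id(host, port, endpoint_id: bytes, context=None):
    context = context or ssl.create_default_context()
    raw = socket.create_connection((host, port))  # (a) plain TCP connect
    try:
        send_endpoint_id(raw, endpoint_id)        # (b) ID goes out in the clear
        # (c) only now does the TLS handshake start on the same socket
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

A library that only offers a "connect and handshake in one call" entry
point has no place to put step (b).
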
While I'm not a streaming media expert, I imagine that the situation for
streaming media libraries may be worse.
Then there are the considerations for a decomposed gateway topology. By
multiplexing signalling tokens into the media stream, you suddenly force
a tight coupling between the signalling and media transport
infrastructures.
For middlebox problems, consider the stateful packet inspection
firewall. The idea of course is to enforce a policy by protocol rather
than by connectivity. So SMTP, HTTP, FTP, and DNS are allowed, but nothing
else, and it doesn't matter what ports or addresses are used. In theory
the sysadmin should also be able to specify SIP, TLS, RTP/AVP, and IM
protocols in this list and allow those forward-looking end-users to utilize
this fancy new SIP stuff. With comedia-fix, every protocol that is
signaled will not, in fact, *be* the protocol described in the SDP. So
when the firewall watches the media stream, it doesn't see a stream that
fits any of the protocol structures it recognizes, and therefore disallows
the stream.
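
A toy version of the check such a firewall might make, in Python. The
classifier below only looks at the TLS record header (content type 0x16
for handshake, major version 0x03); the function name and the endpoint
ID framing are hypothetical, and real inspection engines are far more
elaborate.

```python
def looks_like_tls_handshake(first_bytes: bytes) -> bool:
    # A TLS connection opens with a handshake record: type 0x16, major version 0x03.
    return (
        len(first_bytes) >= 3
        and first_bytes[0] == 0x16
        and first_bytes[1] == 0x03
    )


# Start of a TLS record header as a client would send it.
client_hello_start = bytes([0x16, 0x03, 0x01, 0x00, 0x2F])

# Hypothetical endpoint ID framing: 2-byte length followed by the ID.
endpoint_id = b"\x00\x08endpt-42"

plain_tls = looks_like_tls_handshake(client_hello_start)
prefixed = looks_like_tls_handshake(endpoint_id + client_hello_start)
```

Here `plain_tls` is true and `prefixed` is false: the same TLS stream
with an ID in front of it no longer matches the protocol named in the
SDP.
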
Another case is where TLS is provided by an external accelerator box to
offload the compute load of the server---same issue.

4.2 Decoupled lifetimes

This has a number of side-effects. First, it essentially forces a
single-port architecture on both of the endpoints, as the only other option
is the "this-doesn't-scale" 1:1 client-to-port-number mapping. With
comedia as it stands today, the lifetime of the listener port is decoupled
from the lifetime of the session spawned from that listener. That is no
longer the case with comedia-fix.
Second, it forces endpoints to use "both" where one could have opted for
just "active" or "passive". The problem here is that if one endpoint
specified "passive", and the media connection drops, it has no way to
legally reinstate the connection if it needs to send data to the other
endpoint. In essence, comedia-fix "raises the stakes" as to what is an
appropriate connection mode to negotiate. In comedia, the only thing at
stake was the ability to bring up the initial connection. In comedia-fix,
we add the issue of recovering from the scenario that the remote endpoint
will drop the connection.
Third, as described in comedia-fix, NAT's can cause a similar outage to #2
even if each endpoint specifies "both".
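
The second point can be put as a one-line rule, sketched here in Python.
The function name is hypothetical; the three direction values are the
ones discussed above.

```python
VALID_DIRECTIONS = {"active", "passive", "both"}


def can_reinstate_connection(direction: str) -> bool:
    """True if an endpoint that negotiated `direction` may legally re-open
    a dropped media connection when it has data to send."""
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    # A passive endpoint may only listen, so after a drop it is stranded
    # until the other side decides to connect again.
    return direction in ("active", "both")
```

Under decoupled lifetimes, "passive" therefore stops being a safe choice
for any endpoint that might need to send after a drop.
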
The decoupled lifetime, to my knowledge, is unprecedented. In developing
comedia I tried to apply the litmus test of "how is this scenario handled
in connectionless media, and how do connections differ?". I don't believe
that decoupled lifetimes fare very well here. Remember that while
"traditional" SDP media transports may be connectionless, they aren't
stateless. So the connectionless analogue of decoupling connection lifetime
from session lifetime would be allowing RTP/AVP streams to be stopped, reset, then
restarted, all without any additional activity on the signalling
channel. Again, I'm not an expert here, but I imagine that this would be
considered outside the spec today.
I also don't understand the rationale behind the statement: "If one side
wishes to force the other to reconnect, it merely drops the connection.
When the other side has data to send, it will establish a new
connection." The "reconnect" attribute was never intended to "force"
reconnects. Rather, it was intended as a way to recover from an
*unintended* disconnection in such a way as to preserve the existing
session state. Is there ever a reason that an endpoint would tear down the
connection solely to inconvenience the other endpoint with the task of
re-establishing it? (insert half-smiley here, since that's the impression
one gets from the language in comedia-fix)
I guess I'm having trouble understanding why it is so important that the
lifetime be decoupled. The only argument put forth is the
multiplexing-bridge scenario. If we really are using a multiplexed tunnel
to bridge two networks, what's the harm in making the tunneling connection
"sticky"? On the flip side, since there is a multiplexing function
occurring in an intermediary, shouldn't there be a mechanism for
determining the lifetime of the feeds into the mux that just makes this
problem go away? How, today, do the endpoints in this scenario know when to
do an initial INVITE, then do the final BYE?
Or, a more blunt way of saying this: "Why is this comedia's problem?" This
strikes me as a lot of bending-over-backwards to accommodate a network
transport architecture that should be transparent to the signaling protocol
used by the endpoints.