Re: [core] Ben Campbell's Discuss on draft-ietf-core-coap-tcp-tls-08: (with DISCUSS and COMMENT)

Adam Roach <adam@nostrum.com> Thu, 11 May 2017 21:38 UTC

To: Ben Campbell <ben@nostrum.com>, The IESG <iesg@ietf.org>
Cc: jaime.jimenez@ericsson.com, core-chairs@ietf.org, draft-ietf-core-coap-tcp-tls@ietf.org, core@ietf.org
References: <149451567392.16604.5915539927877259790.idtracker@ietfa.amsl.com>
From: Adam Roach <adam@nostrum.com>
Message-ID: <285fccb8-de64-7ea6-1ad0-ac3408ec188c@nostrum.com>
Date: Thu, 11 May 2017 16:32:35 -0500
In-Reply-To: <149451567392.16604.5915539927877259790.idtracker@ietfa.amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/iKGBgNsHqgi5XTEQSVk2ZEcGVZQ>

On 5/11/17 10:14, Ben Campbell wrote:
> 2) It seems problematic to encode the transport choice in the URI scheme.
>
> Section 7 says "They are hosted in distinct namespaces because each URI scheme implies a distinct origin server." IIUC, this means any given resource can only be reached over a specific transport. That seems to break the idea of cross-transport proxies as discussed in section 7.
>
> It also does not seem to fit with a primary motivation for this draft. That is, one might want to use TCP because of local NAT/FW issues. But if there is a resource with a "coap" scheme, I cannot switch to TCP when I'm behind a problematic middlebox, and have an expectation of reaching the same resource.

I've been turning this over in my mind, and I think there are two
problems here. The fundamental problem is that this document is encoding
all kinds of protocol options into the URI scheme, which is an
architecturally worrisome precedent for *all* URI-using protocols,
current and future (by which I mean to say I don't think it's in CORE's
unilateral purview to decide this is okay). This problem is compounded
for CoAP in particular by treating the resource spaces associated with
each scheme as distinct.

I don't know whether the second issue is intentional or just a
misunderstanding about whether it is allowable to have different schemes
share a resource space (it is; cf. [1]), but I think we can come up with
a solution to the first problem that allows the second to become moot.

Looking to what we've done elsewhere in the IETF: considering the
present moment, recent past, and near future, we effectively have six
protocols in the wild for securely retrieving hypermedia across the
Internet: HTTP 1.x, SPDY1, SPDY2, SPDY3, H2, and QUIC. Note that these
protocols vary in syntax, semantics, and even transport-layer protocols.
The URI schemes for these six different protocols are https, https,
https, https, https, and https, respectively.

I think that's a good model. One pleasant advantage of this model is
that it completely eliminates the resource space issue I describe above.

Concretely, then, I propose that the URI schemes in this document be
assigned as follows:

For CoAP over TCP: coap

For CoAP over TLS over TCP: coaps

For CoAP over WS: coap

For CoAP over WSS: coaps

We then make UDP support mandatory to implement [2]. Now, let's talk
about TCP. I'm leaving WebSockets out of this description for a moment,
for reasons that I will explain later.

The general process for accessing a CoAP resource would be: try to
contact the remote node over UDP using a Confirmable message. After some
reasonable timespan (I would propose just prior to the third
transmission [second retransmission] of a message), if no ACK has been
received, those nodes that support TCP would try to initiate a TCP
connection (same host, same port, different transport) in parallel with
the third UDP transmission. If the TCP connection succeeds, abandon the
UDP attempts and start the transaction over on the TCP connection [3].
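
To put rough numbers on "just prior to the third transmission", here is
a minimal sketch of the UDP transmission schedule. It assumes the
default retransmission parameters from RFC 7252 (ACK_TIMEOUT of 2
seconds, ACK_RANDOM_FACTOR of 1.5, MAX_RETRANSMIT of 4, with the timeout
doubling on each retransmission). The function names are hypothetical
and exist only for illustration.

```python
# Default CoAP transmission parameters (RFC 7252, section 4.8).
ACK_TIMEOUT = 2.0          # seconds
ACK_RANDOM_FACTOR = 1.5    # initial timeout is in [2.0, 3.0] seconds
MAX_RETRANSMIT = 4


def transmission_offsets(initial_timeout, max_retransmit=MAX_RETRANSMIT):
    """Offsets (seconds after the first send) of every transmission of a
    Confirmable message, assuming no ACK ever arrives."""
    offsets = [0.0]
    timeout = initial_timeout
    for _ in range(max_retransmit):
        offsets.append(offsets[-1] + timeout)
        timeout *= 2  # binary exponential back-off
    return offsets


def tcp_fallback_offset(initial_timeout):
    """Hypothetical helper: the moment the TCP attempt would start, i.e.
    alongside the third transmission (second retransmission)."""
    return transmission_offsets(initial_timeout)[2]


if __name__ == "__main__":
    low = tcp_fallback_offset(ACK_TIMEOUT)
    high = tcp_fallback_offset(ACK_TIMEOUT * ACK_RANDOM_FACTOR)
    print(f"TCP attempt starts between {low} and {high} seconds")
```

Under those assumptions the TCP attempt would begin somewhere between 6
and 9 seconds after the first UDP transmission, depending on the
randomized initial timeout.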

Two additional minor details that I think increase the appeal of this
scheme:

1. Nodes SHOULD cache, on a per-authority basis, whether UDP failed
(but TCP worked) in the past; and, if such a failure was observed
within a reasonable timeframe (hours or maybe days), use TCP instead
of UDP for subsequent contact. This cache would be flushed whenever
an IP address or routing table change is observed.

2. Nodes SHOULD allow configuration, both globally and on a
per-authority basis, to skip the UDP attempt, so as to optimize
connection times when UDP is administratively known to be nonviable
or otherwise undesirable.
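
The two details above could be sketched roughly as follows. This is an
illustration only: the class name, the method names, and the six-hour
default lifetime are all assumptions, and how a node detects an IP
address or routing table change is left to the caller.

```python
import time


class TransportCache:
    """Hypothetical per-authority memory of which transport to try first."""

    def __init__(self, ttl_seconds=6 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds        # assumed "reasonable timeframe"
        self._clock = clock
        self._udp_failed_at = {}       # authority -> time UDP failed but TCP worked
        self._forced_tcp = set()       # authorities configured to skip UDP
        self.force_tcp_globally = False

    def record_udp_failure(self, authority):
        """Call when UDP timed out but the TCP fallback succeeded."""
        self._udp_failed_at[authority] = self._clock()

    def force_tcp(self, authority):
        """Per-authority configuration: never attempt UDP first."""
        self._forced_tcp.add(authority)

    def flush(self):
        """Call when an IP address or routing table change is observed.
        Only learned failures are dropped; configuration is kept."""
        self._udp_failed_at.clear()

    def first_transport(self, authority):
        if self.force_tcp_globally or authority in self._forced_tcp:
            return "tcp"
        failed_at = self._udp_failed_at.get(authority)
        if failed_at is not None and self._clock() - failed_at < self._ttl:
            return "tcp"
        return "udp"
```

A node would consult first_transport() before each new exchange and fall
back to the UDP-then-TCP procedure whenever it returns "udp".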

Turning now to WebSockets. As far as I can tell, the use of WebSockets
is intended for a radically different deployment architecture than TCP
-- it looks very much like TCP is meant to accommodate one or both of
the parties being a constrained node, while WebSockets clearly is not:
there is no way it makes sense for a constrained device to implement
CoAP over WebSockets over HTTP on the conceit that it's simpler than
using HTTP. So, this means that CoAP/WebSockets is intended to run only
between two high-powered network hosts.

(I'd like to take a quick pause to say that these radically different
use cases suggest that it would be a Really Good Idea to split
CoAP/WebSockets out into its own document that builds on top of the
CoAP/TCP document.)

Okay, so let's try to back out an architecture of *why* you have two
non-constrained hosts using CoAP over WebSockets. Clearly, one of them is
a browser, or you'd just use straight-up CoAP, right? In fact, the
WebSockets client has to be the browser here. Further, if this is simply
a browser sending and receiving data to and from a server, there are
already a host of well-defined and broadly-deployed web technologies
that they could use instead of CoAP.

That means that the only sensible conclusion is that the server is
one of: (a) a proxy for constrained device(s) that the browser wants to
contact, (b) a proxy for non-constrained device(s) that speak only CoAP
(presumably so constrained devices can also connect to it), or (c) both.
In all three cases, the proxy needs to have its own mapping that
converts from resources at its own authority to the corresponding
resources on the authority/ies for which it is serving as a proxy. This
is important, and we'll come back to it later.

As far as I can tell, that is the *only* context in which one might use
CoAP/WebSockets. I mean, if there's some other architecture that makes
sense here, I'd like to hear about it, but I think the logic above is
pretty sound.

In that context, then: a CoAP node implemented inside a web browser that
gets ahold of a coap: or coaps: URI can do precisely one thing: use
WebSockets to connect to the authority, using the CoAP/WebSockets
semantics defined in this document. It can't try UDP or TCP, because
these affordances aren't available from a web execution context. So if
the authority in the URL doesn't support CoAP/WebSockets, the client is
done.

Now, what about other, non-browser nodes that happen to get ahold of
these URIs? Well, now we're back to the fact that we made UDP MTI: these
same servers are listening for normal CoAP over UDP (and maybe TCP, if
they want to). So, remember how we said that this thing is necessarily a
proxy fronting other resources, and it has its own mapping from its
local resources to remote ones? That's what makes the whole UDP and TCP
thing work: it just needs to act like the same proxy for CoAP/UDP and
CoAP/TCP as it does for CoAP/WS.

_____

[1] RFC section 4.1: "SIP and SIPS URIs that are identical except for
the scheme itself (e.g., sip:alice@example.com and
sips:alice@example.com) refer to the same resource."

[2] This is, I think, architecturally consistent with what's already
been done. In fact, this may already be the case; I can't quickly find
language covering this.

[3] The operation on the TCP connection would need to use the same
Message ID as the UDP attempts did, to ensure idempotency -- it's
possible that one or more of the transmitted UDP packets did make it to
the remote node, but the ACKs did not arrive back at the originator.
This does necessitate un-removing the Message ID field from TCP, but I
think that's a Good Thing anyway.
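
As a rough illustration of the idempotency point in [3], the sketch
below shows a server that remembers the response it produced for a given
Message ID and replays it when the same request arrives again over a
different transport. Everything here is hypothetical: the class and
method names are invented, and how the server recognizes the same peer
across UDP and TCP (the "peer" argument) is an open question that this
sketch simply assumes has been solved.

```python
class DedupServer:
    """Hypothetical server-side deduplication keyed on (peer, Message ID)."""

    def __init__(self, handler):
        self._handler = handler   # function that actually performs the request
        self._responses = {}      # (peer, message_id) -> cached response
        self.executions = 0       # how many times the handler really ran

    def handle(self, peer, message_id, request):
        key = (peer, message_id)
        if key not in self._responses:
            # First time this Message ID is seen from this peer: execute it.
            self.executions += 1
            self._responses[key] = self._handler(request)
        # A repeat (for example, the TCP retry of a UDP request whose ACK
        # was lost) gets the stored response and is not executed again.
        return self._responses[key]


if __name__ == "__main__":
    server = DedupServer(lambda request: f"done: {request}")
    server.handle("client-a", 0x1234, "POST /door/open")   # arrived over UDP
    server.handle("client-a", 0x1234, "POST /door/open")   # retried over TCP
    print(server.executions)
```

A real implementation would also need to expire cached entries after
some lifetime, which is omitted here.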
