Re: [core] Adam Roach's Discuss on draft-ietf-core-coap-tcp-tls-08: (with DISCUSS and COMMENT)

Hannes Tschofenig <hannes.tschofenig@gmx.net> Tue, 09 May 2017 08:10 UTC

To: Adam Roach <adam@nostrum.com>, The IESG <iesg@ietf.org>
References: <149430548476.30014.11810513211435340238.idtracker@ietfa.amsl.com>
Cc: core-chairs@ietf.org, draft-ietf-core-coap-tcp-tls@ietf.org, core@ietf.org
From: Hannes Tschofenig <hannes.tschofenig@gmx.net>
Openpgp: id=071A97A9ECBADCA8E31E678554D9CEEF4D776BC9
Message-ID: <042ec343-be37-9e78-352a-8507c77c3205@gmx.net>
Date: Tue, 09 May 2017 10:09:51 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <149430548476.30014.11810513211435340238.idtracker@ietfa.amsl.com>
Content-Type: multipart/signed; micalg="pgp-sha512"; protocol="application/pgp-signature"; boundary="RqIEUurLPN6cTLcWCMCMTdV1pN4fJMvQp"
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/x5vWm_NfRXFJeTe5elRITymMJ5Q>
Subject: Re: [core] Adam Roach's Discuss on draft-ietf-core-coap-tcp-tls-08: (with DISCUSS and COMMENT)
Precedence: list

Hi Adam,

thanks for your review.

A few comments inline:

> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> - Part of the document is outside the scope of the charter of the WG
> which requested its publication
> 
> While I understand that this document requires a WebSockets mechanism for
> .well-known, and that such a mechanism doesn’t yet exist, it seems pretty
> far out of scope for the CORE working group to take on defining this
> itself (unless I missed something in its charter, which is entirely
> possible: it’s quite long). Specifically, I fear that this venue is
> unlikely to bring such a change to the attention of those people best
> positioned to comment on whether .well-known is appropriate for
> WebSockets.
> 
> Even if this is in scope for CORE, it really needs to be its own
> document. If some future document comes along at a later point and wants
> to make use of its own .well-known path with WebSockets, it would be
> really quite strange to require it to reference this document in
> describing .well-known for WS.
> 

The authors of the document have different views about the inclusion of
the support of WebSockets in the document. I leave it to the responsible
AD to decide what the best document structure is and what is indeed
covered as part of the CORE working group charter.

> 
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------

You have a couple of comments, namely

 * Variable length format

The group decided to have a variable-length format. I argued for a
fixed-size length format and lost the argument.
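
To make the trade-off concrete, here is a minimal Python sketch of the
two framings. It assumes the variable-length scheme mirrors the CoAP
option-length encoding: a 4-bit Len field where the values 13, 14 and 15
select a 1-, 2- or 4-byte extended length. The function names and the
thresholds as written are assumptions for illustration and should be
checked against the draft.

```python
import struct
from typing import Tuple


def encode_variable_length(n: int) -> Tuple[int, bytes]:
    """Return (len_nibble, extended_length_bytes) for a body of n bytes.

    Assumed thresholds: 13, 269 and 65805, as in the CoAP option-length
    encoding. Values below 13 fit in the nibble itself.
    """
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 13:
        return n, b""
    if n < 269:
        return 13, struct.pack("!B", n - 13)
    if n < 65805:
        return 14, struct.pack("!H", n - 269)
    if n - 65805 > 0xFFFFFFFF:
        raise ValueError("length too large for a 32-bit extension")
    return 15, struct.pack("!I", n - 65805)


def encode_fixed_length(n: int) -> bytes:
    """The fixed-size alternative: always a four-byte big-endian length."""
    return struct.pack("!I", n)
```

The variable form saves up to four bytes on small messages at the cost
of a branch per size class; the fixed form is a single pack call.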

 * Gateways and their complexity

We are using gateway functionality in our deployments today, but these
gateways are not just simple protocol translators as described in RFC
7252 or RFC 8075. Instead, the two protocols on each side of the gateway
have different semantics and functionality. As such, the considerations
in those two RFCs don't apply to us and we are not seeing any of that
complexity.

 * Too many transport options

We care only about CoAP over TLS. We are not going to use the WebSockets
part of the document. In practice for many companies there will not be a
problem with too many transports since they will only use specific ones
in their deployment.

 * Block-wise transport with CoAP over TCP

Maybe this needs to be better explained but CoAP is tailored to small
data transmissions only. Unfortunately, there are some larger payloads
to be shuffled around as well, particularly firmware updates.

When RFC 7959 is used with TCP, we found that the performance is quite
bad since the block-wise transfer spec limits the size of the chunks to
a really small size (at most 1024 bytes). The addition in this spec is
to increase the size of the chunks.
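
A rough sketch of why this hurts for firmware updates: RFC 7959 encodes
the block size as 2**(SZX + 4) with SZX in the range 0 to 6, and each
block is carried in its own exchange. The 512 KiB image size used in the
usage note is a made-up figure for illustration, and the function names
are hypothetical.

```python
MAX_SZX = 6  # SZX 7 is reserved in RFC 7959


def block_size(szx: int) -> int:
    """Block size in bytes for a given SZX value (RFC 7959)."""
    if not 0 <= szx <= MAX_SZX:
        raise ValueError("SZX must be between 0 and 6")
    return 2 ** (szx + 4)


def blocks_needed(payload_len: int, szx: int) -> int:
    """Number of blocks (and thus exchanges) to move payload_len bytes."""
    size = block_size(szx)
    return -(-payload_len // size)  # ceiling division
```

For example, blocks_needed(512 * 1024, 6) gives 512 exchanges for a
hypothetical 512 KiB image at the largest block size.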

I will see whether the text can be improved to get this message across.

More below on your specific comments:

> 
> General — this is a very bespoke approach to what could have been mostly
> solved with a single four-byte “length” header; it is complicated on the
> wire, and in implementation; and the format variations among CoAP over
> UDP, coap+tls, and coap+ws are going to make gateways much harder to
> implement and less efficient (as they will necessarily have to
> disassemble messages and rebuild them to change between formats). The
> protocol itself mentions gateways in several places, but does not discuss
> how they are expected to map among the various flavors of CoAP defined in
> this document. Some of the changes seem unnecessary, but it could be that
> I’m missing the motivation for them. Ideally, the introduction would work
> harder at explaining why CoAP over these transports is as different from
> CoAP over UDP as it is, focusing in particular on why the complexity of
> having three syntactically incompatible headers is justified by the
> benefits provided by such variations.
> 
> Additionally, it’s not clear from the introduction what the motivation
> for using the mechanisms in this document is as compared to the
> techniques described in section 10 (and its subsections) of RFC 7252.
> With the exception of subscribing to resource state (which could be
> added), it seems that such an approach is significantly easier to
> implement and more clearly defined than what is in this document; and it
> appears to provide the combined benefits of all four transports discussed
> in this document. My concern here is that an explosion of transport
> options makes it less likely that a client and server can find two in
> common: the limit of the probability of two implementations having a
> transport in common as the number of transports approaches infinity is
> zero. Due to this likely decrease in interoperability, I’d expect to see
> some pretty powerful motivation in here for defining a third, fourth,
> fifth, and sixth way to carry CoAP when only TCP is available (I count
> RFC 7252 http and https as the first and second ways in this
> accounting).
> 
> I’m also a bit puzzled that CoAP already has an inherent mechanism for
> blocking messages off into chunks, which this document circumvents for
> TCP connections (by allowing Max-Message-Size to be increased), and then
> is forced to offer remedies for the resultant head-of-line blocking
> issues. If you didn’t introduce this feature, messages with a two-byte
> token add six bytes of overhead for every 1024 bytes of content — less
> than 0.6% size inflation. It seems like a lot of complicated machinery —
> which has a built-in foot-gun that you have to warn people about misusing
> — for a very tiny gain. I know it’s relatively late in the process, but
> if these trade-offs haven't had a lot of discussion yet, it’s probably
> worth at least giving them some additional thought.
> 
> I’ll note that the entire BERT mechanism seems to fall into the same trap
> of adding extra complexity for virtually nonexistent savings. CoAP
> headers are, by design, tiny. It seems like a serious over-optimization
> to try to eliminate them in this fashion. In particular, you’re making
> the actual implementation code larger to save a trivial number of bits on
> the wire; I was under the impression that many of the implementation
> environments CoAP is intended for had some serious on-chip restrictions
> that would point away from this kind of additional complexity.
> 
> Specific comments follow.
> 
> Section 3.3, paragraph 3 says that an initiator may send messages prior
> to receiving the remote side’s CSM, even though the message may be larger
> than would be allowed by that CSM.  What should the recipient of an
> oversized message do in this case? In fact, I don’t see in here what a
> recipient of a message larger than it allowed for in its CSM is supposed
> to do in response at *any* stage of the connection. Is it an error? If
> so, how do you indicate it? Or is the Max-Message-Size option just a
> suggestion for the other side? This definitely needs clarification.
> (Aside — it seems odd and somewhat backwards that TCP connections are
> provided an affordance for fine-grained control over message sizes, while
> UDP communications are not.)

I personally would set a minimum requirement for the size of message the
remote side needs to support. Thereby, the initiator can be sure that
messages up to a certain size are supported. If it wants to send larger
messages then it has to wait until the remote side provides its CSM.

In our environment this would not be a problem, since the TCP server is
not on the IoT device but rather on the cloud-based (or on-premises)
server. The TCP client is running on the IoT device.
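
The rule suggested above could be sketched as follows. This is only an
illustration of the idea: the class name, the method names and the
1152-byte figure are assumptions made for the example, not values taken
from this thread.

```python
from typing import Optional

# Hypothetical guaranteed minimum; the actual value would be set by the spec.
ASSUMED_MIN_MESSAGE_SIZE = 1152


class PeerLimits:
    """Tracks what the initiator may send before and after the peer's CSM."""

    def __init__(self) -> None:
        # None means the peer's CSM has not been received yet.
        self.max_message_size: Optional[int] = None

    def on_csm(self, max_message_size: int) -> None:
        """Record the Max-Message-Size announced in the peer's CSM."""
        self.max_message_size = max_message_size

    def may_send(self, size: int) -> bool:
        """Allow sizes up to the guaranteed minimum until the CSM arrives."""
        if self.max_message_size is None:
            return size <= ASSUMED_MIN_MESSAGE_SIZE
        return size <= self.max_message_size
```

An initiator would call may_send() before each message and hold back
anything larger than the minimum until on_csm() has been called.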

> 
> Section 4.4 has a prohibition against using WebSockets keepalives in
> favor of using CoAP ping/pong. Section 3.4 has no similar prohibition
> against TCP keepalives, while the rationale would seem to be identical.
> Is this asymmetry intentional? (I’ll also note that the presence of
> keepalive mechanisms in both TCP and WebSockets would seem to make the
> addition of new CoAP primitives for the same purpose unnecessary, but I
> suspect this has already been debated).

The issue was that TCP keepalives sometimes get blocked or modified by
firewalls, whereas the CoAP ping/pong on top of TLS won't be.

> 
> Section 5 and its subsections define a new set of message types,
> presumably for use only on connection-oriented protocols, although this
> is only implied, and never stated. For example, some implementors may see
> CSM, Ping, and Pong as potentially useful in UDP; and, finding no
> prohibition in this document against using them, decide to give it a go.
> Is that intended? If not, I strongly suggest an explicit prohibition
> against using these in UDP contexts.

I believe a similar issue came up recently (provided by Jim) when he was
asking whether this mechanism is also applicable to a transport over SMS.

If there is functionality in the document that is useful for other
transports in the future then that's great. I wouldn't rule out such use
just because we cannot imagine it today.

> 
> Section 5.3.2 says that implementations supporting block-wise transfers
> SHOULD indicate the Block-wise Transfer Option. I can't figure out why
> this is anything other than a "MUST". It seems odd that this document
> would define a way to communicate this, and then choose to leave the
> communicated options as “YES” and “YOUR GUESS IS AS GOOD AS MINE” rather
> than the simpler and more useful “YES” and “NO”.

Sounds reasonable to me.

> 
> I find the described operation of the Custody Option in the operation of
> Ping and Pong to be somewhat problematic: it allows the Pong sender to
> unilaterally decide to set the Custody Option, and consequently
> quarantine the Pong for an arbitrary amount of time while it processes
> other operations. This seems impossible to distinguish from a
> failure-due-to-timeout from the perspective of the Ping sender. Why not
> limit this behavior only to Ping messages that include the Custody
> Option?

I will pass this on to Carsten, who I believe wrote this text.

> 
> I find the unmotivated definition of the default port for “coaps+tcp” to
> 443 — a port that is already assigned to https — to be surprising, to put
> it mildly. This definitely needs motivating text, and I suspect it's
> actually wrong.

If we don't do this then we do not get through firewalls.

> 
> I am similarly perplexed by the hard-coded “must do ALPN *unless* the
> designated port takes the magical value 5684” behavior. I don’t think
> I’ve ever seen a protocol that has such variation based on a hard-coded
> port number, and it seems unlikely to be deployed correctly (I’m imaging
> the frustration of: “I changed both the server and the client
> configuration from the default port of 5684 to 49152, and it just stopped
> working. Like, literally the *only* way it works is on port 5684. I've
> checked firewall settings everywhere and don't see any special handling
> for that port -- I just can't figure this out, and it's driving me
> crazy.”). Given the nearly universal availability of ALPN in pretty much
> all modern TLS libraries, it seems much cleaner to just require ALPN
> support and call it done. Or *don’t* require ALPN at all and call it
> done. But *changing* protocol behavior based on magic port numbers seems
> like it’s going to cause a lot of operational heartburn.

It is fine for me to require ALPN always.
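
For what "require ALPN always" might look like on the client side, here
is a small sketch using Python's standard ssl module. It assumes the
ALPN protocol identifier is "coap"; the helper names are hypothetical,
and no port number appears in the logic.

```python
import ssl

COAP_ALPN_ID = "coap"  # assumed ALPN protocol identifier


def make_client_context() -> ssl.SSLContext:
    """TLS client context that always offers the CoAP ALPN identifier."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols([COAP_ALPN_ID])
    return ctx


def check_negotiated(tls_sock: ssl.SSLSocket) -> None:
    """Reject the connection unless the peer selected the CoAP identifier."""
    if tls_sock.selected_alpn_protocol() != COAP_ALPN_ID:
        raise ConnectionError("peer did not negotiate the CoAP ALPN identifier")
```

The check is the same whether the server listens on 443, 5684 or any
other port, which avoids the port-dependent behaviour criticised above.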

> 
> The final paragraph of section 8.1 is very confusing, making it somewhat
> unclear which of the three modes must be implemented on a CoAP client,
> and which must be implemented on a CoAP server. Read naïvely, this sounds
> like clients are required to do only one (but one of their choosing) of
> these three, while servers are required to also do only one (again, of
> their choosing). It seems that the chance of finding devices that could
> interoperate under such circumstances is going to be relatively low: to
> work together, you would have to find a client and a server that happened
> to make the same implementation choice among these three. What I’m used
> to in these kinds of cases is: (a) server must implement all, client can
> choose to implement only one (or more), (b) client must implement all,
> server can choose to implement only one (or more), or (c) client and
> server must implement a specifically named lowest-common denominator, and
> can negotiate up from there. Pretty much anything else (aside from
> strange “everyone must implement two of three” schemes) will end up with
> interop issues.

This follows what is said in the CoAP specification. Even the HTTP spec
does not go into such a level of detail.

In terms of interoperability, IoT devices that implement pre-shared
secrets are clearly not going to talk to devices that only implement
certificates. This is, however, not a real interoperability issue since
the use of CoAP will most likely be part of a device management
framework like LwM2M or what the OIC is working on. Companies deploying
IoT devices then need to figure out what they want to accomplish and
what security threats they care about.


> 
> Although the document clearly expects the use of gateways and proxies
> between these connection-oriented usages of CoAP and UDP-based CoAP,
> Appendix A seems to omit discussion or consideration of how this
> gatewaying can be performed. The following list of problems is
> illustrative of this larger issue, but likely not exhaustive. (I'll note
> that all of these issues evaporate if you move to a simpler scheme that
> merely frames otherwise unmodified UDP CoAP messages)

As mentioned before, I personally don't see any issue with this at all
since we are not shuffling CoAP over UDP on one side to CoAP over TCP on
the other side. In fact, I don't know anyone doing that.

~snip~

Ciao
Hannes

Attachment: signature.asc
