idnits 2.17.1 

draft-ietf-lwig-coap-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 02, 2018) is 2124 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: '0x1234' is mentioned on line 352, but not defined

  == Missing Reference: '0x4711' is mentioned on line 366, but not defined

  == Missing Reference: '0x7a10' is mentioned on line 413, but not defined

  == Missing Reference: '0x23bb' is mentioned on line 431, but not defined

  == Missing Reference: '0x7a11' is mentioned on line 426, but not defined

  ** Obsolete normative reference: RFC 7230 (Obsoleted by RFC 9110, RFC 9112)

  == Outdated reference: A later version (-14) exists of
     draft-ietf-core-coap-pubsub-04

  == Outdated reference: A later version (-14) exists of
     draft-ietf-core-echo-request-tag-02


     Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	LWIG Working Group                                           M. Kovatsch
3	Internet-Draft                                                ETH Zurich
4	Intended status: Informational                               O. Bergmann
5	Expires: January 3, 2019                                 C. Bormann, Ed.
6	                                                 Universitaet Bremen TZI
7	                                                           July 02, 2018

9	                      CoAP Implementation Guidance
10	                        draft-ietf-lwig-coap-06

12	Abstract

14	   The Constrained Application Protocol (CoAP) is designed for resource-
15	   constrained nodes and networks such as sensor nodes in a low-power
16	   lossy network (LLN).  Yet implementing this Internet protocol on
17	   Class 1 devices (as per RFC 7228, ~ 10 KiB of RAM and ~ 100 KiB of
18	   ROM) also requires lightweight implementation techniques.  This
19	   document provides lessons learned from implementing CoAP for tiny,
20	   battery-operated networked embedded systems.  In particular, it
21	   provides guidance on correct implementation of the CoAP specification
22	   RFC 7252, memory optimizations, and customized protocol parameters.

24	Status of This Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at https://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on January 3, 2019.

41	Copyright Notice

43	   Copyright (c) 2018 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents
48	   (https://trustee.ietf.org/license-info) in effect on the date of
49	   publication of this document.  Please review these documents
50	   carefully, as they describe your rights and restrictions with respect
51	   to this document.  Code Components extracted from this document must
52	   include Simplified BSD License text as described in Section 4.e of
53	   the Trust Legal Provisions and are provided without warranty as
54	   described in the Simplified BSD License.

56	Table of Contents

58	   1.  Introduction
59	   2.  Protocol Implementation
60	     2.1.  Client/Server Model
61	     2.2.  Message Processing
62	       2.2.1.  On-the-fly Processing
63	       2.2.2.  Internal Data Structure
64	     2.3.  Message ID Usage
65	       2.3.1.  Duplicate Rejection
66	       2.3.2.  MID Namespaces
67	       2.3.3.  Relaxation on the Server
68	       2.3.4.  Relaxation on the Client
69	     2.4.  Token Usage
70	       2.4.1.  Tokens for Observe
71	       2.4.2.  Tokens for Blockwise Transfers
72	     2.5.  Transmission States
73	       2.5.1.  Request/Response Layer
74	       2.5.2.  Message Layer
75	     2.6.  Out-of-band Information
76	     2.7.  Programming Model
77	       2.7.1.  Client
78	       2.7.2.  Server
79	   3.  Optimizations
80	     3.1.  Message Buffers
81	     3.2.  Retransmissions
82	     3.3.  Observable Resources
83	     3.4.  Blockwise Transfers
84	       3.4.1.  Generic Proxying of Block Messages
85	       3.4.2.  Atomic Blockwise Operations
86	     3.5.  Deduplication with Sequential MIDs
87	   4.  Alternative Configurations
88	     4.1.  Transmission Parameters
89	     4.2.  CoAP over IPv4
90	   5.  Binding to specific lower-layer APIs
91	     5.1.  Berkeley Socket Interface
92	       5.1.1.  Responding from the right address
93	       5.1.2.  Handling ICMP errors
94	     5.2.  Java
95	     5.3.  Multicast detection
96	     5.4.  DTLS

98	   6.  CoAP on various transports
99	     6.1.  CoAP over reliable transports
100	     6.2.  Translating between transports
101	       6.2.1.  Transport translation by proxies
102	       6.2.2.  One-to-one Transport translation
103	   7.  IANA considerations
104	   8.  Security considerations
105	   9.  Acknowledgements
106	   10. References
107	     10.1.  Normative References
108	     10.2.  Informative References
109	   Authors' Addresses

111	1.  Introduction

113	   The Constrained Application Protocol [RFC7252] has been designed
114	   specifically for machine-to-machine communication in networks with
115	   very constrained nodes.  Typical application scenarios therefore
116	   include building automation, process optimization, and the Internet
117	   of Things.  The major design objectives were a small protocol
118	   overhead and robustness against packet loss and against the high
119	   latency induced by small bandwidth shares or slow request processing
120	   in end nodes.  To ease the integration of constrained nodes with the
121	   world-wide Internet, the protocol design was guided by the REST
122	   architectural style, which accounts for the scalability and
123	   robustness of the Hypertext Transfer Protocol [RFC7230].

125	   Lightweight implementations benefit from this design in many
126	   respects: First, the use of Uniform Resource Identifiers (URIs) for
127	   naming resources and the transparent forwarding of their
128	   representations in a server-stateless request/response protocol make
129	   protocol translation to HTTP a straightforward task.  Second, the set
130	   of protocol elements that are unavoidable for the core protocol, and
131	   thus must be implemented on every node, has been kept very small,
132	   minimizing the unnecessary accumulation of "optional" features.
133	   Options that - when present - are critical for message processing are
134	   explicitly marked as such to force immediate rejection of messages
135	   with unknown critical options.  Third, the syntax of protocol data
136	   units is easy to parse and is carefully defined to avoid creation of
137	   state in servers where possible.

139	   Although these features enable lightweight implementations of the
140	   Constrained Application Protocol, there is still a tradeoff between
141	   robustness and latency of constrained nodes on one hand and resource
142	   demands on the other.  For constrained nodes of Class 1 or even
143	   Class 2 [RFC7228], the most limiting factors usually are dynamic
144	   memory needs, static code size, and energy.  Most implementations
145	   therefore need to optimize internal buffer usage, omit idle protocol
146	   features, and maximize sleeping cycles.

148	   The present document gives possible strategies to solve this tradeoff
149	   for very constrained nodes (i.e., Class 1).  For this, it provides
150	   guidance on correct implementation of the CoAP specification
151	   [RFC7252], memory optimizations, and customized protocol parameters.

153	2.  Protocol Implementation

155	   In the programming styles supported by very simple operating systems
156	   as found on constrained nodes, preemptive multi-threading is not an
157	   option.  Instead, all operations are triggered by an event loop
158	   system, e.g., in a send-receive-dispatch cycle.  It is also common
159	   practice to allocate memory statically to ensure stable behavior, as
160	   no memory management unit (MMU) or other abstractions are available.
161	   For a CoAP node, the two key parameters for memory usage are the
162	   number of (re)transmission buffers and the maximum message size that
163	   must be supported by each buffer.  Often the maximum message size is
164	   set far below the 1280-byte MTU of 6LoWPAN to allow more than one
165	   open Confirmable transmission at a time (in particular for parallel
166	   observe notifications [RFC7641]).  Note that implementations on
167	   constrained platforms often do not even support the full MTU.  Larger
168	   messages must then use blockwise transfers [RFC7959], while a good
169	   tradeoff between 6LoWPAN fragmentation and CoAP header overhead must
170	   be found.  Usually the amount of available free RAM dominates this
171	   decision.  For Class 1 devices, the maximum message size is typically
172	   128 or 256 bytes (blockwise) payload plus an estimate of the maximum
173	   header size for the worst case option setting.
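
   The two memory parameters named above can be fixed at compile time.
   The following is a minimal sketch, not taken from any particular
   implementation: the header estimate (48 bytes), the pool size (4),
   and the names tx_buffer, tx_alloc, and tx_free are assumptions
   chosen for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define COAP_BLOCK_SIZE    128  /* blockwise payload carried per message */
#define COAP_MAX_HEADER     48  /* assumed worst-case header + options estimate */
#define COAP_MAX_MSG_SIZE  (COAP_MAX_HEADER + COAP_BLOCK_SIZE)
#define COAP_TX_BUFFERS      4  /* assumed number of open transmissions */

/* One statically allocated (re)transmission buffer. */
struct tx_buffer {
    uint8_t  data[COAP_MAX_MSG_SIZE];
    uint16_t len;
    uint8_t  in_use;
};

static struct tx_buffer tx_pool[COAP_TX_BUFFERS];

/* Hand out a free buffer, or NULL when all transmissions are in flight. */
struct tx_buffer *tx_alloc(void)
{
    for (size_t i = 0; i < COAP_TX_BUFFERS; i++) {
        if (!tx_pool[i].in_use) {
            tx_pool[i].in_use = 1;
            tx_pool[i].len = 0;
            return &tx_pool[i];
        }
    }
    return NULL;
}

/* Return a buffer to the pool once the exchange has completed. */
void tx_free(struct tx_buffer *b)
{
    b->in_use = 0;
}
```

   With these example values the pool occupies roughly 4 * 180 bytes of
   RAM.  Halving COAP_BLOCK_SIZE or COAP_TX_BUFFERS trades throughput
   or parallelism for memory.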

175	2.1.  Client/Server Model

177	   In general, CoAP servers can be implemented more efficiently than
178	   clients.  REST allows them to keep the communication stateless and
179	   piggy-backed responses are not stored for retransmission, saving
180	   buffer space.  The use of idempotent requests also allows relaxing
181	   deduplication, which further decreases memory usage.  It is also easy
182	   to estimate the required maximum size of message buffers, since URI
183	   paths, supported options, and maximum payload sizes of the
184	   application are known at compile time.  Hence, when the application
185	   is distributed over constrained and unconstrained nodes, the
186	   constrained ones should preferably have the server role.

188	   HTTP-based applications have established an inverse model because of
189	   the need for simple push notifications: A constrained client uses
190	   POST requests to update resources on an unconstrained server whenever
191	   an event (e.g., a new sensor reading) is triggered.  This requirement
192	   is solved by the Observe option [RFC7641] of CoAP.  It allows servers
193	   to initiate communication and send push notifications to interested
194	   client nodes.  This allows a more efficient and also more natural
195	   model for CoAP-based applications, where the information source is an
196	   origin server, which can also benefit from caching.  Publish-
197	   subscribe brokers [I-D.ietf-core-coap-pubsub] may be deployed to act
198	   in the server role on behalf of constrained clients.

200	2.2.  Message Processing

202	   Apart from the required buffers, message processing is symmetric for
203	   clients and servers.  First, the base header has to be parsed to
204	   check whether the data is a valid CoAP message.  For UDP datagrams,
205	   an unknown version identifier or a size smaller than four bytes
206	   identifies non-CoAP data.  These datagrams need to be silently
207	   ignored.  Other message format errors, such as an incomplete
208	   datagram or the usage of reserved values, may need to be rejected
209	   with a Reset (RST) message (see Sections 4.2 and 4.3 of [RFC7252]).

211	   As CoAP over TCP has a different base header, the Length field must
212	   be parsed to determine the message size.  As this field may have up
213	   to five bytes, it may extend across TCP segment boundaries.  For
214	   CoAP over WebSockets, the actual message length is given by the
215	   WebSocket frame; hence, the Length field is always zero.

217	   Next, the token length is read from the TKL field, which for all
218	   transports is contained in the four least significant bits of the
219	   first byte.  The (possibly empty) Token itself is located immediately
220	   after the four-byte base header for UDP, while for TCP and
221	   WebSockets, it follows the variable Length field and Code byte.
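
   For the UDP case, the checks described above might look as follows.
   This is a sketch under stated assumptions: the type and function
   names are hypothetical, and the choice between ignoring a message
   and answering with RST is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Outcome of inspecting the fixed part of a CoAP-over-UDP message. */
enum coap_parse_result {
    COAP_PARSE_OK,      /* base header and Token are well-formed */
    COAP_PARSE_IGNORE,  /* not CoAP: silently ignore the datagram */
    COAP_PARSE_FORMAT   /* message format error: may be rejected with RST */
};

struct coap_header {
    uint8_t  version;      /* 2 bits; 1 for RFC 7252 */
    uint8_t  type;         /* 0 = CON, 1 = NON, 2 = ACK, 3 = RST */
    uint8_t  tkl;          /* Token length, 0..8 */
    uint8_t  code;         /* request method or response code */
    uint16_t mid;          /* Message ID in host byte order */
    const uint8_t *token;  /* points into the receive buffer, no copy */
};

enum coap_parse_result
coap_parse_udp_header(const uint8_t *buf, size_t len, struct coap_header *h)
{
    if (len < 4)
        return COAP_PARSE_IGNORE;        /* shorter than the base header */

    h->version = buf[0] >> 6;
    if (h->version != 1)
        return COAP_PARSE_IGNORE;        /* unknown version */

    h->type = (buf[0] >> 4) & 0x03;
    h->tkl  = buf[0] & 0x0f;             /* four least significant bits */
    h->code = buf[1];
    h->mid  = (uint16_t)((buf[2] << 8) | buf[3]);

    /* TKL values 9-15 are reserved; the Token must fit the datagram. */
    if (h->tkl > 8 || len < 4u + h->tkl)
        return COAP_PARSE_FORMAT;

    h->token = buf + 4;                  /* immediately after the header */
    return COAP_PARSE_OK;
}
```

   The options, if any, start at offset 4 + tkl in the same buffer.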

223	   For the options following the Token, there are two alternatives:
224	   either process them on the fly when an option is accessed or
225	   initially parse all values into an internal data structure.

227	2.2.1.  On-the-fly Processing

229	   The advantage of on-the-fly processing is that no additional memory
230	   needs to be allocated to store the option values, which are stored
231	   efficiently inline in the buffer for incoming messages.  Once the
232	   message is accepted for further processing, the set of options
233	   contained in the received message must be decoded to check for
234	   unknown critical options.  To avoid multiple passes through the
235	   option list, the option parser might maintain a bit-vector where each
236	   bit represents an option number that is present in the received
237	   request.  With the wide and sparse range of option numbers, the
238	   number itself cannot be used to indicate the number of left-shift
239	   operations to mask the corresponding bit.  Hence, an implementation-
240	   specific enum of supported options should be used to mask the present
241	   options of a message in the bitmap.  In addition, the byte index of
242	   every option (a direct pointer) can be added to a sparse list (e.g.,
243	   a one-dimensional array) for fast retrieval.

245	   This particularly enables efficient handling of options that might
246	   occur more than once such as Uri-Path.  In this implementation
247	   strategy, the delta is zero for any subsequent path segment, hence
248	   the stored byte index for this option (e.g., 11 for Uri-Path) would
249	   be overwritten to hold a pointer to only the last occurrence of that
250	   option.  The Uri-Path can be resolved on the fly, though, and a
251	   pointer to the targeted resource stored directly in the sparse list.

253	   Once the option list has been processed, all known critical options
254	   and all elective options can be masked out in the bit-vector to
255	   determine if any unknown critical option was present.  If this is the
256	   case, this information can be used to create a 4.02 response
257	   accordingly.  Note that full processing is only needed up to the
258	   highest supported option number.  Beyond that, only the least
259	   significant bit (Critical or Elective) needs to be checked.
260	   Otherwise, if all critical options are supported, the sparse list of
261	   option pointers is used for further handling of the message.
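
   A single pass over the option list along these lines could be
   sketched as below.  The set of supported options, the enum, and all
   names are assumptions made for illustration.  As a simplification,
   the sketch flags an unknown critical option directly when it is
   encountered instead of masking the bit-vector afterwards, and byte
   offsets are relative to the first option and assumed to fit in 16
   bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Implementation-specific indices of the options this node supports. */
enum opt_index { OPT_URI_PATH, OPT_CONTENT_FORMAT, OPT_URI_QUERY, OPT_COUNT };

/* Map a (sparse) option number to its dense bitmap index, or -1. */
static int opt_index_of(uint16_t number)
{
    switch (number) {
    case 11: return OPT_URI_PATH;
    case 12: return OPT_CONTENT_FORMAT;
    case 15: return OPT_URI_QUERY;
    default: return -1;
    }
}

struct opt_scan {
    uint32_t present;            /* bit i set: option with index i was seen */
    uint16_t offset[OPT_COUNT];  /* byte index of the last occurrence */
    int unknown_critical;        /* nonzero: respond with 4.02 */
};

/* Decode a 4-bit delta/length nibble plus its extension bytes, if any. */
static int read_ext(const uint8_t *buf, size_t len, size_t *pos,
                    uint8_t nibble, uint32_t *out)
{
    if (nibble < 13) { *out = nibble; return 0; }
    if (nibble == 13) {
        if (*pos + 1 > len) return -1;
        *out = 13u + buf[(*pos)++];
        return 0;
    }
    if (nibble == 14) {
        if (*pos + 2 > len) return -1;
        *out = 269u + (((uint32_t)buf[*pos] << 8) | buf[*pos + 1]);
        *pos += 2;
        return 0;
    }
    return -1;                   /* 15 is reserved */
}

/* Scan options up to the payload marker; -1 signals a format error. */
int coap_scan_options(const uint8_t *buf, size_t len, struct opt_scan *s)
{
    size_t pos = 0;
    uint32_t number = 0;

    s->present = 0;
    s->unknown_critical = 0;
    while (pos < len && buf[pos] != 0xff) {
        size_t start = pos;
        uint8_t b = buf[pos++];
        uint32_t delta, optlen;

        if (read_ext(buf, len, &pos, b >> 4, &delta) != 0 ||
            read_ext(buf, len, &pos, b & 0x0f, &optlen) != 0)
            return -1;
        if (optlen > len - pos)
            return -1;           /* value runs past the end of the message */

        number += delta;
        int idx = number <= 0xffff ? opt_index_of((uint16_t)number) : -1;
        if (idx >= 0) {
            s->present |= (uint32_t)1 << idx;
            s->offset[idx] = (uint16_t)start;
        } else if (number & 1) { /* odd option number: critical */
            s->unknown_critical = 1;
        }
        pos += optlen;
    }
    return 0;
}
```

   For a repeated option such as Uri-Path, offset[] ends up pointing at
   the last occurrence, which is the overwrite behavior discussed
   above.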

263	2.2.2.  Internal Data Structure

265	   Using an internal data structure for all parsed options has an
266	   advantage when working on the option values, as they are already in a
267	   variable of corresponding type (e.g., an integer in host byte order).
268	   The incoming payload and byte strings of the header can be accessed
269	   directly in the buffer for incoming messages using pointers (similar
270	   to on-the-fly processing).  This approach also benefits from a
271	   bitmap.  Otherwise special values must be reserved to encode an unset
272	   option, which might require a larger type than required for the
273	   actual value range (e.g., a 32-bit integer instead of 16-bit).

275	   Many of the byte strings (e.g., the URI) are usually not required
276	   when generating the response.  When all important values are copied
277	   (e.g., the Token, which needs to be mirrored), the internal data
278	   structure facilitates using the buffer for incoming messages also for
279	   the assembly of outgoing messages - which can be the shared IP buffer
280	   provided by the operating system.

282	   Setting options for outgoing messages is also easier with an internal
283	   data structure.  Application developers can set options independently
284	   of the option number and do not need to care about the order for
285	   the delta encoding.  The CoAP encoding is applied in a serialization
286	   step before sending.  In contrast, assembling outgoing messages with
287	   on-the-fly processing requires either extensive memmove operations to
288	   insert new options, or restrictions for developers to set options in
289	   their correct order.
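
   The serialization step can be sketched as a helper that appends one
   delta-encoded option.  It would be called once per set option, in
   ascending option-number order, by a serializer that walks the
   internal data structure.  The function names and the conservative
   five-byte headroom check are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write the extension bytes for a delta or length value, if needed,
 * and return the 4-bit nibble that goes into the option header byte. */
static uint8_t put_ext(uint8_t *buf, size_t *pos, uint32_t v)
{
    if (v < 13)
        return (uint8_t)v;
    if (v < 269) {
        buf[(*pos)++] = (uint8_t)(v - 13);
        return 13;
    }
    v -= 269;
    buf[(*pos)++] = (uint8_t)(v >> 8);
    buf[(*pos)++] = (uint8_t)v;
    return 14;
}

/* Append one option at buf[pos].  prev_number is the number of the
 * option written before (0 for the first).  Returns the new write
 * position, or 0 if the order is wrong or the buffer is too small. */
size_t coap_put_option(uint8_t *buf, size_t cap, size_t pos,
                       uint16_t prev_number, uint16_t number,
                       const void *value, uint16_t vlen)
{
    /* Worst case: 1 header byte + 2 delta bytes + 2 length bytes. */
    if (number < prev_number || cap < pos || cap - pos < 5u + vlen)
        return 0;

    size_t head = pos++;
    uint8_t d = put_ext(buf, &pos, (uint32_t)(number - prev_number));
    uint8_t l = put_ext(buf, &pos, vlen);

    buf[head] = (uint8_t)((d << 4) | l);
    if (vlen)
        memcpy(buf + pos, value, vlen);
    return pos + vlen;
}
```

   Because the delta is computed at serialization time, the application
   code that fills the internal structure never sees option numbers or
   their ordering.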

291	2.3.  Message ID Usage

293	   Many applications of CoAP use unreliable transports, in particular
294	   UDP, which can lose, reorder, and duplicate messages.  Although
295	   DTLS's replay protection deals with duplication by the network,
296	   losses are addressed with DTLS retransmissions only for the handshake
297	   protocol and not for the application data protocol.  Furthermore,
298	   CoAP implementations usually send CON retransmissions in new DTLS
299	   records, which are not considered duplicates at the DTLS layer.

301	2.3.1.  Duplicate Rejection

303	   CoAP's messaging sub-layer has been designed with protocol
304	   functionality such that rejection of duplicate messages is always
305	   possible.  It is realized through the Message IDs (MIDs) and their
306	   lifetimes with regard to the message type.

308	   Duplicate detection is at the discretion of the recipient (see
309	   Section 4.5 of [RFC7252], Section 2.3.3, Section 2.3.4).  Where it is
310	   desired, the receiver needs to keep track of MIDs to filter the
311	   duplicates for at least NON_LIFETIME (145 s).  This time also holds
312	   for CON messages, since it equals the possible reception window of
313	   MAX_TRANSMIT_SPAN + MAX_LATENCY.
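
   A receiver-side filter for incoming CON and NON messages could be
   sketched as a small table keyed by remote endpoint and MID.  The
   table size, the 32-bit endpoint handle, the seconds-based clock, and
   the eviction of the oldest entry when the table is full are
   assumptions for illustration.  Evicting a live entry weakens
   duplicate rejection, so the table should be sized for the expected
   load.

```c
#include <stddef.h>
#include <stdint.h>

#define DEDUP_SLOTS   8     /* assumed table size */
#define NON_LIFETIME  145u  /* seconds, default value from RFC 7252 */

struct dedup_entry {
    uint32_t peer;      /* assumed handle identifying the remote endpoint */
    uint16_t mid;       /* MID chosen by that endpoint */
    uint32_t seen_at;   /* time of first reception, in seconds */
    uint8_t  used;
};

static struct dedup_entry dedup_table[DEDUP_SLOTS];

/* Return 1 if (peer, mid) was already seen within NON_LIFETIME.
 * Otherwise remember it and return 0. */
int dedup_is_duplicate(uint32_t peer, uint16_t mid, uint32_t now)
{
    struct dedup_entry *victim = NULL;

    for (int i = 0; i < DEDUP_SLOTS; i++) {
        struct dedup_entry *e = &dedup_table[i];

        if (e->used && (uint32_t)(now - e->seen_at) >= NON_LIFETIME)
            e->used = 0;                        /* lifetime has passed */
        if (e->used && e->peer == peer && e->mid == mid)
            return 1;

        /* Prefer a free slot; otherwise remember the oldest entry. */
        if (victim == NULL ||
            (victim->used &&
             (!e->used || now - e->seen_at > now - victim->seen_at)))
            victim = e;
    }

    victim->peer = peer;
    victim->mid = mid;
    victim->seen_at = now;
    victim->used = 1;
    return 0;
}
```

   Only MIDs of incoming CONs and NONs go into this table.  MIDs of
   incoming ACKs and RSTs are matched against the node's own open
   transmissions instead.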

315	   On the sender side, MIDs of CON messages must not be re-used within
316	   the EXCHANGE_LIFETIME, and MIDs of NONs must not be re-used within
317	   NON_LIFETIME.  In typical scenarios, however, senders will re-use
318	   MIDs with intervals far larger than these lifetimes: with sequential
319	   assignment of MIDs, coming close to them would require 250 messages
320	   per second, much more than the bandwidth of constrained networks
321	   would usually allow for.

323	   In cases where senders might come closer to the maximum message rate,
324	   it is recommended to use more conservative timings for the re-use of
325	   MIDs.  Otherwise, opposite inaccuracies in the clocks of sender and
326	   recipient may lead to obscure message loss.  If needed, higher rates
327	   can be achieved by using multiple endpoints for sending requests and
328	   managing the local MID per remote endpoint instead of a single
329	   counter per system (essentially extending the 16-bit message ID by a
330	   16-bit port number and/or a 128-bit IP address).  In controlled
331	   scenarios, such as real-time applications over industrial Ethernet,
332	   the protocol parameters can also be tweaked to achieve higher message
333	   rates (Section 4.1).

335	2.3.2.  MID Namespaces

337	   MIDs are assigned under the control of the originator of CON and NON
338	   messages, and they do not mix with the MIDs assigned by the peer for
339	   CON and NON in the opposite direction.  Hence, CoAP implementors need
340	   to make sure to manage different namespaces for the MIDs used for
341	   deduplication.  MIDs of outgoing CONs and NONs belong to the local
342	   endpoint; so do the MIDs of incoming ACKs and RSTs.  Accordingly,
343	   MIDs of incoming CONs and NONs and outgoing ACKs and RSTs belong to
344	   the corresponding remote endpoint.  Figure 1 depicts a scenario where
345	   mixing the namespaces would cause erroneous filtering.

347	                    Client              Server
348	                       |                  |
349	                       |   CON [0x1234]   |
350	                       +----------------->|
351	                       |                  |
352	                       |   ACK [0x1234]   |
353	                       |<-----------------+
354	                       |                  |
355	                       |   CON [0x4711]   |
356	                       |<-----------------+ Separate response
357	                       |                  |
358	                       |   ACK [0x4711]   |
359	                       +----------------->|
360	                       |                  |
361	  A request follows that uses the same MID as the last separate response
362	                       |                  |
363	                       |   CON [0x4711]   |
364	                       +----------------->|
365	  Response is filtered |                  |
366	    because MID 0x4711 |   ACK [0x4711]   |
367	       is still in the X<-----------------+ Piggy-backed response
368	    deduplication list |                  |

370	    Figure 1: Deduplication must manage the MIDs in different namespaces
371	                 corresponding to their origin endpoints.

373	2.3.3.  Relaxation on the Server

375	   Using the de-duplication functionality is at the discretion of the
376	   receiver: Processing of duplicate messages comes at a cost, but so
377	   does the management of the state associated with duplicate rejection.
378	   The number of remote endpoints that need to be managed might be vast.
379	   This can be costly in particular for less constrained nodes that have
380	   throughput on the order of hundreds of thousands of requests per
381	   second (which needs about 16 GiB of RAM just for duplicate
382	   rejection).  Deduplication is also heavy for servers on Class 1
383	   devices, as piggy-backed responses also need to be stored in case
384	   the ACK is lost.  Hence, a receiver may have good reasons to decide
385	   not to perform deduplication.  This behavior is possible when the
386	   application is designed with idempotent operations only and makes
387	   good use of the If-Match/If-None-Match options.

389	   If duplicate rejection is indeed necessary (e.g., for non-idempotent
390	   requests) it is important to control the amount of state that needs
391	   to be stored.  It can be reduced, for instance, by deduplication at
392	   resource level: Knowledge of the application and supported
393	   representations can minimize the amount of state that needs to be
394	   kept.

396	2.3.4.  Relaxation on the Client

398	   Duplicate rejection on the client side can be simplified by choosing
399	   clever Tokens that are virtually never re-used (e.g., through an
400	   obfuscated sequence number in the Token value) and filtering only
401	   based on the list of open Tokens.  If a client wants to re-use Tokens
402	   (e.g., the empty Token for optimizations), it requires strict
403	   duplicate rejection based on MIDs to avoid the scenario outlined in
404	   Figure 2.

406	                           Client              Server
407	                              |                  |
408	                              |   CON [0x7a10]   |
409	                              |    GET /temp     |
410	                              |   (Token 0x23)   |
411	                              +----------------->|
412	                              |                  |
413	                              |   ACK [0x7a10]   |
414	                              |<-----------------+
415	                              |                  |
416	                              ... Time Passes  ...
417	                              |                  |
418	                              |   CON [0x23bb]   |
419	                              |  4.04 Not Found  |
420	                              |   (Token 0x23)   |
421	                              |<-----------------+
422	                              |                  |
423	                              |   ACK [0x23bb]   |
424	                              +--------X         |
425	                              |                  |
426	                              |   CON [0x7a11]   |
427	                              |   GET /resource  |
428	                              |   (Token 0x23)   |
429	                              +----------------->|
430	                              |                  |
431	                              |   CON [0x23bb]   |
432	          Causing an implicit |  4.04 Not Found  |
433	           acknowledgement if |   (Token 0x23)   |
434	         not filtered through X<-----------------+ Retransmission
435	          duplicate rejection |                  |

437	      Figure 2: Re-using Tokens requires strict duplicate rejection.

439	2.4.  Token Usage

441	   Tokens are chosen by the client and help to identify request/response
442	   pairs that span several message exchanges (e.g., a separate response,
443	   which has a new MID).  Servers do not generate Tokens and only mirror
444	   what they receive from the clients.  Tokens must be unique within the
445	   namespace of a client throughout their lifetime.  The lifetime
446	   begins when a Token is assigned to a request and ends when the open
447	   request is closed by receiving and matching the final response.
448	   Neither empty ACKs nor notifications (i.e., responses carrying the
449	   Observe option) terminate the lifetime of a Token.

451	   As already mentioned, a clever assignment of Tokens can help to
452	   simplify duplicate rejection.  Yet this is also important for coping
453	   with client crashes.  When a client restarts during an open request
454	   and (unknowingly) re-uses the same Token, it might match the response
455	   from the previous request to the current one.  Hence, when only the
456	   Token is used for matching, which is always the case for separate
457	   responses, randomized Tokens with enough entropy should be used.  The
458	   8-byte range for Tokens can even allow for one-time usage throughout
459	   the lifetime of a client node.  When DTLS is used, client crashes/
460	   restarts will lead to a new security handshake, thereby solving the
461	   problem of mismatching responses and/or notifications.
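
   One way to obtain Tokens that are not re-used within a boot cycle
   and that differ across restarts is to obfuscate a sequence number
   with a per-boot random value.  The sketch below is an assumption for
   illustration, not a mechanism defined by CoAP.  It uses the
   finalizer of the public-domain splitmix64 generator as an invertible
   mixing step, which is not cryptographically strong; in NoSec mode a
   keyed cipher or a true random source may be preferable.

```c
#include <stdint.h>

static uint64_t token_key;      /* per-boot secret, e.g., from a hardware RNG */
static uint64_t token_counter;  /* sequence number of issued Tokens */

/* Call once at startup with a fresh random value. */
void token_init(uint64_t boot_random)
{
    token_key = boot_random;
    token_counter = 0;
}

/* Invertible 64-bit mixer (splitmix64 finalizer): distinct inputs
 * always yield distinct outputs. */
static uint64_t mix64(uint64_t x)
{
    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* Produce the next 8-byte Token in network byte order. */
void token_next(uint8_t token[8])
{
    uint64_t t = mix64(token_counter++ ^ token_key);

    for (int i = 0; i < 8; i++)
        token[i] = (uint8_t)(t >> (56 - 8 * i));
}
```

   Since the mixer is a bijection, two Tokens issued in the same boot
   cycle differ as long as the counter has not wrapped.  Collisions
   across restarts depend on the quality of boot_random.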

463	2.4.1.  Tokens for Observe

465	   In the case of Observe [RFC7641], a request will be answered with
466	   multiple notifications and it is important to continue keeping track
467	   of the Token that was used for the request - its lifetime will end
468	   much later.  Upon establishing an Observe relationship, the Token is
469	   registered at the server.  Hence, the client's use of that specific
470	   Token is now limited to controlling the Observation relationship.  A
471	   client can use it to cancel the relationship, which frees the Token
472	   upon success (i.e., the message with an Observe Option with the value
473	   set to 'deregister' (1) is confirmed with a response; see
474	   Section 3.6 of [RFC7641]).  However, the client might never receive
475	   the response due to a temporary network outage or, worse, a server
476	   crash.  Although a network outage will also affect notifications so
477	   that the Observe garbage collection could apply, the server might
478	   simply happen not to send CON notifications during that time.
479	   Alternative Observe lifetime models such as Stubbornness(tm) might
480	   also keep relationships alive for longer periods.

482	   Thus, it is best to carefully choose the Token value used with
483	   Observe requests.  (The empty value will rarely be applicable.)  One
484	   option is to assign and re-use dedicated Tokens for each Observe
485	   relationship the client will establish.  The choice of Token values
486	   is also critical in NoSec mode, to limit the effectiveness of
487	   spoofing attacks.  Here, the recommendation is to use randomized
488	   Tokens with a length of at least four bytes (see Section 5.3.1 of
489	   [RFC7252]).  Thus, dedicated ranges within the 8-byte Token space
490	   should be used when in NoSec mode.  This also solves the problem of
491	   mismatching notifications after a client crash/restart.

493	   When the client wishes to reinforce its interest in a resource, maybe
494	   not really being sure whether the server has forgotten it or not, the
495	   Token value allocated to the Observe relationship is used to re-
496	   register that observation (see Section 3.3.1 of [RFC7641] for
497	   details): If the server is still aware of the relationship (an entry
498	   with a matching endpoint and token is already present in its list of
499	   observers for the resource), it will not add a new relationship but
500	   will replace or update the existing one (Section 4.1 of [RFC7641]).

502	   If not, it will simply establish a new registration, which of course
503	   also uses the Token value.
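
   On the server side, the re-registration behavior can be sketched as
   a lookup by endpoint and Token before a new entry is added.  The
   per-resource observer array, its size, and the 32-bit endpoint
   handle are assumptions for illustration.

```c
#include <stdint.h>
#include <string.h>

#define MAX_OBSERVERS 4   /* assumed per-resource limit */

struct observer {
    uint32_t peer;        /* assumed handle for the client endpoint */
    uint8_t  token[8];
    uint8_t  tkl;
    uint8_t  used;
};

struct resource {
    struct observer obs[MAX_OBSERVERS];
};

/* Register an observer, or refresh the entry that matches endpoint
 * and Token.  Returns the slot index, or -1 if the request cannot be
 * accommodated. */
int observe_register(struct resource *r, uint32_t peer,
                     const uint8_t *token, uint8_t tkl)
{
    int free_slot = -1;

    if (tkl > 8)
        return -1;

    for (int i = 0; i < MAX_OBSERVERS; i++) {
        struct observer *o = &r->obs[i];

        if (!o->used) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (o->peer == peer && o->tkl == tkl &&
            (tkl == 0 || memcmp(o->token, token, tkl) == 0))
            return i;     /* known relationship: update in place */
    }

    if (free_slot < 0)
        return -1;        /* list is full */

    r->obs[free_slot].peer = peer;
    r->obs[free_slot].tkl = tkl;
    if (tkl)
        memcpy(r->obs[free_slot].token, token, tkl);
    r->obs[free_slot].used = 1;
    return free_slot;
}
```

   A client that re-registers with its dedicated Token therefore never
   consumes a second slot, whereas a new Token for the same resource
   does.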

505	   If the client sends an Observe request for the same resource with a
506	   new Token, this is not a protocol violation, because the
507	   specification allows the client to observe the same resource in a
508	   different Observe relationship if the cache-key is different (e.g.,
509	   requesting a different Content-Format).  If the cache-key is not
510	   different, though, an additional Observe relationship just wastes the
511	   server's resources, and is therefore not allowed; the server might
512	   rely on this for its housekeeping.

514	2.4.2.  Tokens for Blockwise Transfers

516	   In general, blockwise transfers are independent of the Token and
517	   are correlated through client endpoint address and server address and
518	   resource path (destination URI).  Thus, each block may be transferred
519	   using a different Token.  Still it can be beneficial to use the same
520	   Token (it is freed upon reception of a response block) for all
521	   blocks, e.g., to easily route received blocks to the same response
522	   handler.

524	   When Block2 is combined with Observe, notifications only carry the
525	   first block and it is up to the client to retrieve the remaining
526	   ones.  These GET requests do not carry the Observe option and need to
527	   use a different Token, since the Token from the notification is still
528	   in use.

530	2.5.  Transmission States

532	   CoAP endpoints must keep transmission state to manage open requests,
533	   to handle the different response modes, and to implement reliable
534	   delivery at the message layer.  The following finite state machines
535	   (FSMs) model the transmissions of a CoAP exchange at the request/
536	   response layer and the message layer.  These layers are linked
537	   through actions.  The M_CMD() action triggers a corresponding
538	   transition at the message layer and the RR_EVT() action triggers a
539	   transition at the request/response layer.  The FSMs also use guard
540	   conditions to distinguish between information that is only available
541	   through the other layer (e.g., whether a request was sent using a CON
542	   or NON message).

544	2.5.1.  Request/Response Layer

546	   Figure 3 depicts the two states at the request/response layer of a
547	   CoAP client.  When a request is issued, a "reliable_send" or
548	   "unreliable_send" is triggered at the message layer.  The WAITING
549	   state can be left through three transitions: Either the client
550	   cancels the request and triggers cancellation of a CON transmission
551	   at the message layer, the client receives a failure event from the
552	   message layer, or it receives a receive event containing a response.

554	            +------------CANCEL-------------------------------+
555	            |        / M_CMD(cancel)                          |
556	            |                                                 V
557	            |                                              +------+
558	        +-------+ -------RR_EVT(fail)--------------------> |      |
559	        |WAITING|                                          | IDLE |
560	        +-------+ -------RR_EVT(rx)[is Response]---------> |      |
561	            ^                / M_CMD(accept)               +------+
562	            |                                                 |
563	            +--------------------REQUEST----------------------+
564	                       / M_CMD((un)reliable_send)

566	             Figure 3: CoAP Client Request/Response Layer FSM
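
	   The client FSM of Figure 3 is small enough to be written as a single
	   transition function.  The following is only a sketch: the type and
	   function names are assumptions, and M_SEND stands for either
	   M_CMD(reliable_send) or M_CMD(unreliable_send).

```c
#include <assert.h>
#include <stddef.h>

typedef enum { RR_IDLE, RR_WAITING } rr_state_t;
typedef enum { EV_REQUEST, EV_CANCEL, EV_FAIL, EV_RX_RESPONSE } rr_event_t;
typedef enum { M_NONE, M_SEND, M_CANCEL, M_ACCEPT } m_cmd_t;

/* Apply one event to the client FSM.  Returns the message-layer command to
 * issue; events that are not valid in the current state are ignored. */
m_cmd_t rr_client_step(rr_state_t *state, rr_event_t ev)
{
    assert(state != NULL);
    if (*state == RR_IDLE) {
        if (ev == EV_REQUEST) {
            *state = RR_WAITING;
            return M_SEND;
        }
        return M_NONE;
    }
    /* WAITING: all three exits lead back to IDLE */
    switch (ev) {
    case EV_CANCEL:      *state = RR_IDLE; return M_CANCEL;
    case EV_FAIL:        *state = RR_IDLE; return M_NONE;
    case EV_RX_RESPONSE: *state = RR_IDLE; return M_ACCEPT;
    default:             return M_NONE;
    }
}
```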

568	   A server resource can decide at the request/response layer whether to
569	   respond with a piggy-backed or a separate response.  Thus, there are
570	   two busy states in Figure 4, SERVING and SEPARATE.  An incoming
571	   receive event with a NON request directly triggers the transition to
572	   the SEPARATE state.

574	        +--------+ <----------RR_EVT(rx)[is NON]---------- +------+
575	        |SEPARATE|                                         |      |
576	        +--------+ ----------------RESPONSE--------------> | IDLE |
577	            ^            / M_CMD((un)reliable_send)        |      |
578	            |                                        +---> +------+
579	            |EMPTY_ACK                               |         |
580	            |/M_CMD(accept)                          |         |
581	            |                                        |         |
582	            |                                        |         |
583	        +--------+                                   |         |
584	        |SERVING | --------------RESPONSE------------+         |
585	        +--------+          / M_CMD(accept)                    |
586	            ^                                                  |
587	            +------------------------RR_EVT(rx)[is CON]--------+

589	             Figure 4: CoAP Server Request/Response Layer FSM

591	2.5.2.  Message Layer

593	   Figure 5 shows the different states of a CoAP endpoint per message
594	   exchange.  Besides the linking action RR_EVT(), the message layer has
595	   a TX action to send a message.  For sending and receiving NONs, the
596	   endpoint remains in its CLOSED state.  When sending a CON, the
597	   endpoint remains in RELIABLE_TX and keeps retransmitting until the
598	   transmission times out, it receives a matching RST, the request/
599	   response layer cancels the transmission, or the endpoint receives an
600	   implicit acknowledgement through a matching NON or CON.  Whenever the
601	   endpoint receives a CON, it transitions into the ACK_PENDING state,
602	   which can be left by sending the corresponding ACK.

604	   +-----------+ <-------M_CMD(reliable_send)-----+
605	   |           |            / TX(con)              \
606	   |           |                                +--------------+
607	   |           | ---TIMEOUT(RETX_WINDOW)------> |              |
608	   |RELIABLE_TX|     / RR_EVT(fail)             |              |
609	   |           | ---------------------RX_RST--> |              | <----+
610	   |           |               / RR_EVT(fail)   |              |      |
611	   +-----------+ ----M_CMD(cancel)------------> |    CLOSED    |      |
612	    ^  |  |  \  \                               |              | --+  |
613	    |  |  |   \  +-------------------RX_ACK---> |              |   |  |
614	    +*1+  |    \                / RR_EVT(rx)    |              |   |  |
615	          |     +----RX_NON-------------------> +--------------+   |  |
616	          |       / RR_EVT(rx)                  ^ ^ ^ ^  | | | |   |  |
617	          |                                     | | | |  | | | |   |  |
618	          |                                     | | | +*2+ | | |   |  |
619	          |                                     | | +--*3--+ | |   |  |
620	          |                                     | +----*4----+ |   |  |
621	          |                                     +------*5------+   |  |
622	          |                +---------------+                       |  |
623	          |                |  ACK_PENDING  | <--RX_CON-------------+  |
624	          +----RX_CON----> |               |  / RR_EVT(rx)            |
625	            / RR_EVT(rx)   +---------------+ ---------M_CMD(accept)---+
626	                                                        / TX(ack)

628	   *1: TIMEOUT(RETX_TIMEOUT) / TX(con)
629	   *2: M_CMD(unreliable_send) / TX(non)
630	   *3: RX_NON / RR_EVT(rx)
631	   *4: RX_RST / REMOVE_OBSERVER
632	   *5: RX_ACK

634	                     Figure 5: CoAP Message Layer FSM

636	   T.B.D.: (i) Rejecting messages (can be triggered at message and
637	   request/response layer). (ii) ACKs can also be triggered at both
638	   layers.
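
	   The transitions of Figure 5 can be sketched as a nested switch over
	   state and event.  All identifiers below are assumptions chosen for
	   the example; the open points listed above (rejecting messages, ACKs
	   triggered at both layers) are not covered.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ML_CLOSED, ML_RELIABLE_TX, ML_ACK_PENDING } ml_state_t;

typedef enum {
    ML_CMD_RELIABLE_SEND, ML_CMD_UNRELIABLE_SEND, ML_CMD_CANCEL, ML_CMD_ACCEPT,
    ML_TIMEOUT_RETX, ML_TIMEOUT_WINDOW,
    ML_RX_CON, ML_RX_NON, ML_RX_ACK, ML_RX_RST
} ml_event_t;

typedef enum {
    ACT_NONE, ACT_TX_CON, ACT_TX_NON, ACT_TX_ACK,
    ACT_RR_EVT_RX, ACT_RR_EVT_FAIL, ACT_REMOVE_OBSERVER
} ml_action_t;

/* Apply one event to the per-exchange message-layer state and return the
 * action to perform.  Events without a transition in Figure 5 are ignored. */
ml_action_t ml_step(ml_state_t *st, ml_event_t ev)
{
    assert(st != NULL);
    switch (*st) {
    case ML_CLOSED:
        switch (ev) {
        case ML_CMD_RELIABLE_SEND:   *st = ML_RELIABLE_TX; return ACT_TX_CON;
        case ML_CMD_UNRELIABLE_SEND: return ACT_TX_NON;           /* *2 */
        case ML_RX_NON:              return ACT_RR_EVT_RX;        /* *3 */
        case ML_RX_RST:              return ACT_REMOVE_OBSERVER;  /* *4 */
        case ML_RX_CON:              *st = ML_ACK_PENDING; return ACT_RR_EVT_RX;
        default:                     return ACT_NONE;             /* *5 etc. */
        }
    case ML_RELIABLE_TX:
        switch (ev) {
        case ML_TIMEOUT_RETX:   return ACT_TX_CON;                /* *1 */
        case ML_TIMEOUT_WINDOW: /* retransmission window elapsed */
        case ML_RX_RST:         *st = ML_CLOSED; return ACT_RR_EVT_FAIL;
        case ML_CMD_CANCEL:     *st = ML_CLOSED; return ACT_NONE;
        case ML_RX_ACK:         /* explicit acknowledgement */
        case ML_RX_NON:         *st = ML_CLOSED; return ACT_RR_EVT_RX;
        case ML_RX_CON:         *st = ML_ACK_PENDING; return ACT_RR_EVT_RX;
        default:                return ACT_NONE;
        }
    case ML_ACK_PENDING:
        if (ev == ML_CMD_ACCEPT) {
            *st = ML_CLOSED;
            return ACT_TX_ACK;
        }
        return ACT_NONE;
    }
    return ACT_NONE;
}
```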

640	2.6.  Out-of-band Information

642	   The CoAP implementation can also leverage out-of-band information,
643	   which might trigger some of the transitions shown in Section 2.5.
644	   In particular, ICMP messages can inform about unreachable remote
645	   endpoints or whole network outages.  This information can be used to
646	   pause or cancel ongoing transmissions to conserve energy.  Providing
647	   ICMP information to the CoAP implementation is easier in constrained
648	   environments, where developers usually can adapt the underlying OS
649	   (or firmware).  This is not the case on general purpose platforms
650	   that have full-fledged OSes and make use of high-level programming
651	   frameworks.

653	   The most important ICMP messages are host, network, port, or protocol
654	   unreachable errors.  After appropriate vetting (cf.  [RFC5927]), they
655	   should cause the cancellation of ongoing CON transmissions and
656	   clearing (or deferral) of Observe relationships.  Requests to this
657	   destination should be paused for a sensible interval.  In addition,
658	   the device could indicate this error through a notification to a
659	   management endpoint or external status indicator, since the cause
660	   could be a misconfiguration or general unavailability of the required
661	   service.  Problems reported through the Parameter Problem message are
662	   usually caused by a similar fundamental problem.

664	   The CoAP specification recommends ignoring Source Quench and Time
665	   Exceeded ICMP messages, though.  Source Quench messages were
666	   originally intended to tell the sender to reduce its packet rate.
667	   However, this mechanism has been deprecated by [RFC6633].
668	   CoAP also comes with its own congestion control mechanism, which is
669	   already designed conservatively.  One advanced mechanism that can be
670	   employed for better network utilization is CoCoA
671	   [I-D.ietf-core-cocoa].  Time Exceeded messages often occur during
672	   transient routing loops (unless they are caused by a too small
673	   initial Hop Limit value).
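
	   The handling described in this section can be condensed into a small
	   classification function.  The sketch below assumes ICMPv4 type and
	   code numbers (ICMPv6 uses different values and has no Source Quench);
	   the action names are hypothetical, and the vetting of the ICMP
	   message is assumed to have happened before the call.

```c
#include <assert.h>

typedef enum {
    ICMP_IGNORE,              /* take no action at the CoAP layer */
    ICMP_CANCEL_AND_BACK_OFF  /* cancel CONs, clear or defer Observe, pause */
} icmp_action_t;

icmp_action_t coap_icmp_action(unsigned type, unsigned code)
{
    assert(type <= 255 && code <= 255);  /* both fields are single octets */
    switch (type) {
    case 3:   /* Destination Unreachable */
        /* codes 0..3: network, host, protocol, port unreachable */
        return code <= 3 ? ICMP_CANCEL_AND_BACK_OFF : ICMP_IGNORE;
    case 12:  /* Parameter Problem: usually a similar fundamental problem */
        return ICMP_CANCEL_AND_BACK_OFF;
    case 4:   /* Source Quench: deprecated, ignored */
    case 11:  /* Time Exceeded: often a transient routing loop, ignored */
    default:
        return ICMP_IGNORE;
    }
}
```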

675	2.7.  Programming Model

677	   The event-driven approach, which is common in event-loop-based
678	   firmware, has also proven very efficient for embedded operating
679	   systems [TinyOS], [Contiki].  Note that an OS is not necessarily
680	   required and a traditional firmware approach can suffice for Class 1
681	   devices.  Event-driven systems use split-phase operations (i.e.,
682	   there are no blocking functions, but functions return and an event
683	   handler is called once a long-lasting operation completes) to enable
684	   cooperative multi-threading with a single stack.

686	   Bringing a Web transfer protocol to constrained environments does not
687	   only change the networking of the corresponding systems, but also the
688	   programming model.  The complexity of event-driven systems can be
689	   hidden through APIs that resemble classic RESTful Web service
690	   implementations.

692	2.7.1.  Client

694	   An API for asynchronous requests with response handler functions goes
695	   hand-in-hand with the event-driven approach.  Synchronous requests
696	   with a blocking send function can facilitate applications that
697	   require strictly ordered, sequential request execution (e.g., to
698	   control a physical process) or other checkpointing (e.g., starting
699	   operation only after registration with the resource directory was
700	   successful).  However, this can also be solved by triggering the next
701	   operation in the response handlers.  Furthermore, as mentioned in
702	   Section 2.1, it is more likely that complex control flow is done by
703	   more powerful devices and Class 1 devices predominantly run a CoAP
704	   server (which might include a minimal client to communicate with a
705	   resource directory).

707	2.7.2.  Server

709	   On CoAP servers, the event-driven nature can be hidden through
710	   resource handler abstractions as known from traditional REST
711	   frameworks.  The following types of RESTful resources have proven
712	   useful to provide an intuitive API on constrained event-driven
713	   systems:

715	   NORMAL  A normal resource defined by a static Uri-Path and an
716	      associated resource handler function.  Allowed methods could
717	      already be filtered by the implementation based on flags.  This is
718	      the basis for all other resource types.

720	   PARENT  A parent resource manages several sub-resources under a given
721	      base path by programmatically evaluating the Uri-Path.  Defining a
722	      URI template (see [RFC6570]) would be a convenient way to pre-
723	      parse arguments given in the Uri-Path.

725	   PERIODIC  A resource that has an additional handler function that is
726	      triggered periodically by the CoAP implementation with a resource-
727	      specific interval.  It can be used to sample a sensor or perform
728	      similar periodic updates of its state.  Usually, a periodic
729	      resource is observable and sends the notifications by triggering
730	      its normal resource handler from the periodic handler.  These
731	      periodic tasks are quite common for sensor nodes, thus it makes
732	      sense to provide this functionality in the CoAP implementation and
733	      avoid redundant code in every resource.

735	   EVENT  An event resource is similar to a periodic resource, only
736	      that the second handler is called by an irregular event such as a
737	      button press.

739	   SEPARATE  Separate responses are usually used when handling a request
740	      takes more time, e.g., due to a slow sensor or UART-based
741	      subsystems.  To not fully block the system during this time, the
742	      handler should also employ split-phase execution: The resource
743	      handler returns as soon as possible and an event handler resumes
744	      responding when the result is ready.  The separate resource type
745	      can abstract from the split-phase operation and take care of
746	      temporarily storing the request information that is required later
747	      in the result handler to send the response (e.g., source address
748	      and Token).
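
	   As an illustration of the NORMAL resource type, the following sketch
	   shows a static resource table with method flags that are checked
	   before the handler is called.  The structure, the flag values, and
	   the example resource "sensors/temp" are assumptions made for this
	   example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { M_GET = 1, M_POST = 2, M_PUT = 4, M_DELETE = 8 };  /* method flags */

/* CoAP codes are written c.dd: 3-bit class, 5-bit detail */
#define COAP_CODE(cls, detail) (((cls) << 5) | (detail))

typedef int (*handler_fn)(unsigned method);  /* returns a response code */

typedef struct {
    const char *path;     /* static Uri-Path */
    unsigned    methods;  /* allowed methods, OR of the M_* flags */
    handler_fn  handler;
} resource_t;

static int temp_handler(unsigned method)
{
    (void)method;
    return COAP_CODE(2, 5);               /* 2.05 Content */
}

static const resource_t resources[] = {
    { "sensors/temp", M_GET, temp_handler },
};

/* Look up the resource, filter by method flags, then call the handler. */
int dispatch(const char *path, unsigned method)
{
    assert(path != NULL);
    for (size_t i = 0; i < sizeof resources / sizeof resources[0]; i++) {
        if (strcmp(resources[i].path, path) != 0)
            continue;
        if (!(resources[i].methods & method))
            return COAP_CODE(4, 5);       /* 4.05 Method Not Allowed */
        return resources[i].handler(method);
    }
    return COAP_CODE(4, 4);               /* 4.04 Not Found */
}
```

	   The other resource types could extend resource_t with a second
	   handler and, for PERIODIC, an interval.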

750	3.  Optimizations

752	3.1.  Message Buffers

754	   The cooperative multi-threading of an event loop system allows
755	   memory usage to be optimized through in-place processing and reuse of
756	   buffers, in particular the IP buffer provided by the OS or firmware.

758	   CoAP servers can significantly benefit from in-place processing, as
759	   they can create responses directly in the incoming IP buffer.  Note
760	   that an embedded OS usually only has a single buffer for incoming and
761	   outgoing IP packets.  The first few bytes of the basic header are
762	   usually parsed into an internal data structure and can be overwritten
763	   without harm.  Thus, empty ACKs and RST messages can promptly be
764	   assembled and sent using the IP buffer.  Also, when a CoAP server
765	   only sends piggy-backed or Non-confirmable responses, no additional
766	   buffer is required at the application layer.  This, however, requires
767	   careful timing so that no incoming data is overwritten before it has
768	   been processed.  Because of cooperative multi-threading, this
769	   requirement is relaxed, though.  Once the message is sent, the IP
770	   buffer can accept new messages again.  This does not work for
771	   Confirmable messages, however.  They need to be stored for
772	   retransmission and would block any further IP communication.
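
	   Assembling an empty ACK or RST in place only touches the first two
	   bytes of the fixed CoAP header, because the Message ID to be echoed
	   is already in position.  A minimal sketch, assuming that buf points
	   to the CoAP message inside the IP buffer (the function name is
	   hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { COAP_CON = 0, COAP_NON = 1, COAP_ACK = 2, COAP_RST = 3 };

/* Overwrite the received message in buf with an Empty message of the given
 * type (ACK or RST) that echoes the Message ID.  Returns the new length. */
size_t coap_make_empty_in_place(uint8_t *buf, size_t len, unsigned type)
{
    assert(buf != NULL && len >= 4);      /* fixed header is 4 bytes */
    buf[0] = (uint8_t)((1u << 6) | ((type & 3u) << 4)); /* Ver=1, T, TKL=0 */
    buf[1] = 0;                           /* Code 0.00 (Empty) */
    /* buf[2] and buf[3] already hold the Message ID: keep them */
    return 4;                             /* no Token, options, or payload */
}
```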

774	   Depending on the number of requests that can be handled in parallel,
775	   an implementation might create a stub response filled with any
776	   information that has to be copied from the original request to the
777	   separate response, especially the Token.  The drawback of this
778	   technique is that the server must be prepared to receive
779	   retransmissions of the previous (Confirmable) request to which a new
780	   acknowledgement must be generated.  If memory is an issue, a single
781	   buffer can be used for both tasks: Only the message type and code
782	   must be updated; changing the Message ID is optional.  Once the
783	   resource representation is known, it is added as new payload at the
784	   end of the stub response.  Acknowledgements can still be sent as
785	   described before as long as no additional options are required to
786	   describe the payload.

788	3.2.  Retransmissions

790	   CoAP's reliable transmissions require the before-mentioned
791	   retransmission buffers.  Messages, such as the requests of a client,
792	   should be stored in serialized form.  For servers, retransmissions
793	   apply for Confirmable separate responses and Confirmable
794	   notifications [RFC7641].  As separate responses stem from long-
795	   lasting resource handlers, the response should be stored for
796	   retransmission instead of re-dispatching a stored request (which
797	   would allow for updating the representation).  For Confirmable
798	   notifications, see Section 3.3, as simply storing the response
799	   can break the concept of eventual consistency.

801	   String payloads such as JSON require a buffer to print to.  By
802	   splitting the retransmission buffer into a header part and a payload
803	   part, it can be reused for this purpose: first, the payload is
804	   generated; then, the CoAP message is stored by serializing it into
805	   the same memory.  Thus, providing a retransmission buffer for any
806	   message type can save the need for a separate application buffer.
807	   This, however, requires an estimate of the maximum expected header
808	   size to split the buffer and a memmove to concatenate the two parts.
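
	   The buffer split can be sketched as follows.  BUF_SIZE, MAX_HDR_SIZE,
	   and the function name are assumptions; hdr and hdr_len stand in for
	   the output of a real CoAP serializer (header, Token, and options),
	   and the JSON payload is only an example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BUF_SIZE     64
#define MAX_HDR_SIZE 16  /* assumed bound for header, Token, options, marker */

/* Build a message in one buffer: payload first, header afterwards.
 * Returns the total length of the serialized message. */
size_t build_in_one_buffer(uint8_t buf[BUF_SIZE], const uint8_t *hdr,
                           size_t hdr_len, int reading)
{
    assert(buf != NULL && hdr != NULL && hdr_len + 1 <= MAX_HDR_SIZE);

    /* 1. Print the payload into the rear part of the buffer. */
    int n = snprintf((char *)buf + MAX_HDR_SIZE, BUF_SIZE - MAX_HDR_SIZE,
                     "{\"t\":%d}", reading);
    assert(n > 0 && n < BUF_SIZE - MAX_HDR_SIZE);

    /* 2. Serialize the header at the front, followed by the payload marker. */
    memcpy(buf, hdr, hdr_len);
    buf[hdr_len] = 0xFF;

    /* 3. Close the gap between marker and payload (regions may overlap). */
    memmove(buf + hdr_len + 1, buf + MAX_HDR_SIZE, (size_t)n);
    return hdr_len + 1 + (size_t)n;
}
```

	   If the header estimate is too small, the assertion fails instead of
	   the header overwriting the payload; a real implementation would
	   handle this case explicitly.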

810	   For platforms that disable clock tick interrupts in sleep states, the
811	   application must take into consideration the clock deviation that
812	   occurs during sleep (or ensure that it remains in idle state until
813	   the message has been acknowledged or the maximum number of
814	   retransmissions is reached).  Since CoAP allows up to four
815	   retransmissions with a binary exponential back-off, it could take up
816	   to 45 seconds until the send operation is complete.  Even in idle
817	   state, this means substantial energy consumption for low-power nodes.
818	   Implementers therefore might choose a two-step strategy: First, do
819	   one or two retransmissions and then, in the later phases of back-off,
820	   go to sleep until the next retransmission is due.  In the meantime,
821	   the node could check for new messages including the acknowledgement
822	   for any Confirmable message to send.
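
	   The 45 seconds follow from the default parameters of [RFC7252]
	   (ACK_TIMEOUT of 2 seconds, ACK_RANDOM_FACTOR of 1.5, MAX_RETRANSMIT
	   of 4): the initial timeout is at most 3 seconds and doubles with each
	   retransmission.  The sketch below computes the latest point in time
	   at which each retransmission is sent, which is also when a sleeping
	   node has to wake up again; the function name is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define ACK_TIMEOUT_MS        2000u
#define ACK_RANDOM_FACTOR_X10 15u   /* 1.5, scaled by 10 to avoid floats */
#define MAX_RETRANSMIT        4u

/* Latest time, in ms after the first transmission, at which retransmission
 * number k (1..MAX_RETRANSMIT) is sent. */
uint32_t worst_case_retx_time_ms(unsigned k)
{
    assert(k >= 1 && k <= MAX_RETRANSMIT);
    uint32_t initial = ACK_TIMEOUT_MS * ACK_RANDOM_FACTOR_X10 / 10u;
    /* timeouts double each time: initial * (1 + 2 + ... + 2^(k-1)) */
    return initial * ((1u << k) - 1u);
}
```

	   With the defaults this yields 3, 9, 21, and 45 seconds for the four
	   retransmissions.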

824	3.3.  Observable Resources

826	   For each observer, the server needs to store at least address, port,
827	   token, and the last outgoing message ID.  The latter is needed to
828	   match incoming RST messages and cancel the observe relationship.

830	   It is favorable to have one retransmission buffer per observable
831	   resource that is shared among all observers.  Each notification is
832	   serialized once into this buffer and only address, port, and token
833	   are changed when iterating over the observer list (note that
834	   different token lengths might require realignment).  The advantage
835	   becomes clear for Confirmable notifications: Instead of one
836	   retransmission buffer per observer, only one buffer and only
837	   individual retransmission counters and timers in the list entry need
838	   to be stored.  When the notifications can be sent fast enough, even a
839	   single timer would suffice.  Furthermore, per-resource buffers
840	   simplify the update with a new resource state during open deliveries.

842	3.4.  Blockwise Transfers

844	   Blockwise transfers have the main purpose of providing fragmentation
845	   at the application layer, where partial information can be processed.
846	   This is not possible at lower layers such as 6LoWPAN, as only
847	   assembled packets can be passed up the stack.  While [RFC7959] also
848	   anticipates atomic handling of blocks, i.e., only fully received CoAP
849	   messages, this is not possible on Class 1 devices.

851	   When receiving a blockwise transfer, each block is usually passed to
852	   a handler function that for instance performs stream processing or
853	   writes the blocks to external memory such as flash.  Although there
854	   are no restrictions in [RFC7959], it is beneficial for Class 1
855	   devices to only allow ordered transmission of blocks.  Otherwise on-
856	   the-fly processing would not be possible.

858	   When sending a blockwise transfer out of dynamically generated
859	   information, Class 1 devices usually do not have sufficient memory to
860	   print the full message into a buffer, and slice and send it in a
861	   second step.  For instance, if the CoRE Link Format at /.well-known/
862	   core is dynamically generated, a generator function is required that
863	   generates slices of a large string at a specific offset and length (a
864	   'sonprintf()').  This functionality is needed repeatedly and
865	   should be included in a library.
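
	   One possible shape of such a generator is a writer that walks through
	   the full string but only stores the bytes that fall into the
	   requested window.  The names slice_t, slice_puts(), and links_block()
	   are hypothetical; the link attributes of a real CoRE Link Format
	   listing are omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    char  *out;      /* block buffer */
    size_t offset;   /* first byte of the full string wanted in this block */
    size_t size;     /* block size */
    size_t pos;      /* bytes of the full string generated so far */
    size_t written;  /* bytes stored in out */
} slice_t;

/* Append s to the virtual full string; keep only the part in the window. */
static void slice_puts(slice_t *sl, const char *s)
{
    for (; *s; s++, sl->pos++) {
        if (sl->pos >= sl->offset && sl->written < sl->size)
            sl->out[sl->written++] = *s;
    }
}

/* Generate one block of a link listing.  Returns the number of bytes
 * written; *more is set if the full string extends beyond this block. */
size_t links_block(const char *const *paths, size_t n, size_t offset,
                   char *out, size_t size, int *more)
{
    assert(paths != NULL && out != NULL && more != NULL);
    slice_t sl = { out, offset, size, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (i > 0)
            slice_puts(&sl, ",");
        slice_puts(&sl, "<");
        slice_puts(&sl, paths[i]);
        slice_puts(&sl, ">");
    }
    *more = sl.pos > offset + sl.written;
    return sl.written;
}
```

	   The full string is regenerated for every block, which trades
	   processing time for not having to buffer the complete
	   representation.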

867	3.4.1.  Generic Proxying of Block Messages

869	   Proxies cannot ignore the Block options by specification, because the
870	   options Block1 and Block2 are not safe-to-forward.  The rationale
871	   behind this design decision is that servers might not be able to
872	   distinguish blocks originating from different senders once they have
873	   been forwarded by a CoAP proxy.  For atomic operations where all
874	   blocks are assembled before actually executing the desired operation,
875	   this could lead to inconsistent state on the server side.

877	   To ensure that this does not happen, a proxy can add the Request-Tag
878	   option (see [I-D.ietf-core-echo-request-tag]) containing data that
879	   uniquely identifies the originating endpoint in the proxy namespace.

881	3.4.2.  Atomic Blockwise Operations

883	   When an implementation needs to assemble blocks from block-wise
884	   transfers, applications need to create an identifier to group
885	   messages that belong together.  This "Block Key" at least contains:

887	   o  The source endpoint (e.g., IP address and port in the UDP case),

889	   o  the destination endpoint,

891	   o  the Cache-Key (as updated in [RFC7252]), and

893	   o  all options that are proxy unsafe and not explicitly described as
894	      safe for block-wise assembly.

896	   The only known options safe for block-wise assembly are the options
897	   Block1 and Block2 [RFC7959].

899	   For the Block1 phase, the request payload is excluded from the
900	   identifier generation as it is just being assembled.

902	   If a message is received that is not the start of a block-wise
903	   operation and has a Block Key that is not known, and the
904	   implementation needs to act atomically on a request body, it must
905	   answer 4.08 (Request Entity Incomplete).

907	   Conversely, clients should be aware that requests whose Block Key
908	   matches can be interpreted by the server atomically.  This especially
909	   affects proxies (see Section 3.4.1).

911	3.5.  Deduplication with Sequential MIDs

913	   CoAP's duplicate rejection functionality can be straightforwardly
914	   implemented in a CoAP endpoint by storing, for each remote CoAP
915	   endpoint ("peer") that it communicates with, a list of recently
916	   received CoAP Message IDs (MIDs) along with some timing information.
917	   A CoAP message from a peer with a MID that is in the list for that
918	   peer can simply be discarded.

920	   The timing information in the list can then be used to time out
921	   entries that are older than the _expected extent of the re-ordering_,
922	   an upper bound for which can be estimated by adding the _potential
923	   retransmission window_ ([RFC7252] section "Reliable Messages") and
924	   the time packets can stay alive in the network.

926	   Such a straightforward implementation is suitable in case other CoAP
927	   endpoints generate random MIDs.  However, this storage method may
928	   consume substantial RAM in specific cases, such as:

930	   o  many clients are making periodic, non-idempotent requests to a
931	      single CoAP server;

933	   o  one client makes periodic requests to a large number of CoAP
934	      servers and/or requests a large number of resources; where servers
935	      happen to mostly generate separate CoAP responses (not piggy-
936	      backed).

938	   For example, consider the first case where the expected extent of re-
939	   ordering is 50 seconds, and N clients are sending periodic POST
940	   requests to a single CoAP server during a period of high system
941	   activity, each on average sending one client request per second.  The
942	   server would need 100 * N bytes of RAM to store the MIDs only (50
943	   2-byte MIDs per client).  This amount of RAM may be significant on a
944	   RAM-constrained platform.  On a number of platforms, it may be easier
945	   to allocate some extra program memory (e.g., Flash or ROM) to the
946	   CoAP protocol handler process than to allocate extra RAM.  Therefore,
947	   one may try to reduce RAM usage of a CoAP implementation at the cost
948	   of additional program memory usage and implementation complexity.

950	   Some CoAP clients generate MID values by using a Message ID
951	   variable [RFC7252] that is incremented by one each time a new MID
952	   needs to be generated.  (After the maximum value 65535 it wraps back
953	   to 0.)  We call this behavior "sequential" MIDs.  One approach to
954	   reduce RAM use exploits the redundancy in sequential MIDs for a more
955	   efficient MID storage in CoAP servers.

957	   Naturally such an approach requires, in order to actually reduce RAM
958	   usage in an implementation, that a large part of the peers follow the
959	   sequential MID behavior.  To realize this optimization, the authors
960	   therefore RECOMMEND that CoAP endpoint implementers employ the
961	   "sequential MID" scheme if there are no reasons to prefer another
962	   scheme, such as randomly generated MID values.

964	   Security considerations might call for the use of
965	   (pseudo)randomized MIDs.  Note however that with truly randomly
966	   generated MIDs the probability of MID collision is rather high in use
967	   cases as mentioned before, following from the Birthday Paradox.  For
968	   example, in a sequence of 52 randomly drawn 16-bit values the
969	   probability of finding at least two identical values is about 2
970	   percent.
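
	   The 2 percent figure can be checked with a few lines of code.  The
	   function below is an illustrative helper (its name is an assumption)
	   that multiplies out the probability that all drawn values are
	   distinct.

```c
#include <assert.h>

/* Probability that n values drawn uniformly at random from `space`
 * possible values contain at least one duplicate. */
double collision_probability(unsigned n, double space)
{
    assert(space > 0.0 && (double)n <= space);
    double p_unique = 1.0;
    for (unsigned i = 0; i < n; i++)
        p_unique *= 1.0 - (double)i / space;  /* i values already taken */
    return 1.0 - p_unique;
}
```

	   For n = 52 and a space of 65536 values, the result is approximately
	   0.02.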

972	   From here on we consider efficient storage implementations for MIDs
973	   in CoAP endpoints that are optimized to store "sequential" MIDs.
974	   Because CoAP messages may be lost or arrive out of order, a solution
975	   has to take into account that the MIDs of received CoAP messages do
976	   not actually arrive in a sequential fashion.  Also, a peer might
977	   reset and lose the state of its MID counter(s).

979	   In addition, a peer may have a single Message ID variable used in
980	   messages to many CoAP endpoints it communicates with, which partly
981	   breaks sequentiality from the receiving CoAP endpoint's perspective.
982	   Finally, some peers might use randomly generated MID values.  Due to
983	   these specific conditions, existing sliding window bitfield
984	   implementations for storing received sequence numbers are
985	   typically not directly suitable for efficiently storing MIDs.

987	   Table 1 shows one example for a per-peer MID storage design: a table
988	   with a bitfield of a defined length _K_ per entry to store received
989	   MIDs (one per bit) that have a value in the range [MID_i + 1 , MID_i
990	   + K].

992	              +----------+----------------+-----------------+
993	              | MID base | K-bit bitfield | base time value |
994	              +----------+----------------+-----------------+
995	              | MID_0    | 010010101001   | t_0             |
996	              |          |                |                 |
997	              | MID_1    | 111101110111   | t_1             |
998	              |          |                |                 |
999	              | ... etc. |                |                 |
1000	              +----------+----------------+-----------------+

1002	         Table 1: A per-peer table for storing MIDs based on MID_i

1004	   The presence of a table row with base MID_i (regardless of the
1005	   bitfield values) indicates that a value MID_i has been received at a
1006	   time t_i.  Subsequently, each bitfield bit k (0...K-1) in a row i
1007	   corresponds to a received MID value of MID_i + k + 1.  If a bit k is
1008	   0, it means a message with corresponding MID has not yet been
1009	   received.  A bit 1 indicates such a message has been received already
1010	   at approximately time t_i.  With K=64, for example, two rows can in
1011	   the best case store up to 130 MID values using 20 bytes (a 2-byte
1012	   base and an 8-byte bitfield per row, not counting the time values),
1013	   as opposed to the 260 bytes needed for a non-sequential scheme.
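
	   A single table row with K = 64 can be sketched as follows.  The
	   struct and function names are assumptions; note that a C compiler
	   will usually pad this struct, so a packed layout would be needed to
	   actually reach 10 bytes per row plus the time value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MID_BITS 64  /* K */

typedef struct {
    uint16_t base;   /* MID_i, itself known to have been received */
    uint64_t bits;   /* bit k set: MID_i + k + 1 has been received */
    uint32_t time;   /* t_i */
} mid_row_t;

typedef enum { MID_NEW, MID_DUPLICATE, MID_OUT_OF_RANGE } mid_result_t;

/* Check mid against one row and record it if it is new and in range. */
mid_result_t mid_row_check_and_set(mid_row_t *row, uint16_t mid, uint32_t now)
{
    assert(row != NULL);
    uint16_t delta = (uint16_t)(mid - row->base);  /* wraps modulo 65536 */
    if (delta == 0)
        return MID_DUPLICATE;                      /* the base MID itself */
    if (delta > MID_BITS)
        return MID_OUT_OF_RANGE;                   /* belongs to another row */
    uint64_t bit = (uint64_t)1 << (delta - 1);
    if (row->bits & bit)
        return MID_DUPLICATE;
    row->bits |= bit;
    row->time = now;                               /* one time value per row */
    return MID_NEW;
}
```

	   Because the difference is computed modulo 65536, a row whose base is
	   close to 65535 also covers the MIDs that follow the wrap-around to 0.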

1015	   The time values t_i are used for removing rows from the table after a
1016	   preset timeout period, to keep the MID store small in size and enable
1017	   these MIDs to be safely re-used in future communications.  (Note that
1018	   the table only stores one time value per row, which therefore needs
1019	   to be updated on receipt of another MID that is stored as a single
1020	   bit in this row.  As a consequence of only storing one time value per
1021	   row, older MID entries typically time out later than with a simple
1022	   per-MID time value storage scheme.  The endpoint therefore needs to
1023	   ensure that this additional delay before MID entries are removed from
1024	   the table is much smaller than the time period after which a peer
1025	   starts to re-use MID values due to wrap-around of a peer's MID
1026	   variable.  One solution is to check that a value t_i in a table row
1027	   is still recent enough, before using the row and updating the value
1028	   t_i to current time.  If not recent enough, e.g. older than N
1029	   seconds, a new row with an empty bitfield is created.)  [Clearly,
1030	   these optimizations would benefit if the peer were much more
1031	   conservative about re-using MIDs than currently required in the
1032	   protocol specification.]

1034	   The optimization described is less efficient for storing randomized
1035	   MIDs that a CoAP endpoint may encounter from certain peers.  To solve
1036	   this, a storage algorithm may start in a simple MID storage mode,
1037	   first assuming that the peer produces non-sequential MIDs.  While
1038	   storing MIDs, a heuristic is then applied based on monitoring some
1039	   "hit rate", for example, the number of MIDs received that have a Most
1040	   Significant Byte equal to that of the previous MID divided by the
1041	   total number of MIDs received.  If the hit rate tends towards 1 over
1042	   a period of time, the MID store may decide that this particular CoAP
1043	   endpoint uses sequential MIDs and in response improve efficiency by
1044	   switching its mode to the bitfield based storage.
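
	   The hit-rate heuristic could be implemented along the following
	   lines.  The minimum sample count of 16 and the threshold of 0.9 are
	   arbitrary values chosen for illustration, as are all identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t prev_mid;
    uint16_t total;      /* MIDs seen after the first one */
    uint16_t hits;       /* MIDs whose MSB equals that of their predecessor */
    uint8_t  have_prev;
} mid_stats_t;

/* Update the per-peer statistics with a newly received MID. */
void mid_stats_update(mid_stats_t *s, uint16_t mid)
{
    assert(s != NULL);
    if (s->have_prev) {
        s->total++;
        if ((mid >> 8) == (s->prev_mid >> 8))
            s->hits++;
    }
    s->prev_mid = mid;
    s->have_prev = 1;
}

/* Returns 1 if the peer appears to use sequential MIDs, otherwise 0. */
int mid_stats_looks_sequential(const mid_stats_t *s)
{
    assert(s != NULL);
    /* hit rate >= 0.9, written without floating point */
    return s->total >= 16 && s->hits * 10u >= s->total * 9u;
}
```

	   A real implementation would also age the counters so that a peer
	   that changes its behavior is eventually reclassified.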

1046	4.  Alternative Configurations

1048	4.1.  Transmission Parameters

1050	   When a constrained network of CoAP nodes is not communicating over
1051	   the Internet, for instance because it is shielded by a proxy or a
1052	   closed deployment, alternative transmission parameters can be used.
1053	   Consequently, the derived time values provided in [RFC7252] section
1054	   4.8.2 will also need to be adjusted, since most implementations will
1055	   encode their absolute values.
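
	   One way to avoid hard-coded absolute values is to compute the derived
	   times from the base parameters.  The sketch below follows the
	   formulas of [RFC7252] section 4.8.2 for some of the derived values;
	   the struct and function names are assumptions, and PROCESSING_DELAY
	   is set to ACK_TIMEOUT as in [RFC7252].

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double   ack_timeout;        /* seconds */
    double   ack_random_factor;
    unsigned max_retransmit;
    double   max_latency;        /* seconds */
} coap_tx_params_t;

/* ACK_TIMEOUT * (2^MAX_RETRANSMIT - 1) * ACK_RANDOM_FACTOR */
double max_transmit_span(const coap_tx_params_t *p)
{
    assert(p != NULL && p->max_retransmit < 31);
    return p->ack_timeout * ((1u << p->max_retransmit) - 1u) *
           p->ack_random_factor;
}

/* ACK_TIMEOUT * (2^(MAX_RETRANSMIT + 1) - 1) * ACK_RANDOM_FACTOR */
double max_transmit_wait(const coap_tx_params_t *p)
{
    assert(p != NULL && p->max_retransmit < 31);
    return p->ack_timeout * ((1u << (p->max_retransmit + 1u)) - 1u) *
           p->ack_random_factor;
}

/* MAX_TRANSMIT_SPAN + 2 * MAX_LATENCY + PROCESSING_DELAY */
double exchange_lifetime(const coap_tx_params_t *p)
{
    return max_transmit_span(p) + 2.0 * p->max_latency + p->ack_timeout;
}

/* MAX_TRANSMIT_SPAN + MAX_LATENCY */
double non_lifetime(const coap_tx_params_t *p)
{
    return max_transmit_span(p) + p->max_latency;
}
```

	   With the default parameters (2 s, 1.5, 4, 100 s) these functions
	   return 45, 93, 247, and 145 seconds, respectively.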

1057	   Static adjustments require a fixed deployment with a constant number
1058	   or upper bound for the number of nodes, number of hops, and expected
1059	   concurrent transmissions.  Furthermore, the stability of the wireless
1060	   links should be evaluated.  ACK_TIMEOUT should be chosen above the
1061	   xx% percentile of the round-trip time distribution.
1062	   ACK_RANDOM_FACTOR depends on the number of nodes on the network.
1063	   MAX_RETRANSMIT should be chosen to suit the targeted
1064	   application.  A lower bound for LEISURE can be calculated as

1066	   lb_Leisure = S * G / R

1068	   where S is the estimated response size, G the group size, and R the
1069	   target data transfer rate (see [RFC7252] section 8.2).  NSTART and
1070	   PROBING_RATE depend on estimated network utilization.  If the main
1071	   cause of loss is weak links, higher values can be chosen.
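
   Because the derived time values follow mechanically from the
   transmission parameters, an implementation may prefer to compute them
   instead of hard-coding absolute values.  The sketch below uses the
   formulas of [RFC7252] section 4.8.2 and the LEISURE lower bound given
   above; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Field names mirror RFC 7252; the struct itself is illustrative. */
struct coap_tx_params {
    double   ack_timeout;        /* ACK_TIMEOUT, seconds       */
    double   ack_random_factor;  /* ACK_RANDOM_FACTOR          */
    unsigned max_retransmit;     /* MAX_RETRANSMIT             */
    double   max_latency;        /* MAX_LATENCY, seconds       */
    double   processing_delay;   /* PROCESSING_DELAY, seconds  */
};

/* MAX_TRANSMIT_SPAN = ACK_TIMEOUT * (2^MAX_RETRANSMIT - 1) * ACK_RANDOM_FACTOR */
double coap_max_transmit_span(const struct coap_tx_params *p)
{
    assert(p != NULL && p->max_retransmit < 31);
    return p->ack_timeout * (double)((1u << p->max_retransmit) - 1u)
           * p->ack_random_factor;
}

/* MAX_TRANSMIT_WAIT = ACK_TIMEOUT * (2^(MAX_RETRANSMIT+1) - 1) * ACK_RANDOM_FACTOR */
double coap_max_transmit_wait(const struct coap_tx_params *p)
{
    assert(p != NULL && p->max_retransmit < 31);
    return p->ack_timeout * (double)((1u << (p->max_retransmit + 1u)) - 1u)
           * p->ack_random_factor;
}

/* EXCHANGE_LIFETIME = MAX_TRANSMIT_SPAN + 2 * MAX_LATENCY + PROCESSING_DELAY */
double coap_exchange_lifetime(const struct coap_tx_params *p)
{
    return coap_max_transmit_span(p) + 2.0 * p->max_latency
           + p->processing_delay;
}

/* NON_LIFETIME = MAX_TRANSMIT_SPAN + MAX_LATENCY */
double coap_non_lifetime(const struct coap_tx_params *p)
{
    return coap_max_transmit_span(p) + p->max_latency;
}

/* lb_Leisure = S * G / R, with S in bytes and R in bytes per second. */
double coap_leisure_lower_bound(double s_bytes, double group_size,
                                double rate_bytes_per_s)
{
    assert(rate_bytes_per_s > 0.0);
    return s_bytes * group_size / rate_bytes_per_s;
}
```

   With the default parameters of [RFC7252] (ACK_TIMEOUT = 2 s,
   ACK_RANDOM_FACTOR = 1.5, MAX_RETRANSMIT = 4, MAX_LATENCY = 100 s,
   PROCESSING_DELAY = 2 s), these functions yield 45 s, 93 s, 247 s,
   and 145 s, respectively.  A 100-byte response, a group of 100 nodes,
   and a target rate of 1000 bytes per second give a LEISURE lower
   bound of 10 s.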

1073	   Dynamic adjustments will be performed by advanced congestion control
1074	   mechanisms such as [I-D.ietf-core-cocoa].  They are required if the
1075	   main cause for message loss is network or endpoint congestion.  Semi-
1076	   dynamic adjustments could be implemented by disseminating new static
1077	   transmission parameters to all nodes when the network configuration
1078	   changes (e.g., new nodes are added or long-lasting interference is
1079	   detected).

1081	4.2.  CoAP over IPv4

1083	   CoAP was designed for the properties of IPv6, which dominates in
1084	   constrained environments because of the 6LoWPAN adaptation layer
1085	   [RFC6282].  In particular, the size limitations of CoAP are tailored
1086	   to the minimal MTU of 1280 bytes.  Until the transition towards IPv6
1087	   converges, CoAP nodes might also communicate over IPv4, though.
1088	   Sections 4.2 and 4.6 of the base specification [RFC7252] already
1089	   provide guidance and implementation notes to handle the smaller
1090	   minimal MTUs of IPv4.

1092	   Another deployment issue in legacy IPv4 deployments is caused by
1093	   Network Address Translators (NATs).  The session timeouts are
1094	   unpredictable and NATs may close UDP sessions with timeouts as short
1095	   as 60 seconds.  This makes CoAP endpoints behind NATs practically
1096	   unreachable, even when they contact the remote endpoint with a public
1097	   IP address first.  Incorrect behavior may also arise when the NAT
1098	   session heuristic changes the external port between two successive
1099	   CoAP messages.  For the remote endpoint, this will look like two
1100	   different CoAP endpoints on the same IP address.  Such behavior can
1101	   be fatal for the resource directory registration interface.

1103	5.  Binding to specific lower-layer APIs

1105	   Implementing CoAP on specific lower-layer APIs appears to
1106	   consistently bring up certain less-known aspects of these APIs.  This
1107	   section is intended to alert implementers to such aspects.

1109	5.1.  Berkeley Socket Interface

1111	5.1.1.  Responding from the right address

1113	   In order for a client to recognize a reply (response or
1114	   acknowledgement) as coming from the endpoint to which the initiating
1115	   packet was addressed, the source IPv6 address of the reply needs to
1116	   match the destination address of the initiating packet.

1118	   Implementers that have previously written TCP-based applications are
1119	   used to binding their server sockets to INADDR_ANY.  Any TCP
1120	   connection received over such a socket is then more specifically
1121	   bound to the local address on which the TCP connection setup was
1122	   received; no programmer action is needed for this.

1124	   For stateless UDP sockets, more manual work is required.  Simply
1125	   receiving a packet from a UDP socket bound to INADDR_ANY loses the
1126	   information about the destination address; replying to it through the
1127	   same socket will use the default address established by the kernel.
1128	   Two strategies are available:

1130	   o  Only use sockets bound to a specific address (not INADDR_ANY).  A
1131	      system with multiple interfaces (or addresses) will thus need to
1132	      bind multiple sockets and send replies back on the same socket the
1133	      initiating packet was received on.

1135	   o  Use IPV6_RECVPKTINFO [RFC3542] to configure the socket, and mirror
1136	      back the IPV6_PKTINFO information for the reply (see also
1137	      Section 5.1.1.1).
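
   The second strategy could be sketched as follows.  The function names
   are hypothetical; the socket options and ancillary data types are
   those of [RFC3542].  The sketch assumes a platform such as Linux with
   glibc, where struct in6_pktinfo is only visible when _GNU_SOURCE is
   defined before the first include.

```c
#define _GNU_SOURCE            /* assumed: glibc, exposes struct in6_pktinfo */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

/* Control buffer large enough for one IPV6_PKTINFO item, suitably aligned. */
union pktinfo_ctl {
    char buf[CMSG_SPACE(sizeof(struct in6_pktinfo))];
    struct cmsghdr align;
};

/* Ask the kernel to attach IPV6_PKTINFO to every received datagram. */
int coap_enable_pktinfo(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on));
}

/* Receive a datagram; *pi receives its destination address and interface. */
ssize_t coap_recv_with_dst(int fd, void *buf, size_t len,
                           struct sockaddr_in6 *peer, struct in6_pktinfo *pi)
{
    union pktinfo_ctl ctl;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;

    assert(peer != NULL && pi != NULL);
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = peer;
    msg.msg_namelen = sizeof(*peer);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return n;

    memset(pi, 0, sizeof(*pi));   /* all-zero means "let the kernel choose" */
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c != NULL;
         c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IPV6 && c->cmsg_type == IPV6_PKTINFO)
            memcpy(pi, CMSG_DATA(c), sizeof(*pi));
    }
    return n;
}

/* Send a reply whose source address and interface mirror *pi. */
ssize_t coap_send_from(int fd, const void *buf, size_t len,
                       const struct sockaddr_in6 *peer,
                       const struct in6_pktinfo *pi)
{
    union pktinfo_ctl ctl;
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg;

    assert(peer != NULL && pi != NULL);
    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_name = (void *)peer;
    msg.msg_namelen = sizeof(*peer);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = IPPROTO_IPV6;
    c->cmsg_type = IPV6_PKTINFO;
    c->cmsg_len = CMSG_LEN(sizeof(*pi));
    memcpy(CMSG_DATA(c), pi, sizeof(*pi));

    return sendmsg(fd, &msg, 0);
}
```

   A server would call coap_enable_pktinfo() once after binding the
   wildcard socket, keep the struct in6_pktinfo alongside each received
   request, and pass it to coap_send_from() for the acknowledgement or
   response.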

1139	5.1.1.1.  Managing interfaces

1141	   For some applications, it may further be relevant what interface is
1142	   chosen to send to an endpoint, beyond the kernel choosing one that
1143	   has a routing table entry for the destination address.  E.g., it may
1144	   be natural to send out a response or acknowledgment on the same
1145	   interface on which the prompting packet was received.  The end of the
1146	   introduction to section 6 of [RFC3542] describes a simple technique
1147	   for this, where that RFC's API (IPV6_PKTINFO) is available.  The same
1148	   data structure can be used for indicating an interface to send a
1149	   packet that is initiating an exchange.  (Choosing that interface is
1150	   too application-specific to be in scope for the present document.)

1152	5.1.2.  Handling ICMP errors

1154	   Sockets that use the connect and send functions usually receive ICMP
1155	   errors in the form of error codes; sockets that use sendto or sendmsg
1156	   do not receive them at all.

1158	   Neither is sufficient to implement the guidance in Section 2.6, as
1159	   the vetting of the message requires access to the CoAP headers in the
1160	   ICMP error.  The necessary information can be obtained by using the
1161	   IPV6_RECVERR option.

1163	5.2.  Java

1165	   Java provides a wildcard address (0.0.0.0) to bind a socket to all
1166	   network interfaces.  This is useful when a server is supposed to
1167	   listen on any available interface including the loopback address.
1168	   For UDP, and hence CoAP, this poses a problem, however, because the
1169	   DatagramPacket class does not provide the information to which
1170	   address it was sent.  When replying through the wildcard socket, the
1171	   JVM will pick the default address, which can break the correlation of
1172	   messages when the remote endpoint did not send the message to the
1173	   default address.  This is particularly precarious for IPv6, where it
1174	   is common to have multiple IP addresses per network interface.  Thus,
1175	   it is recommended to bind to all addresses explicitly and manage the
1176	   destination address of incoming messages within the CoAP
1177	   implementation.

1179	5.3.  Multicast detection

1181	   Similar to the considerations above, Section 8 of [RFC7252] requires
1182	   a node to detect whether a packet that it is going to reply to was
1183	   sent to a unicast or to a multicast address.  On most platforms,
1184	   binding a UDP socket to a unicast address ensures that it only
1185	   receives packets addressed to that address.  Programmers relying on
1186	   this property should ensure that it indeed applies to the platform
1187	   they are using.  If it does not, IPV6_PKTINFO may, again, help for
1188	   Berkeley Socket Interfaces.  For Java, explicit management of
1189	   different sockets (in this case a MulticastSocket) is required.
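
   Once the destination address of a received packet is known (for
   example, from the ipi6_addr field of IPV6_PKTINFO), the check itself
   is short.  The following sketch uses hypothetical function names; the
   rule encoded in coap_may_send_rst() reflects Section 8.1 of
   [RFC7252], which forbids replying with a Reset message to a request
   known to have arrived via multicast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <netinet/in.h>

/* True if the packet was addressed to an IPv6 multicast group (ff00::/8). */
bool coap_dst_is_multicast(const struct in6_addr *dst)
{
    assert(dst != NULL);
    return IN6_IS_ADDR_MULTICAST(dst) != 0;
}

/* A Reset must not be sent in reply to a request that arrived via multicast. */
bool coap_may_send_rst(const struct in6_addr *dst)
{
    return !coap_dst_is_multicast(dst);
}
```

   The same predicate can be used to decide whether a response needs to
   be delayed by a random period within LEISURE.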

1191	5.4.  DTLS

1193	   CoAPS implementations require access to the authenticated user/device
1194	   principal to realize access control for resources.  How this
1195	   information can be accessed heavily depends on the DTLS
1196	   implementation used.  Generic and portable CoAP implementations might
1197	   want to provide an abstraction layer that can be used by application
1198	   developers that implement resource handlers.  It is recommended to
1199	   keep the API of such an abstraction layer close to popular HTTPS
1200	   solutions that are available for the targeted platform, for instance,
1201	   mod_ssl or the Java Servlet API.

1203	6.  CoAP on various transports

1205	   As specified in [RFC7252], CoAP is defined for two underlying
1206	   transports: UDP and DTLS.  These transports are relatively similar in
1207	   terms of the properties they expose to their users.  (The main
1208	   difference, apart from the increased security, is that DTLS provides
1209	   an abstraction of a connection, into which the endpoint abstraction
1210	   is placed; in contrast, the UDP endpoint abstraction is based on
1211	   four-tuples of IP addresses and ports.)

1213	   Recently, the need to carry CoAP over other transports
1214	   [I-D.silverajan-core-coap-alternative-transports] has led to
1215	   specifications such as CoAP over TLS or TCP or WebSockets [RFC8323],
1216	   or even over non-IP transports such as SMS
1217	   [I-D.becker-core-coap-sms-gprs].  This section discusses
1218	   considerations that arise when handling these different transports in
1219	   an implementation.

1221	6.1.  CoAP over reliable transports

1223	   To cope with transports without reliable delivery (such as UDP and
1224	   DTLS), CoAP defines its own message layer, with acknowledgments,
1225	   timers, and retransmission.  When CoAP is run over a transport that
1226	   provides its own reliability (such as TCP or TLS), running this
1227	   machinery would be redundant.  Worse, keeping the machinery in place
1228	   is likely to lead to interoperability problems as it is unlikely to
1229	   be tested as well as on unreliable transports.  Therefore, CoAP over
1230	   reliable transports [RFC8323] was defined by removing the message
1231	   layer from CoAP and running the request/response layer directly on
1232	   top of the reliable transport.  This also leads to a header format
1233	   that is reduced compared to the 4-byte UDP/DTLS header.

1235	   Conversely, where reliable transports provide a byte stream
1236	   abstraction, some form of message delimiting had to be added, which
1237	   now needs to be handled in the CoAP implementation.  The use of
1238	   reliable transports may reduce the disincentive for using messages
1239	   larger than optimal link layer packet sizes.  Where different message
1240	   sizes are chosen by an application for reliable and for unreliable
1241	   transports, this can pose additional challenges for translators
1242	   (Section 6.2).
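
   For the TCP and TLS framing of [RFC8323], message delimiting amounts
   to decoding the Len nibble and the optional Extended Length field at
   the start of each message.  The sketch below, with a hypothetical
   function name, determines the total size of the first message in a
   receive buffer so that the implementation knows how many bytes to
   wait for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Inspect the start of a CoAP-over-TCP byte stream.
 * Returns  1 and sets *total to the size of the whole first message,
 *          0 if more bytes are needed to decode the length,
 *         -1 if the header is malformed (reserved token length).
 * The caller compares *total with avail to see if the message is complete.
 */
int coap_tcp_frame_size(const uint8_t *buf, size_t avail, uint64_t *total)
{
    assert(buf != NULL && total != NULL);
    if (avail < 1)
        return 0;

    unsigned len_nibble = buf[0] >> 4;
    unsigned tkl = buf[0] & 0x0f;
    if (tkl > 8)
        return -1;                 /* token lengths 9-15 are reserved */

    size_t ext;                    /* size of the Extended Length field */
    uint64_t length;               /* options, payload marker, payload  */
    if (len_nibble < 13) {
        ext = 0; length = len_nibble;
    } else if (len_nibble == 13) {
        ext = 1; length = 13;      /* 8-bit value plus 13 */
    } else if (len_nibble == 14) {
        ext = 2; length = 269;     /* 16-bit value plus 269 */
    } else {
        ext = 4; length = 65805;   /* 32-bit value plus 65805 */
    }

    if (avail < 1 + ext)
        return 0;

    uint64_t v = 0;                /* Extended Length, network byte order */
    for (size_t i = 0; i < ext; i++)
        v = (v << 8) | buf[1 + i];
    length += v;

    /* Len/TKL byte + Extended Length + Code byte + Token + the rest. */
    *total = 1 + ext + 1 + tkl + length;
    return 1;
}
```

   An implementation would typically also enforce an upper bound on
   *total before allocating buffers, since the encoding permits
   messages of several gigabytes.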

1244	   Where existing CoAP APIs expose details of the message layer
1245	   (e.g., CON vs. NON, or assigning application layer semantics to
1246	   ACKs), using a reliable transport may require additional adjustments.

1248	6.2.  Translating between transports

1250	   One obvious way to convey CoAP exchanges between different transports
1251	   is to run a CoAP proxy that supports both transports.  The usual
1252	   considerations for proxies apply.  Section 6.2.1 discusses some
1253	   additional considerations.

1255	   Where not much of the functionality of CoAP proxies (such as caching)
1256	   is required, a simpler 1:1 translation may be possible, as discussed
1257	   in Section 6.2.2.

1259	6.2.1.  Transport translation by proxies

1261	   (TBD.  In particular, point out the obvious: fan-in/fan-out means
1262	   that separate message ID and token spaces need to be maintained at
1263	   the ends of the proxy.)

1265	   One more CoAP-specific function of a transport translator proxy may
1266	   be to convert between different block sizes, e.g. between a TCP
1267	   connection that can tolerate large blocks and UDP over a constrained
1268	   node network.

1270	6.2.2.  One-to-one Transport translation

1272	   A translator with reduced requirements for state maintenance can be
1273	   constructed when no fan-in or fan-out is required, and when the
1274	   namespace lifetimes of the two sides can be made to coincide.  For
1275	   this one-to-one translation, there is no need to manage message-ID
1276	   and Token value spaces for both sides separately.  So, a simple UDP-
1277	   to-UDP one-to-one translator could simply copy the messages (among
1278	   other applications, this might be useful for translation between IPv4
1279	   and IPv6 spaces).  Similarly, a DTLS-to-TCP translator could be built
1280	   that executes the message layer (deduplication, retransmission) on
1281	   the DTLS side, and repackages the CoAP header (add/remove the length
1282	   information, and remove/add the message ID and message type) between
1283	   the DTLS and the TCP side.

1285	   By definition, such a simple one-to-one translator needs to shut down
1286	   the connection on one side when the connection on the other side
1287	   terminates.  However, a UDP-to-TCP one-to-one translator cannot
1288	   simply shut down the UDP endpoint when the TCP endpoint vanishes
1289	   because the TCP connection closes, so some additional management of
1290	   state will be necessary.

1292	7.  IANA considerations

1294	   This document has no actions for IANA.

1296	8.  Security considerations

1298	   TBD

1300	9.  Acknowledgements

1302	   Esko Dijk contributed the sequential MID optimization.  Xuan He
1303	   helped create and improve the state machine charts.
1304	   Christian Amsuess provided input on forwarding block messages by
1305	   proxies and usage of the Request-Tag option.

1307	10.  References

1309	10.1.  Normative References

1311	   [I-D.ietf-core-cocoa]
1312	              Bormann, C., Betzler, A., Gomez, C., and I. Demirkol,
1313	              "CoAP Simple Congestion Control/Advanced", draft-ietf-
1314	              core-cocoa-03 (work in progress), February 2018.

1316	   [RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
1317	              Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
1318	              DOI 10.17487/RFC6282, September 2011,
1319	              <https://www.rfc-editor.org/info/rfc6282>.

1321	   [RFC6570]  Gregorio, J., Fielding, R., Hadley, M., Nottingham, M.,
1322	              and D. Orchard, "URI Template", RFC 6570,
1323	              DOI 10.17487/RFC6570, March 2012,
1324	              <https://www.rfc-editor.org/info/rfc6570>.

1326	   [RFC6633]  Gont, F., "Deprecation of ICMP Source Quench Messages",
1327	              RFC 6633, DOI 10.17487/RFC6633, May 2012,
1328	              <https://www.rfc-editor.org/info/rfc6633>.

1330	   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
1331	              Protocol (HTTP/1.1): Message Syntax and Routing",
1332	              RFC 7230, DOI 10.17487/RFC7230, June 2014,
1333	              <https://www.rfc-editor.org/info/rfc7230>.

1335	   [RFC7252]  Shelby, Z., Hartke, K., and C. Bormann, "The Constrained
1336	              Application Protocol (CoAP)", RFC 7252,
1337	              DOI 10.17487/RFC7252, June 2014,
1338	              <https://www.rfc-editor.org/info/rfc7252>.

1340	   [RFC7641]  Hartke, K., "Observing Resources in the Constrained
1341	              Application Protocol (CoAP)", RFC 7641,
1342	              DOI 10.17487/RFC7641, September 2015,
1343	              <https://www.rfc-editor.org/info/rfc7641>.

1345	   [RFC7959]  Bormann, C. and Z. Shelby, Ed., "Block-Wise Transfers in
1346	              the Constrained Application Protocol (CoAP)", RFC 7959,
1347	              DOI 10.17487/RFC7959, August 2016,
1348	              <https://www.rfc-editor.org/info/rfc7959>.

1350	10.2.  Informative References

1352	   [Contiki]  Dunkels, A., Groenvall, B., and T. Voigt, "Contiki - a
1353	              Lightweight and Flexible Operating System for Tiny
1354	              Networked Sensors", Proceedings of the First IEEE
1355	              Workshop on Embedded Networked Sensors, November 2004.

1357	   [I-D.becker-core-coap-sms-gprs]
1358	              Kuladinithi, K., Becker, M., Li, K., and T. Poetsch,
1359	              "Transport of CoAP over SMS", draft-becker-core-coap-sms-
1360	              gprs-06 (work in progress), February 2017.

1362	   [I-D.ietf-core-coap-pubsub]
1363	              Koster, M., Keranen, A., and J. Jimenez, "Publish-
1364	              Subscribe Broker for the Constrained Application Protocol
1365	              (CoAP)", draft-ietf-core-coap-pubsub-04 (work in
1366	              progress), March 2018.

1368	   [I-D.ietf-core-echo-request-tag]
1369	              Amsuess, C., Mattsson, J., and G. Selander, "Echo and
1370	              Request-Tag", draft-ietf-core-echo-request-tag-02 (work in
1371	              progress), June 2018.

1373	   [I-D.silverajan-core-coap-alternative-transports]
1374	              Silverajan, B. and T. Savolainen, "CoAP Communication with
1375	              Alternative Transports", draft-silverajan-core-coap-
1376	              alternative-transports-11 (work in progress), March 2018.

1378	   [RFC3542]  Stevens, W., Thomas, M., Nordmark, E., and T. Jinmei,
1379	              "Advanced Sockets Application Program Interface (API) for
1380	              IPv6", RFC 3542, DOI 10.17487/RFC3542, May 2003,
1381	              <https://www.rfc-editor.org/info/rfc3542>.

1383	   [RFC5927]  Gont, F., "ICMP Attacks against TCP", RFC 5927,
1384	              DOI 10.17487/RFC5927, July 2010,
1385	              <https://www.rfc-editor.org/info/rfc5927>.

1387	   [RFC7228]  Bormann, C., Ersue, M., and A. Keranen, "Terminology for
1388	              Constrained-Node Networks", RFC 7228,
1389	              DOI 10.17487/RFC7228, May 2014,
1390	              <https://www.rfc-editor.org/info/rfc7228>.

1392	   [RFC8323]  Bormann, C., Lemay, S., Tschofenig, H., Hartke, K.,
1393	              Silverajan, B., and B. Raymor, Ed., "CoAP (Constrained
1394	              Application Protocol) over TCP, TLS, and WebSockets",
1395	              RFC 8323, DOI 10.17487/RFC8323, February 2018,
1396	              <https://www.rfc-editor.org/info/rfc8323>.

1398	   [TinyOS]   Levis, P., Madden, S., Polastre, J., Szewczyk, R.,
1399	              Whitehouse, K., Woo, A., Gay, D., Hill, J.,
1400	              Welsh, M., Brewer, E., and D. Culler, "TinyOS: An
1401	              Operating System for Sensor Networks", Ambient
1402	              intelligence, Springer (Berlin Heidelberg),
1403	              ISBN 978-3-540-27139-0, 2005.

1405	Authors' Addresses
1406	   Matthias Kovatsch
1407	   ETH Zurich
1408	   Universitaetstrasse 6
1409	   CH-8092 Zurich
1410	   Switzerland

1412	   Email: kovatsch@inf.ethz.ch

1414	   Olaf Bergmann
1415	   Universitaet Bremen TZI
1416	   Postfach 330440
1417	   D-28359 Bremen
1418	   Germany

1420	   Email: bergmann@tzi.org

1422	   Carsten Bormann (editor)
1423	   Universitaet Bremen TZI
1424	   Postfach 330440
1425	   D-28359 Bremen
1426	   Germany

1428	   Phone: +49-421-218-63921
1429	   Email: cabo@tzi.org