idnits 2.17.1 

draft-hartke-dice-practical-issues-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 8, 2014) is 3665 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'ChangeCipherSpec' is mentioned on line 384, but not
     defined

  ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446)

  ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147)

  == Outdated reference: A later version (-17) exists of
     draft-ietf-dice-profile-00

  == Outdated reference: A later version (-23) exists of
     draft-ietf-tls-cached-info-16


     Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	DICE Working Group                                             K. Hartke
3	Internet-Draft                                   Universitaet Bremen TZI
4	Intended status: Informational                             April 8, 2014
5	Expires: October 10, 2014

7	                         Practical Issues with
8	     Datagram Transport Layer Security in Constrained Environments
9	                 draft-hartke-dice-practical-issues-01

11	Abstract

13	   This document investigates practical issues around the implementation
14	   of Datagram Transport Layer Security (DTLS) 1.2 in constrained
15	   environments, and explores some ideas for an optimized version of
16	   DTLS 1.2 that is more friendly to constrained nodes and networks.

18	Status of this Memo

20	   This Internet-Draft is submitted in full conformance with the
21	   provisions of BCP 78 and BCP 79.

23	   Internet-Drafts are working documents of the Internet Engineering
24	   Task Force (IETF).  Note that other groups may also distribute
25	   working documents as Internet-Drafts.  The list of current Internet-
26	   Drafts is at http://datatracker.ietf.org/drafts/current/.

28	   Internet-Drafts are draft documents valid for a maximum of six months
29	   and may be updated, replaced, or obsoleted by other documents at any
30	   time.  It is inappropriate to use Internet-Drafts as reference
31	   material or to cite them other than as "work in progress."

33	   This Internet-Draft will expire on October 10, 2014.

35	Copyright Notice

37	   Copyright (c) 2014 IETF Trust and the persons identified as the
38	   document authors.  All rights reserved.

40	   This document is subject to BCP 78 and the IETF Trust's Legal
41	   Provisions Relating to IETF Documents
42	   (http://trustee.ietf.org/license-info) in effect on the date of
43	   publication of this document.  Please review these documents
44	   carefully, as they describe your rights and restrictions with respect
45	   to this document.  Code Components extracted from this document must
46	   include Simplified BSD License text as described in Section 4.e of
47	   the Trust Legal Provisions and are provided without warranty as
48	   described in the Simplified BSD License.

50	Table of Contents

52	   1.  Introduction
53	     1.1.  Background
54	     1.2.  Overview
55	     1.3.  Terminology
56	   2.  Potential Problems and Possible Solutions
57	     2.1.  Handshake Reliability and Fragmentation
58	     2.2.  Timer Values
59	     2.3.  Connection Initiation
60	     2.4.  Connection Closure
61	     2.5.  Data Size
62	     2.6.  Code Size
63	     2.7.  Application Data Fragmentation
64	     2.8.  Application Layer Protocol
65	   3.  A Comparison of Strategies for Handshake Reliability
66	   4.  A Strawman for Stateless Header Compression
67	     4.1.  Records
68	     4.2.  Handshake Messages
69	   5.  Security Considerations
70	   6.  IANA Considerations
71	   7.  Acknowledgements
72	   8.  References
73	     8.1.  Normative References
74	     8.2.  Informative References
75	   Appendix A.  Templates
76	   Author's Address

78	1.  Introduction

80	1.1.  Background

82	   Nodes taking part in the "Internet of Things" often have strict
83	   limitations regarding their computational power, memory size (both
84	   RAM and ROM) and power management [I-D.ietf-lwig-terminology].
85	   Network communication, in particular if wireless, also imposes
86	   constraints that need to be considered during protocol design, such
87	   as low bitrates, variable delays and possibly high packet loss.

89	   Moreover, frames at the link layer might be much smaller than the
90	   IPv6 minimum MTU of 1280 bytes and therefore require additional
91	   adaptation mechanisms such as 6LoWPAN [RFC4944] for IEEE 802.15.4
92	   wireless networks [IEEE.802-15-4], which in turn may exacerbate the
93	   limitations of the network: for instance, as high loss rates are
94	   anticipated by design, application protocols usually try to avoid
95	   fragmentation at the network layer.

97	   However, application protocols often delegate security mechanisms to
98	   transport layer security protocols.  More often than not, the
99	   protocol overhead from securing the communication is highly relevant
100	   to the overall performance of the systems.

102	   One protocol that has received significant attention recently for
103	   constrained node/network applications is Datagram Transport Layer
104	   Security (DTLS) [RFC6347].  DTLS is derived from and inherits some
105	   characteristics from TLS [RFC5246].  Although it has clearly not been
106	   designed with constrained devices and lossy networks in mind, it is
107	   thought to be usable in these environments [RFC6574].  There are
108	   still a few challenges when it comes to actually implementing DTLS.

110	1.2.  Overview

112	   The present document investigates practical issues around the
113	   implementation of DTLS 1.2 in constrained environments, and explores
114	   a few ideas that could lead to an optimized version of DTLS that is
115	   more friendly to constrained nodes and networks.

117	   The ideas generally fall into one of the following categories:

119	   Implementation guidance:  Implementation techniques for achieving
120	      light-weight implementations of DTLS, without affecting
121	      conformance to the relevant specifications or interoperability
122	      with other implementations.  This includes techniques for reducing
123	      complexity, memory footprint, or power usage.  The result may
124	      eventually be incorporated into [I-D.ietf-lwig-guidance].

126	   Protocol profile:  Use of DTLS in a particular way, for example, by
127	      changing certain "MAY"s into "MUST"s or "MUST NOT"s, or by
128	      prescribing or precluding certain extensions and cipher suites.
129	      DTLS implementations ought to be usable without change if they can
130	      be configured accordingly.  See also [I-D.ietf-dice-profile].

132	   Stateless header compression:  Compression of DTLS records without
133	      explicitly building any compression context state.  This is done
134	      by using shorter forms to represent the same bits of information
135	      or relying on information that is already shared by the client and
136	      server.  Existing DTLS implementations can continue to be used if
137	      a thin layer is added that handles compression and decompression.

139	   Breaking changes:  New implementations are required that do not
140	      interoperate with implementations of DTLS, though there is no
141	      intention in this document to change the overall operation of TLS.

143	1.3.  Terminology

145	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
146	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
147	   document are to be interpreted as described in RFC 2119 [RFC2119].
148	   Note that this document itself is informational, but it is discussing
149	   normative statements.

151	2.  Potential Problems and Possible Solutions

153	2.1.  Handshake Reliability and Fragmentation

155	   DTLS records can be large in size for a single 6LoWPAN [RFC4944]
156	   payload: IEEE 802.15.4 [IEEE.802-15-4] specifies a physical layer MTU
157	   of only 127 bytes, which yields about 60-80 bytes of payload after
158	   adding MAC layer and adaptation layer headers.  Although 6LoWPAN
159	   supports the fragmentation of IPv6 packets into small link-layer
160	   frames, this is generally avoided in low-power, lossy
161	   networks.

163	   DTLS offers fragmentation at the handshake layer and hence can help
164	   to prevent IP fragmentation.  However, this can add a significant
165	   overhead on the number of datagrams and bytes transferred (see
166	   Table 1 below).  Packet loss is also still a big problem for the
167	   constrained nodes: since fragments may arrive in any order, buffers
168	   must be large enough to hold all messages after reassembly, and
169	   losing a single fragment will cause all fragments of a message flight
170	   to be retransmitted.  Such loss is especially likely during key and
171	   certificate exchanges, as these will not fit within a single packet
172	   without fragmentation in most 6LoWPANs.

174	   +--------------+-----------------+------------------+---------------+
175	   |     UDP data |       Number of |  Total number of | Proportion of |
176	   |   size limit |       datagrams |            bytes |   header data |
177	   |      (bytes) |     transferred |      transferred |               |
178	   +--------------+-----------------+------------------+---------------+
179	   |           50 |              27 |            1,182 |          55 % |
180	   |           55 |              21 |            1,037 |          49 % |
181	   |           60 |              20 |            1,081 |          51 % |
182	   |           65 |              18 |            1,003 |          47 % |
183	   |           70 |              15 |              912 |          42 % |
184	   |           75 |              14 |              875 |          39 % |
185	   |           80 |              13 |              874 |          39 % |
186	   |           85 |              12 |              849 |          37 % |
187	   |           90 |              12 |              849 |          37 % |
188	   |        1,152 |               6 |              802 |          34 % |
189	   +--------------+-----------------+------------------+---------------+

191	    Table 1: Number of datagrams and bytes transferred using different
192	        limits for DTLS fragmentation in an example DTLS handshake
193	   (TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 with Raw Public Key Certificate)
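
   As a rough illustration of where this overhead comes from, the
   following Python sketch counts datagrams and bytes for one flight.
   It is not the tool used to produce Table 1 and does not attempt to
   reproduce its numbers.  It assumes unencrypted handshake records,
   one fragment per record and one record per datagram, and the DTLS
   1.2 header sizes (13-byte record header, 12-byte handshake header).
   The function names are hypothetical.

```python
RECORD_HEADER = 13     # DTLS 1.2 record header (bytes)
HANDSHAKE_HEADER = 12  # DTLS 1.2 handshake header, repeated in each fragment


def fragment_message(body_len, udp_limit):
    """Return the fragment body sizes for one handshake message."""
    room = udp_limit - RECORD_HEADER - HANDSHAKE_HEADER
    if room <= 0:
        raise ValueError("UDP data size limit too small for any fragment")
    if body_len == 0:
        return [0]  # an empty message still needs one record
    sizes = []
    remaining = body_len
    while remaining > 0:
        take = min(room, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


def flight_cost(body_lens, udp_limit):
    """Return (datagrams, total bytes, proportion of header bytes)."""
    datagrams = total = header = 0
    for body_len in body_lens:
        for size in fragment_message(body_len, udp_limit):
            datagrams += 1
            header += RECORD_HEADER + HANDSHAKE_HEADER
            total += RECORD_HEADER + HANDSHAKE_HEADER + size
    return datagrams, total, header / total
```

   For instance, a single 100-byte handshake message with a 50-byte
   limit leaves 25 bytes of room per datagram, so it is split into four
   fragments and half of all bytes transferred are header bytes.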

195	   Possible Solutions include:

197	   o  Perform the handshake using alternative mechanisms for reliability
198	      and fragmentation over UDP:

200	      *  Use IP fragmentation.  If no X.509 certificates are involved,
201	         the handshake messages of one flight typically require less
202	         than 400 bytes combined.  Since all messages of a flight in
203	         DTLS are retransmitted anyway when a single fragment is lost,
204	         the difference between performing the fragmentation at the DTLS
205	         layer and at the IP layer is probably not huge.

207	      *  Use DTLS fragmentation.  When compared to, for example, the
208	         reliability mechanism of CoAP over UDP [I-D.ietf-core-coap]
209	         (where the receipt of each data fragment is confirmed by one
210	         acknowledgement message, and an acknowledgement message may
211	         opportunistically piggyback data in the opposite direction),
212	         DTLS actually performs better for a typical DTLS handshake in
213	         both lossy and non-lossy network environments (cf. Section 3).

215	      *  Extend DTLS with acknowledgment messages that confirm the
216	         receipt of fragments and allow an implementation to retransmit
217	         only the fragments that are missing.  Section 3 explores a
218	         number of strategies for the reliable transmission of DTLS
219	         handshake messages with acknowledgements, including CoAP-style
220	         acknowledgements and cumulative acknowledgements.

222	   +--------------+-----------------+------------------+---------------+
223	   |     UDP data |       Number of |  Total number of | Proportion of |
224	   |   size limit |       datagrams |            bytes |   header data |
225	   |      (bytes) |     transferred |      transferred |               |
226	   +--------------+-----------------+------------------+---------------+
227	   |           50 |       15 (56 %) |       592 (50 %) |          10 % |
228	   |           55 |       13 (62 %) |       585 (56 %) |           9 % |
229	   |           60 |       13 (65 %) |       621 (57 %) |          14 % |
230	   |           65 |       11 (61 %) |       588 (59 %) |          10 % |
231	   |           70 |       11 (73 %) |       573 (63 %) |           7 % |
232	   |           75 |       11 (79 %) |       573 (65 %) |           7 % |
233	   |           80 |       10 (77 %) |       567 (65 %) |           6 % |
234	   |           85 |       10 (83 %) |       567 (67 %) |           6 % |
235	   |           90 |       10 (83 %) |       567 (67 %) |           6 % |
236	   |        1,152 |       6 (100 %) |       617 (77 %) |          14 % |
237	   +--------------+-----------------+------------------+---------------+

239	      Table 2: Number of datagrams and bytes transferred in the same
240	      example DTLS handshake as in Table 1 but using the strawman for
241	            Stateless Header Compression described in Section 4

243	   o  Reduce the number of bytes to be transferred, so fewer packets
244	      need to be transmitted that could potentially be lost:

246	      *  Exchange large blobs using an out-of-band mechanism.  The TLS
247	         Cached Information Extension [I-D.ietf-tls-cached-info], for
248	         example, allows omitting the exchange of fairly static data such
249	         as the server certificate, if this data is already available.

251	      *  Perform a DTLS-specific kind of Stateless Header Compression,
252	         as explored in Section 4.  This can significantly reduce the
253	         number of datagrams and bytes transferred, and in particular
254	         also the proportion of header data within the number of bytes
255	         transferred (see Table 2 above).

257	      *  Compress DTLS headers with 6LoWPAN General Header Compression
258	         [I-D.bormann-6lo-ghc], or a specific DTLS format for 6LoWPAN
259	         Next Header Compression [I-D.raza-dice-compressed-dtls].

261	      *  Recover the Raw Public Key Certificate
262	         [I-D.ietf-tls-oob-pubkey] from the ECDSA signature in an
263	         ECDHE_ECDSA handshake, instead of transmitting both the public
264	         key and the signature.  This is described in Section 4.1.6 of
265	         [SEC1]:

267	            "This is also useful in bandwidth constrained environments,
268	            when transmission of public keys cannot be afforded.  Entity
269	            U could send a signature to entity V, who recovers QU.

271	            Entity V can look up the public key in some certificate or
272	            directory, and if it matches then the signature can be
273	            accepted."

275	      *  Mandate the use of compressed point formats for elliptic curves.

277	      *  Transmit only the low-order N bits of the 48-bit sequence
278	         numbers and reconstruct the (48-N) high-order bits, as is
279	         similarly done for extended sequence numbers in IPsec (see
280	         Appendix B of RFC 4302 [RFC4302]).

282	      *  Use self-delimiting numeric values [RFC6256] instead of fixed-
283	         sized fields.

285	      *  Use a single bit field instead of multiple type fields to
286	         indicate which handshake messages are present in a record.
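
   The sequence number truncation mentioned above could work along the
   following lines.  This is a minimal sketch, not the window-based
   procedure of RFC 4302: it simply picks the 48-bit value closest to
   the sequence number the receiver expects next.  The function name
   and the "closest to expected" rule are assumptions made for
   illustration.

```python
SEQ_BITS = 48  # width of the DTLS record sequence number


def reconstruct_seq(low_bits, n, expected):
    """Rebuild a 48-bit sequence number from its n low-order bits.

    `expected` is the sequence number the receiver anticipates next
    (for example, the highest one seen so far plus one).  The result is
    the valid value closest to `expected` whose low n bits are low_bits.
    """
    if not 0 < n <= SEQ_BITS:
        raise ValueError("n must be between 1 and 48")
    modulus = 1 << n
    if not 0 <= low_bits < modulus:
        raise ValueError("low_bits does not fit in n bits")
    # Start from the candidate that shares its high-order bits with
    # `expected`, then also consider one wrap below and one wrap above.
    base = expected - (expected % modulus) + low_bits
    candidates = [base - modulus, base, base + modulus]
    valid = [c for c in candidates if 0 <= c < (1 << SEQ_BITS)]
    return min(valid, key=lambda c: abs(c - expected))
```

   With N = 8 and an expected value of 0x1FF, the low-order byte 0x02
   is reconstructed as 0x202, whereas 0xFE with an expected value of
   0x200 is reconstructed as 0x1FE (a slightly reordered record).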

288	2.2.  Timer Values

290	   RFC 6347 [RFC6347] leaves the choice of timer values to the
291	   implementation, but makes the following recommendation:

293	      "Implementations SHOULD use an initial timer value of 1 second
294	      (the minimum defined in RFC 6298 [RFC6298]) and double the value
295	      at each retransmission, up to no less than the RFC 6298 maximum of
296	      60 seconds."  [RFC6347]

298	   Given the time required by some algorithms when executed on
299	   constrained devices (see Table 3), an initial timer value of 1 second
300	   can easily lead to spurious retransmissions.

302	   +-------------+--------------+-----------+------------+-------------+
303	   | Algorithm   | Library      |    Memory |  Execution |  Comparable |
304	   |             |              | footprint |       time |     RSA key |
305	   |             |              |   (bytes) |  (seconds) |      length |
306	   +-------------+--------------+-----------+------------+-------------+
307	   | RSA 1024    | AvrCryptolib |       640 |      199.7 |             |
308	   | RSA 2048    | AvrCryptolib |     1,280 |    1,587.6 |             |
309	   | ECDSA 160r1 | TinyECC      |       892 |        2.3 |        1024 |
310	   | ECDSA 192r1 | TinyECC      |     1,008 |        3.6 |        1536 |
311	   | ECDSA 160r1 | Wiselib      |       842 |       20.2 |        1024 |
312	   | ECDSA 192r1 | Wiselib      |       952 |       34.6 |        1536 |
313	   | ECDSA 163k1 | Relic        |     2,804 |        0.3 |        1024 |
314	   | ECDSA 233k1 | Relic        |     3,675 |        1.8 |        2048 |
315	   +-------------+--------------+-----------+------------+-------------+

317	    Table 3: RSA private key operation and ECDSA signature performance
318	                      (from [I-D.aks-crypto-sensors])
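
   The interaction between the recommended timer schedule and the
   execution times in Table 3 can be estimated with a short Python
   sketch.  It assumes that the timer doubles from 1 second and is
   capped at exactly 60 seconds, that the network delay is negligible,
   and that the peer retransmits every time its timer fires while the
   constrained node is still computing.  The function names are
   hypothetical.

```python
def retransmission_times(initial=1.0, maximum=60.0, count=8):
    """Cumulative times (seconds) at which the retransmission timer fires."""
    times = []
    elapsed = 0.0
    timeout = initial
    for _ in range(count):
        elapsed += timeout
        times.append(elapsed)
        timeout = min(timeout * 2, maximum)  # double, capped at the maximum
    return times


def spurious_retransmissions(processing_time, **kwargs):
    """Count timer expirations that occur before the response is ready."""
    return sum(1 for t in retransmission_times(**kwargs)
               if t < processing_time)
```

   Under these assumptions the timer fires 1, 3, 7 and 15 seconds after
   the first transmission, so an operation taking 2.3 seconds triggers
   one spurious retransmission and one taking 20.2 seconds triggers
   four.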

320	   Possible Solutions include:

322	   o  Adjust the timer value to meet the conditions of constrained nodes
323	      and low-power, lossy networks.

325	   o  Add acknowledgment messages to DTLS that allow an implementation
326	      to confirm the receipt of a message before starting to prepare its
327	      response message flight; see Section 3.

329	2.3.  Connection Initiation

331	   Nodes with very constrained main memory also suffer from the
332	   complexity of the DTLS handshake protocol.  We envision that the
333	   acceptance of DTLS as a security protocol for embedded devices would
334	   significantly increase if a less complex connection initiation
335	   procedure with a smaller number of handshake messages were defined.

337	   Compared to TLS, DTLS exacerbates the connection initiation: A DTLS
338	   handshake has an additional roundtrip that results from the addition
339	   of a stateless cookie exchange.  This exchange is designed to prevent
340	   certain denial-of-service attacks: consumption of excessive server
341	   resources caused by the transmission of a series of handshake
342	   initiation requests, and use of the server as an amplifier by sending
343	   connection initiation messages with a forged source of the victim.
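
   The stateless cookie can be sketched as follows.  RFC 6347 suggests
   computing the cookie as an HMAC over the client's IP address and
   ClientHello parameters under a server secret; the choice of SHA-256,
   the inclusion of the port, and the byte encoding of the address used
   here are assumptions for illustration, as are the function names.

```python
import hashlib
import hmac


def make_cookie(secret, client_addr, client_hello_params):
    """Derive a cookie without storing any per-client state."""
    ip, port = client_addr
    # Assumed encoding: textual address and port, separated by '|'.
    msg = ip.encode() + b"|" + str(port).encode() + b"|" + client_hello_params
    return hmac.new(secret, msg, hashlib.sha256).digest()


def verify_cookie(secret, client_addr, client_hello_params, cookie):
    """Recompute the cookie from the second ClientHello and compare."""
    expected = make_cookie(secret, client_addr, client_hello_params)
    return hmac.compare_digest(expected, cookie)
```

   The server keeps only the secret.  A ClientHello that echoes a
   cookie from a different source address fails verification, which is
   what defeats the amplification attack described above.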

345	   Possible Solutions include:

347	   o  Create the DTLS connection before it is needed, so it doesn't take
348	      a long time to set it up when it's actually needed.  This works if
349	      a server has to deal with a relatively small overall number of
350	      clients that wish to interact with the server.  Care must be taken
351	      such that not all clients perform their handshake at the same
352	      time, as a handshake requires considerably more memory than
353	      keeping a connection open.  (See also Section 2.4 below.)

355	   o  Shorten the handshake to four flights.  This may be possible
356	      without losing the denial-of-service roundtrip if the cipher suite
357	      permits that the server remains stateless after sending the
358	      ServerHello and if the flight fits in one datagram (see Figure 1).

360	   o  As an alternative, client puzzles could be used as a mechanism for
361	      mitigating denial-of-service attacks, resulting in a four-flight
362	      exchange similar to the one in HIP DEX [I-D.moskowitz-hip-rg-dex].
363	      The application of client puzzles to TLS has been shown
364	      [USENIX01].  However, a puzzle would be needed that ideally takes
365	      less effort for a constrained device and more effort for an
366	      unconstrained device.

368	    Client                                          Server
369	    ------                                          ------

371	    ClientHello             -------->                           Flight 1

373	                                        HelloVerifyRequest    \
374	                                               ServerHello      Flight 2
375	                            <--------      ServerHelloDone    /
376	                                        (remain stateless)

378	    ClientHello                                               \
379	    "ServerHello"                                              \
380	    ClientKeyExchange                                           Flight 3
381	    [ChangeCipherSpec]                                         /
382	    Finished                -------->                         /

384	                                        [ChangeCipherSpec]    \ Flight 4
385	                            <--------             Finished    /

387	   Figure 1: Artist's impression of a four-flight DTLS handshake with a
388	                              Pre-Shared Key

390	2.4.  Connection Closure

392	   Although a connection needs considerably less memory after a
393	   handshake has finished, it still requires, for example, around 80
394	   bytes with AES-128-CCM [RFC6655] for the keys, sequence numbers and
395	   anti-replay window.  More memory is needed if session resumption is
396	   supported, to remember the 48-byte master secret and negotiated
397	   connection parameters.  This limits how many connections a
398	   constrained device can maintain at a given time.  Often, constrained
399	   devices will have a fixed number of "slots" for connections rather
400	   than allocating memory dynamically for each connection.

402	   DTLS provides a facility for secure connection closure.  When a valid
403	   closure alert is received, an implementation can be assured that no
404	   further data will be received on that connection.  It is noteworthy,
405	   though, that the closure alert is not a handshake message and thus is
406	   not retransmitted when packet loss occurs.

408	   Possible Solutions include:

410	   o  Maintain the session for as long as possible.  When the server
411	      runs out of resources, it can close connections, e.g., using a
412	      Least Frequently Used (LFU) eviction policy.  The client simply
413	      assumes that the connection is active until the server rejects its
414	      application data, in which case the client initiates a new
415	      connection.

417	   o  Use the DTLS Heartbeat Extension [RFC6520] to figure out from time
418	      to time if the connection is still active.
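
   A fixed number of connection slots with Least Frequently Used
   eviction might look like the following Python sketch.  The class and
   method names are hypothetical, and ties between equally used
   connections are broken in favour of evicting the oldest entry, which
   is an arbitrary choice.

```python
class ConnectionSlots:
    """Fixed-capacity connection table with LFU eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = {}  # peer -> [use_count, connection_state]

    def use(self, peer):
        """Record activity on a connection; return its state or None."""
        entry = self.slots.get(peer)
        if entry is None:
            return None  # unknown peer: the client must re-handshake
        entry[0] += 1
        return entry[1]

    def open(self, peer, state):
        """Store a new connection; return the evicted peer, if any."""
        evicted = None
        if peer not in self.slots and len(self.slots) >= self.capacity:
            # Evict the least frequently used connection.
            evicted = min(self.slots, key=lambda p: self.slots[p][0])
            del self.slots[evicted]
        self.slots[peer] = [0, state]
        return evicted
```

   A client whose connection has been evicted finds out when the server
   rejects its application data, at which point it initiates a new
   connection as described in the first item above.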

420	2.5.  Data Size

422	   As fragmented handshake messages can arrive at a constrained node in
423	   any order, the receiver must provide a message buffer that is large
424	   enough to hold multiple fragments.  When several handshake messages
425	   forming a single flight are sent out in parallel, it is likely that
426	   the receiver's resources are too limited to order fragments from
427	   distinct handshake messages.  Avoiding this might require additional
428	   resources on the server side to ensure serialization of a flight's
429	   messages.

431	   Furthermore, since handshake messages can be fragmented arbitrarily
432	   and with overlaps, the receiver must, in addition to the message
433	   buffer, keep track of the fragments received so far.  This also makes
434	   the computation of the Finished MAC difficult, which is computed as
435	   if each handshake message had been sent as a single fragment.
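
   The bookkeeping this requires can be sketched in Python as follows.
   The class name is hypothetical, and one flag per byte is used for
   clarity; a constrained implementation would more likely use a bitmap
   or a short list of ranges.

```python
class Reassembly:
    """Reassemble one handshake message from overlapping fragments."""

    def __init__(self, length):
        self.data = bytearray(length)      # the message buffer
        self.received = bytearray(length)  # 1 where a byte has arrived
        self.missing = length

    def add(self, offset, fragment):
        """Store a fragment; return True once the message is complete."""
        end = offset + len(fragment)
        if offset < 0 or end > len(self.data):
            raise ValueError("fragment outside message bounds")
        for i, byte in enumerate(fragment, start=offset):
            if not self.received[i]:  # ignore bytes already covered
                self.received[i] = 1
                self.data[i] = byte
                self.missing -= 1
        return self.missing == 0
```

   Only when add() returns True can the complete message be fed into
   the handshake hash, since the Finished MAC is computed over the
   unfragmented form.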

437	   Application-level retransmissions require even more buffer space as
438	   replay-protection requires encryption of every single packet that is
439	   to be transmitted.  In particular, this renders destructive in-place
440	   encryption impossible as the source data must be preserved.

442	   Possible Solutions include:

444	   o  Use the same sequence number when retransmitting application data,
445	      so the plaintext can be encrypted in-place without the need for a
446	      second buffer.  Note: The security implications of this change
447	      need to be carefully analyzed.

449	   o  Extend the exchange of handshake messages with acknowledgments
450	      that allow a receiver to confirm the receipt of fragments, and let
451	      the sender wait for the acknowledgment before it sends the next
452	      part of the flight; see also Section 3.

454	   o  Mandate non-overlapping handshake message fragments.

456	   o  Favour cryptographic algorithms that use less memory, possibly
457	      resulting in a slower performance.

459	2.6.  Code Size

461	   Although probably not as severe as data size limits, the code size of
462	   a DTLS implementation also can play a role, in particular for
463	   constrained devices at the lower bound of Class 1 devices.

465	   Possible Solutions include:

467	   o  Use pre-composed messages instead of writing code for encoding or
468	      decoding ASN.1 structures, as shown for example in Appendix A.

470	   o  Avoid static tables for cryptographic functions where possible, as
471	      typical embedded platforms are more restricted in RAM than in non-
472	      volatile memory such as flash ROM.  Instead, their procedural
473	      equivalent is to be used, although less efficient during run-time.

475	2.7.  Application Data Fragmentation

477	   Messages larger than an IP fragment result in undesired packet
478	   fragmentation.  DTLS does not support fragmentation of application
479	   data.  If an implementation of an application layer protocol such as
480	   CoAP [I-D.ietf-core-coap] wants to avoid IP fragmentation, it must
481	   fit the application data (e.g., a CoAP message) and all headers in a
482	   single IP packet.

484	   DTLS has a per-record overhead of 13 bytes for the record header.
485	   AEAD ciphers such as AES-CCM [RFC6655] eat up additional space to
486	   carry the explicit nonce and the authentication tag.  Thus, cipher
487	   suites like TLS_PSK_WITH_AES_128_CCM_8 or
488	   TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 require 16 additional bytes, leading
489	   to an overall overhead of 29 bytes for the header of each encrypted
490	   DTLS packet.  With packet sizes of 60-80 bytes, this takes a
491	   considerable portion of the available packet size away (see Table 4
492	   below).

494	   +------------------+------------------------+-----------------------+
495	   |    UDP data size |   Number of bytes left |    ... with Stateless |
496	   |    limit (bytes) |   for application data |    Header Compression |
497	   +------------------+------------------------+-----------------------+
498	   |               50 |              21 (42 %) |             39 (78 %) |
499	   |               55 |              26 (47 %) |             44 (80 %) |
500	   |               60 |              31 (52 %) |             49 (82 %) |
501	   |               65 |              36 (55 %) |             54 (83 %) |
502	   |               70 |              41 (59 %) |             59 (84 %) |
503	   |               75 |              46 (61 %) |             64 (85 %) |
504	   |               80 |              51 (64 %) |             69 (86 %) |
505	   |               85 |              56 (66 %) |             74 (87 %) |
506	   |               90 |              61 (68 %) |             79 (88 %) |
507	   |            1,152 |           1,123 (97 %) |          1,141 (99 %) |
508	   +------------------+------------------------+-----------------------+

510	    Table 4: Number of bytes left for data in an ApplicationData record
511	     using DTLS and DTLS with Stateless Header Compression (Section 4)
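
   The second column of Table 4 follows directly from the overhead
   described above, as this short Python sketch shows.  The split of
   the 16 additional bytes into an 8-byte explicit nonce and an 8-byte
   authentication tag reflects the CCM_8 cipher suites.  The 11-byte
   overhead used for the compressed case is inferred from the third
   column of Table 4 (for example, 50 - 39) and is not taken from
   Section 4.  The function name is hypothetical.

```python
RECORD_HEADER = 13   # DTLS 1.2 record header
EXPLICIT_NONCE = 8   # GenericAEADCipher.nonce_explicit
AUTH_TAG = 8         # truncated tag of the CCM_8 cipher suites

DTLS_OVERHEAD = RECORD_HEADER + EXPLICIT_NONCE + AUTH_TAG  # 29 bytes
COMPRESSED_OVERHEAD = 11  # inferred from Table 4, see lead-in


def bytes_left(udp_limit, overhead=DTLS_OVERHEAD):
    """Return (bytes left for application data, percentage of the limit)."""
    left = udp_limit - overhead
    return left, round(100 * left / udp_limit)
```

   For a 50-byte limit this gives 21 bytes (42 %) for plain DTLS and 39
   bytes (78 %) with the inferred compressed overhead.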

513	   Possible Solutions include:

515	   o  Elide the GenericAEADCipher.nonce_explicit field when AES-CCM is
516	      used.  The GenericAEADCipher.nonce_explicit field is set to the
517	      16-bit epoch concatenated with the 48-bit sequence number, which
518	      means that the epoch and sequence number are unnecessarily
519	      included twice in each record.

521	   o  Elide the DTLS version field where it is implicitly clear.  Since
522	      the DTLS version is negotiated in the handshake, there should not
523	      be a need to specify the DTLS version in each and every record.

525	   o  Elide the length field of the last record in a datagram.  DTLS
526	      records specify their length, so multiple records can be
527	      transmitted in a single datagram.  When DTLS is used with UDP
528	      (which preserves the boundaries of all messages sent), the length
529	      field of the last record in a datagram can be calculated from the
530	      UDP payload length.

532	   For example, when using the Stateless Header Compression presented in
533	   Section 4 and eliminating the redundant epoch and sequence number
534	   information, the number of bytes left in an ApplicationData record
535	   for application data can be significantly increased (see Table 4).
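
   To illustrate why the explicit nonce is redundant, the following
   Python sketch derives the 12-byte AES-CCM nonce from fields that are
   already present in the DTLS record header.  It assumes the nonce
   layout of RFC 6655 (a 4-byte implicit salt followed by the 8-byte
   explicit part) and the DTLS 1.2 record header layout; the function
   names are hypothetical.

```python
def parse_epoch_seq(record_header):
    """Extract epoch and sequence number from a 13-byte record header."""
    if len(record_header) != 13:
        raise ValueError("DTLS 1.2 record header is 13 bytes")
    # Layout: type(1) version(2) epoch(2) sequence_number(6) length(2)
    epoch = int.from_bytes(record_header[3:5], "big")
    seq = int.from_bytes(record_header[5:11], "big")
    return epoch, seq


def ccm_nonce(salt, epoch, seq):
    """Build the CCM nonce without a transmitted nonce_explicit field."""
    if len(salt) != 4:
        raise ValueError("implicit salt must be 4 bytes")
    nonce_explicit = epoch.to_bytes(2, "big") + seq.to_bytes(6, "big")
    return salt + nonce_explicit
```

   Since both peers can compute the same value from the record header,
   the 8 bytes of nonce_explicit need not be carried in the record.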

537	2.8.  Application Layer Protocol

539	   When DTLS is used to secure a non-trivial application layer protocol,
540	   there is potential for synergies that can arise from optimizing the
541	   stack of both protocols.

543	   For example, an implementation of CoAP [I-D.ietf-core-coap] with DTLS
544	   security will need to implement both the reliability mechanism for
545	   the DTLS handshake and the reliability mechanism of CoAP.  This not
546	   only increases code size, but also prevents efficient retransmissions
547	   as each CoAP retransmission of the same data is a new transmission in
548	   DTLS.

550	   Possible Solutions include:

552	   o  Make DTLS reliability and fragmentation available to applications.

554	   Accordingly, the application should take advantage of DTLS record
555	   information where possible.  For example, since DTLS sequence numbers
556	   uniquely identify a message in a connection, the 6-byte sequence
557	   number could be used in CoAP to correlate CoAP acknowledgements with
558	   CoAP messages (Message ID, 2 bytes), to correlate CoAP responses with
559	   CoAP requests (Token, 0-8 bytes), to provide an order among CoAP
560	   notifications (3 bytes), and to enable message deduplication.

562	3.  A Comparison of Strategies for Handshake Reliability

564	   A DTLS handshake consists of multiple messages that are fragmented
565	   and grouped in so-called "flights".  As the previous sections have
566	   shown, the strategy employed by DTLS to transmit these flights can
567	   lead to circumstances that are acceptable for existing uses of DTLS
568	   but pose a challenge in constrained environments:

570	   o  The loss of a single packet causes the whole flight of fragments
571	      to be retransmitted, and not just the fragments that were lost.

573	   o  Long processing times can lead to spurious retransmissions.

575	   o  The possibility of arbitrarily reordered fragments requires the
576	      recipient to maintain potentially large buffers.

578	   This section compares the following strategies for reliability:

580	   Bulk without acknowledgements (illustrated in Figure 2 below):
581	      All fragments are retransmitted in exponentially increasing
582	      intervals until the first fragment of the next flight from the
583	      other side is received.  This is the reliability mechanism used in
584	      DTLS 1.2 [RFC6347].

586	   Stop-and-wait with one acknowledgement per fragment (Figure 3):
587	      Each fragment is retransmitted individually until a matching
588	      acknowledgement for the fragment is received.  Only one fragment
589	      is transmitted at a time, and each acknowledgement message
590	      confirms the receipt of one fragment.  This is the reliability
591	      mechanism used in CoAP [I-D.ietf-core-coap].

593	   Bulk with one cumulative acknowledgement per flight (Figure 4):
594	      Unacknowledged fragments of the flight are transmitted using a
595	      sliding window until all fragments have been acknowledged.
596	      Acknowledgements specify all fragments that have been received so
597	      far (highest sequence number seen + a bit field).

599	   Table 5 shows the average number of transmissions needed for these
600	   three strategies to successfully complete an example DTLS handshake.
601	   (Every DTLS handshake is eventually successful if no side gives up
602	   after a number of retransmission attempts.)

604	   The results were obtained using a very simple network simulator that
605	   randomly drops packets according to the given loss rate, but provides
606	   ideal conditions otherwise.  To avoid spurious retransmissions, timer
607	   values were selected larger than the processing times for flights;
608	   this may be impractical if sensible retransmission intervals and
609	   processing times differ by orders of magnitude.

611	              +-----------+----------+----------+----------+
612	              | Loss rate | Figure 2 | Figure 3 | Figure 4 |
613	              +-----------+----------+----------+----------+
614	              |       0 % |     18.0 |     36.0 |     19.0 |
615	              |       5 % |     22.2 |     39.7 |     20.5 |
616	              |      10 % |     25.9 |     41.8 |     23.8 |
617	              |      15 % |     27.6 |     44.7 |     25.1 |
618	              |      20 % |     33.3 |     51.6 |     27.1 |
619	              |      25 % |     40.0 |     57.2 |     33.3 |
620	              |      30 % |     39.2 |     64.0 |     37.4 |
621	              |      35 % |     45.6 |     66.4 |     44.0 |
622	              |      40 % |     55.4 |     74.7 |     46.2 |
623	              |      45 % |     54.4 |     90.0 |     47.9 |
624	              |      50 % |     67.2 |    102.2 |     57.2 |
625	              |      55 % |     76.8 |    124.3 |     62.3 |
626	              |      60 % |     96.9 |    151.3 |     74.4 |
627	              |      65 % |    109.4 |    170.5 |     86.4 |
628	              |      70 % |    115.8 |    248.2 |    106.8 |
629	              |      75 % |    159.1 |    348.5 |    141.5 |
630	              |      80 % |    199.6 |    528.6 |    169.9 |
631	              |      85 % |    343.4 |    804.4 |    278.0 |
632	              +-----------+----------+----------+----------+

634	   Table 5: Average number of transmissions for different strategies in
635	     an example ECDHE_ECDSA handshake with Raw Public Key Certificate
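
   A simulation of this kind can be sketched in a few lines of Python.
   The sketch below is not the simulator used to produce Table 5 and
   will not reproduce its numbers: it models a single flight of n
   fragments, counts every packet sent in either direction, and assumes
   that timers never fire spuriously.  All function names are
   hypothetical.

```python
import random

def delivered(rng, loss):
    """Return True if a packet survives a channel with the given loss rate."""
    return rng.random() >= loss

def bulk_no_ack(n, loss, rng):
    """Resend the whole flight until the peer's reply gets through."""
    sent, have = 0, set()
    while True:
        for i in range(n):
            sent += 1
            if delivered(rng, loss):
                have.add(i)
        if len(have) == n:
            sent += 1  # reply: first fragment of the peer's next flight
            if delivered(rng, loss):
                return sent

def stop_and_wait(n, loss, rng):
    """Send one fragment at a time; each needs its own acknowledgement."""
    sent = 0
    for _ in range(n):
        while True:
            sent += 1  # the fragment
            if not delivered(rng, loss):
                continue
            sent += 1  # the acknowledgement
            if delivered(rng, loss):
                break
    return sent

def bulk_cumulative_ack(n, loss, rng):
    """Resend only unacknowledged fragments; one cumulative ack per round."""
    sent, have, acked = 0, set(), set()
    while len(acked) < n:
        got_any = False
        for i in range(n):
            if i in acked:
                continue
            sent += 1
            if delivered(rng, loss):
                have.add(i)
                got_any = True
        if got_any:
            sent += 1  # cumulative acknowledgement listing everything received
            if delivered(rng, loss):
                acked = set(have)
    return sent

def average(strategy, n, loss, runs=1000, seed=1):
    """Average number of transmissions over several simulated flights."""
    if not 0.0 <= loss < 1.0:
        raise ValueError("loss rate must be in [0, 1)")
    rng = random.Random(seed)
    return sum(strategy(n, loss, rng) for _ in range(runs)) / runs
```

   For example, average(stop_and_wait, 4, 0.2) estimates the cost of
   delivering a four-fragment flight at a 20 % loss rate.  Without loss,
   the model yields n + 1, 2n and n + 1 transmissions for the three
   strategies, respectively.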

637	                               Sender   Recipient
638	                                 |          |
639	                     Fragment 0  +--------->|
640	                     Fragment 1  +-----X    |
641	                     Fragment 2  +-----X    |
642	                     Fragment 3  +--------->|
643	                                 |          |
644	                     Fragment 0  +-----X    |
645	                     Fragment 1  +--------->|
646	                     Fragment 2  +--------->|
647	                     Fragment 3  +--------->|
648	                                 |    X-----+  Fragment 0
649	                                 |          |
650	                     Fragment 0  +--------->|
651	                     Fragment 1  +-----X    |
652	                     Fragment 2  +--------->|
653	                     Fragment 3  +-----X    |
654	                                 |<---------+  Fragment 0
655	                                 |          |

657	        Figure 2: Bulk transmission without acknowledgements (DTLS)

658	                               Sender   Recipient
659	                                 |          |
660	                     Fragment 0  +--------->|
661	                                 |<---------+  Acknowledge 0
662	                                 |          |
663	                     Fragment 1  +-----X    |
664	                                 |          |
665	                     Fragment 1  +-----X    |
666	                                 |          |
667	                     Fragment 1  +--------->|
668	                                 |<---------+  Acknowledge 1
669	                                 |          |
670	                     Fragment 2  +--------->|
671	                                 |<---------+  Acknowledge 2
672	                                 |          |
673	                     Fragment 3  +--------->|
674	                                 |    X-----+  Acknowledge 3
675	                                 |          |
676	                     Fragment 3  +--------->|
677	                                 |<---------+  Acknowledge 3
678	                                 |          |

680	     Figure 3: Stop-and-wait transmission with one acknowledgement per
681	                                 fragment

683	                               Sender   Recipient
684	                                 |          |
685	                     Fragment 0  +--------->|
686	                     Fragment 1  +-----X    |
687	                     Fragment 2  +-----X    |
688	                     Fragment 3  +--------->|
689	                                 |<---------+  Acknowledge 0, 3
690	                                 |          |
691	                     Fragment 1  +-----X    |
692	                     Fragment 2  +--------->|
693	                                 |    X-----+  Acknowledge 0, 2, 3
694	                                 |          |
695	                     Fragment 1  +--------->|
696	                     Fragment 2  +--------->|
697	                                 |    X-----+  Acknowledge 0, 1, 2, 3
698	                                 |          |
699	                     Fragment 1  +--------->|
700	                     Fragment 2  +-----X    |
701	                                 |<---------+  Acknowledge 0, 1, 2, 3
702	                                 |          |

704	      Figure 4: Bulk transmission with one acknowledgement per flight

706	4.  A Strawman for Stateless Header Compression

708	   Stateless Header Compression compresses the headers of DTLS 1.2
709	   records and handshake messages.  The compression is lossless, does
710	   not increase the record length, and is done without explicitly
711	   building any compression context state.

713	   The Finished MAC is computed as if each handshake message were sent
714	   uncompressed.

716	4.1.  Records

718	   Records are compressed by specifying the type, version, epoch,
719	   sequence_number and length fields using a variable number of bytes.
720	   A prefix is added in front of the structure to indicate the length of
721	   each field or to specify the value of the field directly.  If the
722	   value is specified directly, the field itself is elided.  The format
723	   of the prefix is as follows:

725	                       0                   1
726	                      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
727	                     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
728	                     |0| T | V |  E  |1 1 0|  S  | L |
729	                     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

731	   The fields in the prefix are defined as follows:

733	   T: Describes the type field.

735	      0 - Content Type 20 (ChangeCipherSpec)
736	      1 - 8-bit type field
737	      2 - Content Type 22 (Handshake)
738	      3 - Content Type 23 (Application Data)

740	   V: Describes the version field.

742	      0 - Version 254.255 (DTLS 1.0)
743	      1 - 16-bit version field
744	      2 - Version 254.253 (DTLS 1.2)
745	      3 - Reserved for future use

747	   E: Describes the epoch field.

749	      0 - Epoch 0
750	      1 - Epoch 1
751	      2 - Epoch 2
752	      3 - Epoch 3
753	      4 - Epoch 4
754	      5 - 8-bit epoch field
755	      6 - 16-bit epoch field
756	      7 - Implicit -- same as previous record in the datagram

758	   S: Describes the sequence_number field.

760	      0 - Sequence number 0
761	      1 - 8-bit sequence_number field
762	      2 - 16-bit sequence_number field
763	      3 - 24-bit sequence_number field
764	      4 - 32-bit sequence_number field
765	      5 - 40-bit sequence_number field
766	      6 - 48-bit sequence_number field
767	      7 - Implicit -- number of previous record in the datagram + 1

769	   L: Describes the length field.

771	      0 - Length 0
772	      1 - 8-bit length field
773	      2 - 16-bit length field
774	      3 - Implicit -- last record in the datagram
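
   The encoding above can be sketched in Python as follows.  The sketch
   assumes that the non-elided fields follow the prefix in their
   original order and in network byte order, which the text above does
   not state explicitly.  It does not use the implicit codes for E and S
   (value 7), since these need the previous record as context.  The
   function name and signature are hypothetical.

```python
def compress_record_header(ctype, version, epoch, seq, length,
                           last_in_datagram=False):
    """Return the 16-bit prefix plus the fields that could not be elided.

    version is a (major, minor) tuple such as (254, 253) for DTLS 1.2.
    """
    if not 0 <= seq < (1 << 48):
        raise ValueError("sequence number must fit in 48 bits")
    if not 0 <= epoch < (1 << 16) or not 0 <= length < (1 << 16):
        raise ValueError("epoch and length must fit in 16 bits")
    fields = bytearray()

    # T: well-known content types are encoded in the prefix itself.
    t = {20: 0, 22: 2, 23: 3}.get(ctype, 1)
    if t == 1:
        fields += ctype.to_bytes(1, "big")

    # V: DTLS 1.0 and DTLS 1.2 are encoded in the prefix itself.
    v = {(254, 255): 0, (254, 253): 2}.get(tuple(version), 1)
    if v == 1:
        fields += bytes(version)

    # E: epochs 0 to 4 are encoded directly, others use 1 or 2 bytes.
    if epoch <= 4:
        e = epoch
    elif epoch < 256:
        e = 5
        fields += epoch.to_bytes(1, "big")
    else:
        e = 6
        fields += epoch.to_bytes(2, "big")

    # S: the code equals the number of bytes used (0 for sequence number 0).
    s = (seq.bit_length() + 7) // 8
    fields += seq.to_bytes(s, "big")

    # L: implicit for the last record in a datagram.
    if last_in_datagram:
        l = 3
    elif length == 0:
        l = 0
    elif length < 256:
        l = 1
        fields += length.to_bytes(1, "big")
    else:
        l = 2
        fields += length.to_bytes(2, "big")

    # Bit layout: |0| T | V |  E  |1 1 0|  S  | L |
    prefix = (t << 13) | (v << 11) | (e << 8) | (0b110 << 5) | (s << 2) | l
    return prefix.to_bytes(2, "big") + bytes(fields)
```

   Under these assumptions, the header of an ApplicationData record in
   DTLS 1.2 with epoch 1, sequence number 5 and implicit length shrinks
   from 13 bytes to the 3 bytes 71 c7 05.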

776	4.2.  Handshake Messages

778	   Handshake messages are compressed in a similar way.  A prefix is
779	   added in front of the structure to indicate the length of each field
780	   or to specify the value of the field directly.  If the value is
781	   specified directly, the field itself is elided.  The format of the
782	   prefix is as follows:

784	                       0                   1
785	                      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
786	                     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
787	                     |0 0|   T   | L |   S   | O | C |
788	                     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

790	   The fields in the prefix are defined as follows:

792	   T: Describes the msg_type field.

794	      0 - 8-bit msg_type field
795	      1 - Handshake Type 1 (Client Hello)
796	      2 - Handshake Type 2 (Server Hello)
797	      3 - Handshake Type 3 (Hello Verify Request)
798	      4 - Reserved for future use
799	      5 - Reserved for future use
800	      6 - Reserved for future use
801	      7 - Handshake Type 11 (Certificate)
802	      8 - Handshake Type 12 (Server Key Exchange)
803	      9 - Handshake Type 13 (Certificate Request)
804	      10 - Handshake Type 14 (Server Hello Done)
805	      11 - Handshake Type 15 (Certificate Verify)
806	      12 - Handshake Type 16 (Client Key Exchange)
807	      13 - Reserved for future use
808	      14 - Reserved for future use
809	      15 - Handshake Type 20 (Finished)

811	   L: Describes the length field.

813	      0 - Implicit -- last message in the record
814	      1 - 8-bit length field
815	      2 - 16-bit length field
816	      3 - 24-bit length field

818	   S: Describes the message_seq field.

820	      0 - Message sequence number 0
821	      1 - Message sequence number 1
822	      2 - Message sequence number 2
823	      3 - Message sequence number 3
824	      4 - Message sequence number 4
825	      5 - Message sequence number 5
826	      6 - Message sequence number 6
827	      7 - Message sequence number 7
828	      8 - Message sequence number 8
829	      9 - Message sequence number 9
830	      10 - Message sequence number 10
831	      11 - Message sequence number 11
832	      12 - Message sequence number 12
833	      13 - 8-bit message_seq field
834	      14 - 16-bit message_seq field
835	      15 - Implicit -- number of previous message in the record + 1

837	   O: Describes the fragment_offset field.

839	      0 - Offset 0
840	      1 - 8-bit fragment_offset field
841	      2 - 16-bit fragment_offset field
842	      3 - 24-bit fragment_offset field

844	   C: Describes the fragment_length field.

846	      0 - Implicit -- message length minus fragment_offset
847	      1 - 8-bit fragment_length field
848	      2 - 16-bit fragment_length field
849	      3 - 24-bit fragment_length field
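
   The handshake prefix can be built in the same way.  As before, this
   Python sketch assumes that the remaining fields follow the prefix in
   their original order and in network byte order.  It leaves the
   implicit code for S (value 15) unused, and it is up to the caller to
   decide when the length may be implicit.  All names are hypothetical.

```python
# Prefix codes for the handshake types listed under T above.
_TYPE_CODES = {1: 1, 2: 2, 3: 3, 11: 7, 12: 8, 13: 9,
               14: 10, 15: 11, 16: 12, 20: 15}

def _nbytes(value):
    """Number of bytes (at least 1) needed to hold a non-negative integer."""
    return max(1, (value.bit_length() + 7) // 8)

def compress_handshake_header(msg_type, length, message_seq,
                              fragment_offset, fragment_length,
                              implicit_length=False):
    """Return the 16-bit prefix plus the fields that could not be elided."""
    for value in (length, fragment_offset, fragment_length):
        if not 0 <= value < (1 << 24):
            raise ValueError("length and offset fields must fit in 24 bits")
    if not 0 <= message_seq < (1 << 16):
        raise ValueError("message_seq must fit in 16 bits")
    fields = bytearray()

    # T: common handshake types are encoded in the prefix itself.
    t = _TYPE_CODES.get(msg_type, 0)
    if t == 0:
        fields += msg_type.to_bytes(1, "big")

    # L: code 0 is implicit, otherwise the code is the field size in bytes.
    if implicit_length:
        l = 0
    else:
        l = _nbytes(length)
        fields += length.to_bytes(l, "big")

    # S: sequence numbers 0 to 12 are encoded directly.
    if message_seq <= 12:
        s = message_seq
    elif message_seq < 256:
        s = 13
        fields += message_seq.to_bytes(1, "big")
    else:
        s = 14
        fields += message_seq.to_bytes(2, "big")

    # O: code 0 means offset 0, otherwise the code is the field size.
    o = 0 if fragment_offset == 0 else _nbytes(fragment_offset)
    fields += fragment_offset.to_bytes(o, "big")

    # C: implicit if the fragment runs to the end of the message.
    if fragment_length == length - fragment_offset:
        c = 0
    else:
        c = _nbytes(fragment_length)
        fields += fragment_length.to_bytes(c, "big")

    # Bit layout: |0 0|   T   | L |   S   | O | C |
    prefix = (t << 10) | (l << 8) | (s << 4) | (o << 2) | c
    return prefix.to_bytes(2, "big") + bytes(fields)
```

   Under these assumptions, the 12-byte header of an unfragmented
   ClientHello with length 80 and message sequence number 0 is reduced
   to the 3 bytes 05 00 50.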

851	5.  Security Considerations

853	   Beyond implementation techniques and stateless header compression,
854	   any changes to the TLS/DTLS protocol need to be performed extremely
855	   carefully.  No analysis has been done in the present version of this
856	   draft.

858	6.  IANA Considerations

860	   This draft includes no request to IANA.

862	7.  Acknowledgements

864	   Olaf Bergmann was an original author of this draft and is
865	   acknowledged for significant contribution to this document.

867	   Thanks to Angelo P. Castellani, Stefan Jucker, Shahid Raza, and Silke
868	   Schaefer for helpful comments and discussions that have shaped the
869	   document.

871	8.  References

873	8.1.  Normative References

875	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
876	              Requirement Levels", BCP 14, RFC 2119, March 1997.

878	   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
879	              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

881	   [RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
882	              Security Version 1.2", RFC 6347, January 2012.

884	8.2.  Informative References

886	   [I-D.aks-crypto-sensors]
887	              Sethi, M., Arkko, J., Keranen, A., and H. Rissanen,
888	              "Practical Considerations and Implementation Experiences
889	              in Securing Smart Object Networks",
890	              draft-aks-crypto-sensors-02 (work in progress),
891	              March 2012.

893	   [I-D.bormann-6lo-ghc]
894	              Bormann, C., "6LoWPAN Generic Compression of Headers and
895	              Header-like Payloads", draft-bormann-6lo-ghc-00 (work in
896	              progress), October 2013.

898	   [I-D.ietf-core-coap]
899	              Shelby, Z., Hartke, K., and C. Bormann, "Constrained
900	              Application Protocol (CoAP)", draft-ietf-core-coap-18
901	              (work in progress), June 2013.

903	   [I-D.ietf-dice-profile]
904	              Hartke, K. and H. Tschofenig, "A DTLS 1.2 Profile for the
905	              Internet of Things", draft-ietf-dice-profile-00 (work in
906	              progress), March 2014.

908	   [I-D.ietf-lwig-guidance]
909	              Bormann, C., "Guidance for Light-Weight Implementations of
910	              the Internet Protocol Suite", draft-ietf-lwig-guidance-03
911	              (work in progress), February 2013.

913	   [I-D.ietf-lwig-terminology]
914	              Bormann, C., Ersue, M., and A. Keranen, "Terminology for
915	              Constrained Node Networks", draft-ietf-lwig-terminology-07
916	              (work in progress), February 2014.

918	   [I-D.ietf-tls-cached-info]
919	              Santesson, S. and H. Tschofenig, "Transport Layer Security
920	              (TLS) Cached Information Extension",
921	              draft-ietf-tls-cached-info-16 (work in progress),
922	              February 2014.

924	   [I-D.ietf-tls-oob-pubkey]
925	              Wouters, P., Tschofenig, H., Gilmore, J., Weiler, S., and
926	              T. Kivinen, "Using Raw Public Keys in Transport Layer
927	              Security (TLS) and Datagram Transport Layer Security
928	              (DTLS)", draft-ietf-tls-oob-pubkey-11 (work in progress),
929	              January 2014.

931	   [I-D.mcgrew-tls-aes-ccm-ecc]
932	              McGrew, D., Bailey, D., Campagna, M., and R. Dugal, "AES-
933	              CCM ECC Cipher Suites for TLS",
934	              draft-mcgrew-tls-aes-ccm-ecc-08 (work in progress),
935	              February 2014.

937	   [I-D.moskowitz-hip-rg-dex]
938	              Moskowitz, R., "HIP Diet EXchange (DEX)",
939	              draft-moskowitz-hip-rg-dex-06 (work in progress),
940	              May 2012.

942	   [I-D.raza-dice-compressed-dtls]
943	              Raza, S., Shafagh, H., and O. Dupont, "Compression of
944	              Record and Handshake Headers for Constrained
945	              Environments", draft-raza-dice-compressed-dtls-00 (work in
946	              progress), March 2014.

948	   [IEEE.802-15-4]
949	              "Information technology - Telecommunications and
950	              information exchange between systems - Local and
951	              metropolitan area networks - Specific requirements - Part
952	              15.4: Wireless Medium Access Control (MAC) and Physical
953	              Layer (PHY) Specifications for Low-Rate Wireless Personal
954	              Area Networks (WPANs)", IEEE Standard 802.15.4,
955	              September 2006, <http://standards.ieee.org/getieee802/
956	              download/802.15.4-2006.pdf>.

958	   [RFC4302]  Kent, S., "IP Authentication Header", RFC 4302,
959	              December 2005.

961	   [RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
962	              "Transmission of IPv6 Packets over IEEE 802.15.4
963	              Networks", RFC 4944, September 2007.

965	   [RFC6256]  Eddy, W. and E. Davies, "Using Self-Delimiting Numeric
966	              Values in Protocols", RFC 6256, May 2011.

968	   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
969	              "Computing TCP's Retransmission Timer", RFC 6298,
970	              June 2011.

972	   [RFC6520]  Seggelmann, R., Tuexen, M., and M. Williams, "Transport
973	              Layer Security (TLS) and Datagram Transport Layer Security
974	              (DTLS) Heartbeat Extension", RFC 6520, February 2012.

976	   [RFC6574]  Tschofenig, H. and J. Arkko, "Report from the Smart Object
977	              Workshop", RFC 6574, April 2012.

979	   [RFC6655]  McGrew, D. and D. Bailey, "AES-CCM Cipher Suites for
980	              Transport Layer Security (TLS)", RFC 6655, July 2012.

982	   [SEC1]     Brown, D., "Standards for Efficient Cryptography 1 (SEC
983	              1): Elliptic Curve Cryptography", Version 2.0, May 2009.

985	   [USENIX01]
986	              Dean, D. and A. Stubblefield, "Using Client Puzzles to
987	              Protect TLS", 10th USENIX Security Symposium, August 2001,
988	              <http://static.usenix.org/events/sec01/full_papers/dean/
989	              dean.pdf>.

991	Appendix A.  Templates

993	   When elliptic curve cryptography is used, building and parsing the
994	   bodies of Certificate, ServerKeyExchange and ClientKeyExchange
995	   messages mainly involves the encoding and decoding of elliptic curve
996	   points.  The points are encapsulated in a mix of DTLS structures and
997	   ASN.1 sequences.  For a given elliptic curve, some parts of a message
998	   body are static, which allows using pre-composed messages instead of
999	   writing lots of memory-consuming code pertaining to DTLS and ASN.1.

1001	   This appendix provides templates for the SubjectPublicKeyInfo
1002	   structures for the named curves secp256r1, secp384r1 and secp521r1,
1003	   also known as NIST P-256, P-384 and P-521, respectively.  These
1004	   curves are the ones required in [I-D.mcgrew-tls-aes-ccm-ecc].  The
1005	   points are represented in uncompressed point format.

1007	      Note: Previous versions of the document provided templates for
1008	      ServerKeyExchange and ClientKeyExchange messages.  These templates
1009	      were not correct, as the messages are actually variable in length
1010	      depending on the sign of the encoded points.

1012	   SubjectPublicKeyInfo: secp256r1

1014	              30 59 30 13 06 07 2a 86  48 ce 3d 02 01 06 08 2a
1015	              86 48 ce 3d 03 01 07 03  42 00 04 __ __ __ __ __
1016	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1017	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1018	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1019	              __ __ __ __ __ __ __ __  __ __ __

1021	   SubjectPublicKeyInfo: secp384r1

1023	              30 76 30 10 06 07 2a 86  48 ce 3d 02 01 06 05 2b
1024	              81 04 00 22 03 62 00 04  __ __ __ __ __ __ __ __
1025	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1026	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1027	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1028	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1029	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1030	              __ __ __ __ __ __ __ __

1032	   SubjectPublicKeyInfo: secp521r1

1034	              30 81 9b 30 10 06 07 2a  86 48 ce 3d 02 01 06 05
1035	              2b 81 04 00 23 03 81 86  00 04 __ __ __ __ __ __
1036	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1037	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1038	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1039	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1040	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1041	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1042	              __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
1043	              __ __ __ __ __ __ __ __  __ __ __ __ __ __
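
   Using the templates is a matter of concatenation, as the following
   Python sketch shows.  The static prefixes are copied from the
   templates above; the function name is hypothetical.  The coordinates
   are assumed to be given as fixed-width big-endian byte strings,
   left-padded with zeros to the field size of the curve.

```python
# Static prefix (up to and including the 0x04 uncompressed-point marker)
# and the size of one coordinate in bytes, per named curve.
_TEMPLATES = {
    "secp256r1": (bytes.fromhex(
        "30 59 30 13 06 07 2a 86 48 ce 3d 02 01 06 08 2a"
        "86 48 ce 3d 03 01 07 03 42 00 04"), 32),
    "secp384r1": (bytes.fromhex(
        "30 76 30 10 06 07 2a 86 48 ce 3d 02 01 06 05 2b"
        "81 04 00 22 03 62 00 04"), 48),
    "secp521r1": (bytes.fromhex(
        "30 81 9b 30 10 06 07 2a 86 48 ce 3d 02 01 06 05"
        "2b 81 04 00 23 03 81 86 00 04"), 66),
}

def subject_public_key_info(curve: str, x: bytes, y: bytes) -> bytes:
    """Fill the SubjectPublicKeyInfo template with an uncompressed point."""
    prefix, size = _TEMPLATES[curve]
    if len(x) != size or len(y) != size:
        raise ValueError("coordinates must be exactly %d bytes" % size)
    return prefix + x + y
```

   Parsing works the same way in reverse: after checking that the
   received structure starts with the expected prefix, the two
   coordinates can be read from fixed offsets.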

1045	Author's Address

1047	   Klaus Hartke
1048	   Universitaet Bremen TZI
1049	   Postfach 330440
1050	   Bremen  D-28359
1051	   Germany

1053	   Phone: +49-421-218-63905
1054	   Email: hartke@tzi.org