DICE Working Group                                             K. Hartke
Internet-Draft                                   Universitaet Bremen TZI
Intended status: Informational                             April 8, 2014
Expires: October 10, 2014


                         Practical Issues with
     Datagram Transport Layer Security in Constrained Environments
                 draft-hartke-dice-practical-issues-01

Abstract

   This document investigates practical issues around the implementation
   of Datagram Transport Layer Security (DTLS) 1.2 in constrained
   environments, and explores some ideas for an optimized version of
   DTLS 1.2 that is more friendly to constrained nodes and networks.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 10, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Background
     1.2.  Overview
     1.3.  Terminology
   2.  Potential Problems and Possible Solutions
     2.1.  Handshake Reliability and Fragmentation
     2.2.  Timer Values
     2.3.  Connection Initiation
     2.4.  Connection Closure
     2.5.  Data Size
     2.6.  Code Size
     2.7.  Application Data Fragmentation
     2.8.  Application Layer Protocol
   3.  A Comparison of Strategies for Handshake Reliability
   4.  A Strawman for Stateless Header Compression
     4.1.  Records
     4.2.  Handshake Messages
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Templates
   Author's Address

1.  Introduction

1.1.  Background

   Nodes taking part in the "Internet of Things" often have strict
   limitations regarding their computational power, memory size (both
   RAM and ROM) and power management [I-D.ietf-lwig-terminology].
   Network communication, in particular if wireless, also imposes
   constraints that need to be considered during protocol design, such
   as low bitrates, variable delays and possibly high packet loss.

   Moreover, frames at the link layer might be much smaller than the
   IPv6 minimum MTU of 1280 bytes and therefore require additional
   adaptation mechanisms such as 6LoWPAN [RFC4944] for IEEE 802.15.4
   wireless networks [IEEE.802-15-4], which in turn may exacerbate the
   limitations of the network: for instance, as high loss rates are
   anticipated by design, application protocols usually try to avoid
   fragmentation at the network layer.

   However, application protocols often delegate security mechanisms to
   transport layer security protocols.  More often than not, the
   protocol overhead from securing the communication is highly relevant
   to the overall performance of the systems.

   One protocol that has received significant attention recently for
   constrained node/network applications is Datagram Transport Layer
   Security (DTLS) [RFC6347].  DTLS is derived from and inherits some
   characteristics from TLS [RFC5246].  Although it has clearly not been
   designed with constrained devices and lossy networks in mind, it is
   thought to be usable in these environments [RFC6574].  There are
   still a few challenges when it comes to actually implementing DTLS.

1.2.  Overview

   The present document investigates practical issues around the
   implementation of DTLS 1.2 in constrained environments, and explores
   a few ideas that could lead to an optimized version of DTLS that is
   more friendly to constrained nodes and networks.

   The ideas generally fall into one of the following categories:

   Implementation guidance:  Implementation techniques for achieving
      light-weight implementations of DTLS, without affecting
      conformance to the relevant specifications or interoperability
      with other implementations.
      This includes techniques for reducing
      complexity, memory footprint, or power usage.  The result may
      eventually be incorporated into [I-D.ietf-lwig-guidance].

   Protocol profile:  Use of DTLS in a particular way, for example, by
      changing certain "MAY"s into "MUST"s or "MUST NOT"s, or by
      prescribing or precluding certain extensions and cipher suites.
      DTLS implementations ought to be usable without change if they can
      be configured accordingly.  See also [I-D.ietf-dice-profile].

   Stateless header compression:  Compression of DTLS records without
      explicitly building any compression context state.  This is done
      by using shorter forms to represent the same bits of information
      or relying on information that is already shared by the client and
      server.  Existing DTLS implementations can continue to be used if
      a thin layer is added that handles compression and decompression.

   Breaking changes:  New implementations are required that do not
      interoperate with implementations of DTLS, though there is no
      intention in this document to change the overall operation of TLS.

1.3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
   Note that this document itself is informational, but it is discussing
   normative statements.

2.  Potential Problems and Possible Solutions

2.1.  Handshake Reliability and Fragmentation

   DTLS records can be large relative to a single 6LoWPAN [RFC4944]
   payload: IEEE 802.15.4 [IEEE.802-15-4] specifies a physical layer MTU
   of only 127 bytes, which yields about 60-80 bytes of payload after
   adding MAC layer and adaptation layer headers.  Although 6LoWPAN
   supports the fragmentation of IPv6 packets into small link-layer
   frames, implementations generally try to avoid this in low-power,
   lossy networks.
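
   As a rough illustration of how tight this payload budget is, the
   following sketch counts the datagrams needed to carry one handshake
   message when it is fragmented at the DTLS layer.  It assumes the
   13-byte record header and 12-byte handshake header of DTLS 1.2
   [RFC6347], one fragment per record and one record per datagram, and
   it ignores any cipher overhead.  The function name and the 200-byte
   example message are hypothetical and not taken from this document.

```python
import math

RECORD_HEADER = 13     # DTLS 1.2 record header, in bytes
HANDSHAKE_HEADER = 12  # DTLS 1.2 handshake header incl. fragment fields


def fragments_needed(message_len, udp_limit):
    """Return (datagrams, total_bytes) for one handshake message body.

    Simplifying assumption: one fragment per record and one record per
    datagram; real stacks may pack several records into one datagram.
    """
    room = udp_limit - RECORD_HEADER - HANDSHAKE_HEADER
    if room <= 0:
        raise ValueError("UDP data size limit too small for DTLS headers")
    count = max(1, math.ceil(message_len / room))
    total = message_len + count * (RECORD_HEADER + HANDSHAKE_HEADER)
    return count, total


if __name__ == "__main__":
    # Hypothetical 200-byte message under three payload limits.
    for limit in (50, 70, 90):
        print(limit, fragments_needed(200, limit))
```

   Under these assumptions the header share grows quickly as the limit
   shrinks, because every additional fragment repeats both headers.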

   DTLS offers fragmentation at the handshake layer and hence can help
   to prevent IP fragmentation.  However, this can add a significant
   overhead on the number of datagrams and bytes transferred (see
   Table 1 below).  Packet loss also remains a big problem for
   constrained nodes: since fragments may arrive in any order, buffers
   must be large enough to hold all messages after reassembly, and
   losing a single fragment will cause all fragments of a message flight
   to be retransmitted.  Such losses are especially likely during key
   and certificate exchanges, as these will not fit within a packet
   without fragmentation in most 6LoWPANs.

   +--------------+-----------------+------------------+---------------+
   | UDP data     | Number of       | Total number of  | Proportion of |
   | size limit   | datagrams       | bytes            | header data   |
   | (bytes)      | transferred     | transferred      |               |
   +--------------+-----------------+------------------+---------------+
   | 50           | 27              | 1,182            | 55 %          |
   | 55           | 21              | 1,037            | 49 %          |
   | 60           | 20              | 1,081            | 51 %          |
   | 65           | 18              | 1,003            | 47 %          |
   | 70           | 15              | 912              | 42 %          |
   | 75           | 14              | 875              | 39 %          |
   | 80           | 13              | 874              | 39 %          |
   | 85           | 12              | 849              | 37 %          |
   | 90           | 12              | 849              | 37 %          |
   | 1,152        | 6               | 802              | 34 %          |
   +--------------+-----------------+------------------+---------------+

     Table 1: Number of datagrams and bytes transferred using different
       limits for DTLS fragmentation in an example DTLS handshake
   (TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 with Raw Public Key Certificate)

   Possible Solutions include:

   o  Perform the handshake using alternative mechanisms for reliability
      and fragmentation over UDP:

      *  Use IP fragmentation.  If no X.509 certificates are involved,
         the handshake messages of one flight typically require less
         than 400 bytes combined.
         Since all messages of a flight in
         DTLS are retransmitted anyway when a single fragment is lost,
         the difference between performing the fragmentation at the DTLS
         layer and at the IP layer is probably not huge.

      *  Use DTLS fragmentation.  When compared to, for example, the
         reliability mechanism of CoAP over UDP [I-D.ietf-core-coap]
         (where the receipt of each data fragment is confirmed by one
         acknowledgement message, and an acknowledgement message may
         opportunistically piggyback data in the opposite direction),
         DTLS actually performs better for a typical DTLS handshake in
         both lossy and non-lossy network environments (cf. Section 3).

      *  Extend DTLS with acknowledgment messages that confirm the
         receipt of fragments and allow an implementation to retransmit
         only the fragments that are missing.  Section 3 explores a
         number of strategies for the reliable transmission of DTLS
         handshake messages with acknowledgements, including CoAP-style
         acknowledgements and cumulative acknowledgements.

   +--------------+-----------------+------------------+---------------+
   | UDP data     | Number of       | Total number of  | Proportion of |
   | size limit   | datagrams       | bytes            | header data   |
   | (bytes)      | transferred     | transferred      |               |
   +--------------+-----------------+------------------+---------------+
   | 50           | 15 (56 %)       | 592 (50 %)       | 10 %          |
   | 55           | 13 (62 %)       | 585 (56 %)       | 9 %           |
   | 60           | 13 (65 %)       | 621 (57 %)       | 14 %          |
   | 65           | 11 (61 %)       | 588 (59 %)       | 10 %          |
   | 70           | 11 (73 %)       | 573 (63 %)       | 7 %           |
   | 75           | 11 (79 %)       | 573 (65 %)       | 7 %           |
   | 80           | 10 (77 %)       | 567 (65 %)       | 6 %           |
   | 85           | 10 (83 %)       | 567 (67 %)       | 6 %           |
   | 90           | 10 (83 %)       | 567 (67 %)       | 6 %           |
   | 1,152        | 6 (100 %)       | 617 (77 %)       | 14 %          |
   +--------------+-----------------+------------------+---------------+

      Table 2: Number of datagrams and bytes transferred in the same
      example DTLS handshake as in Table 1 but using the strawman for
           Stateless Header Compression described in Section 4

   o  Reduce the number of bytes to be transferred, so fewer packets
      need to be transmitted that could potentially be lost:

      *  Exchange large blobs using an out-of-band mechanism.  The TLS
         Cached Information Extension [I-D.ietf-tls-cached-info], for
         example, allows the exchange of fairly static data such as the
         server certificate to be omitted if this data is already
         available.

      *  Perform a DTLS-specific kind of Stateless Header Compression,
         as explored in Section 4.  This can significantly reduce the
         number of datagrams and bytes transferred, and in particular
         also the proportion of header data within the number of bytes
         transferred (see Table 2 above).

      *  Compress DTLS headers with 6LoWPAN General Header Compression
         [I-D.bormann-6lo-ghc], or a specific DTLS format for 6LoWPAN
         Next Header Compression [I-D.raza-dice-compressed-dtls].

      *  Recover the Raw Public Key Certificate
         [I-D.ietf-tls-oob-pubkey] from the ECDSA signature in an
         ECDHE_ECDSA handshake, instead of transmitting both the public
         key and the signature.  This is described in Section 4.1.6 of
         [SEC1]:

            "This is also useful in bandwidth constrained environments,
            when transmission of public keys cannot be afforded.  Entity
            U could send a signature to entity V, who recovers QU.

            Entity V can look up the public key in some certificate or
            directory, and if it matches then the signature can be
            accepted."

      *  Mandate the use of compressed point formats for elliptic
         curves.

      *  Transmit only the low-order N bits of the 48-bit sequence
         numbers and reconstruct the (48-N) high-order bits, as is
         similarly done for extended sequence numbers in IPsec (see
         Appendix B of RFC 4302 [RFC4302]).

      *  Use self-delimiting numeric values [RFC6256] instead of fixed-
         sized fields.

      *  Use a single bit field instead of multiple type fields to
         indicate which handshake messages are present in a record.

2.2.  Timer Values

   RFC 6347 [RFC6347] leaves the choice of timer values to the
   implementation, but makes the following recommendation:

      "Implementations SHOULD use an initial timer value of 1 second
      (the minimum defined in RFC 6298 [RFC6298]) and double the value
      at each retransmission, up to no less than the RFC 6298 maximum of
      60 seconds."  [RFC6347]

   Given the time required by some algorithms when executed on
   constrained devices (see Table 3), an initial timer value of 1 second
   can easily lead to spurious retransmissions.
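
   The effect can be sketched with a few lines of Python.  The schedule
   below follows the recommendation quoted above (start at 1 second,
   double on each retransmission, cap at 60 seconds); the function
   names are hypothetical, and the model assumes a loss-free link with
   negligible latency, so every retransmission that fires while the
   peer is still computing counts as spurious.  The example processing
   times are the ECDSA 160r1 and ECDSA 163k1 execution times listed in
   Table 3.

```python
def retransmit_times(initial=1.0, maximum=60.0, count=8):
    """Yield the times (seconds after the first send) at which the
    retransmission timer fires, doubling up to the given maximum."""
    timeout, elapsed = initial, 0.0
    for _ in range(count):
        elapsed += timeout
        yield elapsed
        timeout = min(timeout * 2, maximum)


def spurious_retransmissions(processing_time, initial=1.0):
    """Count timer expiries that occur before the peer has finished
    processing (assumes no loss and negligible network latency)."""
    return sum(1 for t in retransmit_times(initial, count=32)
               if t < processing_time)


if __name__ == "__main__":
    # Execution times from Table 3: Relic, TinyECC, Wiselib.
    for seconds in (0.3, 2.3, 20.2):
        print(seconds, spurious_retransmissions(seconds))
```

   In this simplified model, raising the initial timer value above the
   expected processing time removes the spurious retransmissions, at the
   cost of slower recovery when a packet is actually lost.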

   +-------------+--------------+-----------+------------+-------------+
   | Algorithm   | Library      | Memory    | Execution  | Comparable  |
   |             |              | footprint | time       | RSA key     |
   |             |              | (bytes)   | (seconds)  | length      |
   +-------------+--------------+-----------+------------+-------------+
   | RSA 1024    | AvrCryptolib | 640       | 199.7      |             |
   | RSA 2048    | AvrCryptolib | 1,280     | 1,587.6    |             |
   | ECDSA 160r1 | TinyECC      | 892       | 2.3        | 1024        |
   | ECDSA 192r1 | TinyECC      | 1,008     | 3.6        | 1536        |
   | ECDSA 160r1 | Wiselib      | 842       | 20.2       | 1024        |
   | ECDSA 192r1 | Wiselib      | 952       | 34.6       | 1536        |
   | ECDSA 163k1 | Relic        | 2,804     | 0.3        | 1024        |
   | ECDSA 233k1 | Relic        | 3,675     | 1.8        | 2048        |
   +-------------+--------------+-----------+------------+-------------+

    Table 3: RSA private key operation and ECDSA signature performance
                     (from [I-D.aks-crypto-sensors])

   Possible Solutions include:

   o  Adjust the timer value to meet the conditions of constrained nodes
      and low-power, lossy networks.

   o  Add acknowledgment messages to DTLS that allow an implementation
      to confirm the receipt of a message before starting to prepare its
      response message flight; see Section 3.

2.3.  Connection Initiation

   Nodes with very constrained main memory also suffer from the
   complexity of the DTLS handshake protocol.  We envision that the
   acceptance of DTLS as a security protocol for embedded devices would
   significantly increase if a less complex connection initiation
   procedure with a smaller number of handshake messages were defined.

   Compared to TLS, DTLS makes connection initiation costlier: a DTLS
   handshake has an additional roundtrip that results from the addition
   of a stateless cookie exchange.
   This exchange is designed to prevent
   certain denial-of-service attacks: consumption of excessive server
   resources caused by the transmission of a series of handshake
   initiation requests, and use of the server as an amplifier by sending
   connection initiation messages with the victim's address forged as
   the source.

   Possible Solutions include:

   o  Create the DTLS connection before it is needed, so it doesn't take
      a long time to set it up when it's actually needed.  This works if
      a server has to deal with a relatively small overall number of
      clients that wish to interact with the server.  Care must be taken
      such that not all clients perform their handshake at the same
      time, as a handshake requires considerably more memory than
      keeping a connection open.  (See also Section 2.4 below.)

   o  Shorten the handshake to four flights.  This may be possible
      without losing the denial-of-service roundtrip if the cipher suite
      permits that the server remains stateless after sending the
      ServerHello and if the flight fits in one datagram (see Figure 1).

   o  As an alternative, client puzzles could be used as a mechanism for
      mitigating denial-of-service attacks, resulting in a four-flight
      exchange similar to the one in HIP DEX [I-D.moskowitz-hip-rg-dex].
      The application of client puzzles to TLS has been demonstrated
      [USENIX01].  However, a puzzle would be needed that ideally takes
      less effort for a constrained device and more effort for an
      unconstrained device.

   Client                                          Server
   ------                                          ------

   ClientHello             -------->                           Flight 1

                                       HelloVerifyRequest \
                                              ServerHello  Flight 2
                           <--------      ServerHelloDone /
                                       (remain stateless)

   ClientHello                                            \
   "ServerHello"                                           \
   ClientKeyExchange                                        Flight 3
   [ChangeCipherSpec]                                      /
   Finished                -------->                      /

                                       [ChangeCipherSpec] \ Flight 4
                           <--------             Finished /

   Figure 1: Artist's impression of a four-flight DTLS handshake with a
                              Pre-Shared Key

2.4.  Connection Closure

   Although a connection needs considerably less memory after a
   handshake has finished, it still requires, for example, around 80
   bytes with AES-128-CCM [RFC6655] for the keys, sequence numbers and
   anti-replay window.  More memory is needed if session resumption is
   supported, to remember the 48-byte master secret and negotiated
   connection parameters.  This limits how many connections a
   constrained device can maintain at a given time.  Often, constrained
   devices will have a fixed number of "slots" for connections rather
   than allocating memory dynamically for each connection.

   DTLS provides a facility for secure connection closure.  When a valid
   closure alert is received, an implementation can be assured that no
   further data will be received on that connection.  It is noteworthy,
   though, that the closure alert is not a handshake message and thus is
   not retransmitted when packet loss occurs.

   Possible Solutions include:

   o  Maintain the session for as long as possible.  When the server
      runs out of resources, it can close connections, e.g., using a
      Least Frequently Used (LFU) eviction policy.  The client simply
      assumes that the connection is active until the server rejects its
      application data, in which case the client initiates a new
      connection.

   o  Use the DTLS Heartbeat Extension [RFC6520] to figure out from time
      to time if the connection is still active.

2.5.  Data Size

   As fragmented handshake messages can arrive at a constrained node in
   any order, the receiver must provide a message buffer that is large
   enough to hold multiple fragments.  When several handshake messages
   forming a single flight are sent out in parallel, it is likely that
   the receiver's resources are too limited to order fragments from
   distinct handshake messages.
   Avoiding this might require additional
   resources on the server side to ensure serialization of a flight's
   messages.

   Furthermore, since handshake messages can be fragmented arbitrarily
   and with overlaps, the receiver must, in addition to the message
   buffer, keep track of the fragments received so far.  This also makes
   the computation of the Finished MAC difficult, which is computed as
   if each handshake message had been sent as a single fragment.

   Application-level retransmissions require even more buffer space as
   replay-protection requires encryption of every single packet that is
   to be transmitted.  In particular, this renders destructive in-place
   encryption impossible as the source data must be preserved.

   Possible Solutions include:

   o  Use the same sequence number when retransmitting application data,
      so the plaintext can be encrypted in-place without the need for a
      second buffer.  Note: The security implications of this change
      need to be carefully analyzed.

   o  Extend the exchange of handshake messages with acknowledgments
      that allow a receiver to confirm the receipt of fragments, and let
      the sender wait for the acknowledgment before it sends the next
      part of the flight; see also Section 3.

   o  Mandate non-overlapping handshake message fragments.

   o  Favour cryptographic algorithms that use less memory, possibly
      resulting in a slower performance.

2.6.  Code Size

   Although probably not as severe as data size limits, the code size of
   a DTLS implementation also can play a role, in particular for
   constrained devices at the lower bound of Class 1 devices.

   Possible Solutions include:

   o  Use pre-composed messages instead of writing code for encoding or
      decoding ASN.1 structures, as shown for example in Appendix A.

   o  Avoid static tables for cryptographic functions where possible, as
      typical embedded platforms are more restricted in RAM than in non-
      volatile memory such as flash ROM.  Instead, their procedural
      equivalent is to be used, although less efficient during run-time.

2.7.  Application Data Fragmentation

   Messages larger than an IP fragment result in undesired packet
   fragmentation.  DTLS does not support fragmentation of application
   data.  If an implementation of an application layer protocol such as
   CoAP [I-D.ietf-core-coap] wants to avoid IP fragmentation, it must
   fit the application data (e.g., a CoAP message) and all headers in a
   single IP packet.

   DTLS has a per-record overhead of 13 bytes for the record header.
   AEAD ciphers such as AES-CCM [RFC6655] take up additional space to
   carry the explicit nonce and the authentication tag.  Thus, cipher
   suites like TLS_PSK_WITH_AES_128_CCM_8 or
   TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 require 16 additional bytes,
   leading to an overall overhead of 29 bytes for the header of each
   encrypted DTLS packet.  With packet sizes of 60-80 bytes, this takes
   a considerable portion of the available packet size away (see Table 4
   below).

   +------------------+------------------------+-----------------------+
   | UDP data size    | Number of bytes left   | ... with Stateless    |
   | limit (bytes)    | for application data   | Header Compression    |
   +------------------+------------------------+-----------------------+
   | 50               | 21 (42 %)              | 39 (78 %)             |
   | 55               | 26 (47 %)              | 44 (80 %)             |
   | 60               | 31 (52 %)              | 49 (82 %)             |
   | 65               | 36 (55 %)              | 54 (83 %)             |
   | 70               | 41 (59 %)              | 59 (84 %)             |
   | 75               | 46 (61 %)              | 64 (85 %)             |
   | 80               | 51 (64 %)              | 69 (86 %)             |
   | 85               | 56 (66 %)              | 74 (87 %)             |
   | 90               | 61 (68 %)              | 79 (88 %)             |
   | 1,152            | 1,123 (97 %)           | 1,141 (99 %)          |
   +------------------+------------------------+-----------------------+

    Table 4: Number of bytes left for data in an ApplicationData record
     using DTLS and DTLS with Stateless Header Compression (Section 4)

   Possible Solutions include:

   o  Elide the GenericAEADCipher.nonce_explicit field when AES-CCM is
      used.  The GenericAEADCipher.nonce_explicit field is set to the
      16-bit epoch concatenated with the 48-bit sequence number, which
      means that the epoch and sequence number are unnecessarily
      included twice in each record.

   o  Elide the DTLS version field where it is implicitly clear.  Since
      the DTLS version is negotiated in the handshake, there should not
      be a need to specify the DTLS version in each and every record.

   o  Elide the length field of the last record in a datagram.  DTLS
      records specify their length, so multiple records can be
      transmitted in a single datagram.  When DTLS is used with UDP
      (which preserves the boundaries of all messages sent), the length
      field of the last record in a datagram can be calculated from the
      UDP payload length.

   For example, when using the Stateless Header Compression presented in
   Section 4 and eliminating the redundant epoch and sequence number
   information, the number of bytes left in an ApplicationData record
   for application data can be significantly increased (see Table 4).

2.8.  Application Layer Protocol

   When DTLS is used to secure a non-trivial application layer protocol,
   there is potential for synergies that can arise from optimizing the
   stack of both protocols.

   For example, an implementation of CoAP [I-D.ietf-core-coap] with DTLS
   security will need to implement both the reliability mechanism for
   the DTLS handshake and the reliability mechanism of CoAP.  This not
   only increases code size, but also prevents efficient retransmissions
   as each CoAP retransmission of the same data is a new transmission in
   DTLS.

   Possible Solutions include:

   o  Make DTLS reliability and fragmentation available to applications.

   Accordingly, the application should take advantage of DTLS record
   information where possible.  For example, since DTLS sequence numbers
   uniquely identify a message in a connection, the 6-byte sequence
   number could be used in CoAP to correlate CoAP acknowledgements with
   CoAP messages (Message ID, 2 bytes), to correlate CoAP responses with
   CoAP requests (Token, 0-8 bytes), to provide an order among CoAP
   notifications (3 bytes), and to enable message deduplication.

3.  A Comparison of Strategies for Handshake Reliability

   A DTLS handshake consists of multiple messages that are fragmented
   and grouped in so-called "flights".  As the previous sections have
   shown, the strategy employed by DTLS to transmit these flights can
   lead to circumstances that are acceptable for existing uses of DTLS
   but pose a challenge in constrained environments:

   o  The loss of a single packet causes the whole flight of fragments
      to be retransmitted, and not just the fragments that were lost.

   o  Long processing times can lead to spurious retransmissions.

   o  The possibility of arbitrarily reordered fragments requires the
      recipient to maintain potentially large buffers.

   This section compares the following strategies for reliability:

   Bulk without acknowledgements (illustrated in Figure 2 below):
      All fragments are retransmitted in exponentially increasing
      intervals until the first fragment of the next flight from the
      other side is received.  This is the reliability mechanism used in
      DTLS 1.2 [RFC6347].

   Stop-and-wait with one acknowledgement per fragment (Figure 3):
      Each fragment is retransmitted individually until a matching
      acknowledgement for the fragment is received.  Only one fragment
      is transmitted at a time, and each acknowledgement message
      confirms the receipt of one fragment.  This is the reliability
      mechanism used in CoAP [I-D.ietf-core-coap].

   Bulk with one cumulative acknowledgement per flight (Figure 4):
      Unacknowledged fragments of the flight are transmitted using a
      sliding window until all fragments have been acknowledged.
      Acknowledgements specify all fragments that have been received so
      far (highest sequence number seen + a bit field).

   Table 5 shows the average number of transmissions needed for these
   three strategies to successfully complete an example DTLS handshake.
   (Every DTLS handshake is eventually successful if no side gives up
   after a number of retransmission attempts.)

   The results were obtained using a very simple network simulator that
   randomly drops packets according to the given loss rate, but provides
   ideal conditions otherwise.  To avoid spurious retransmissions, timer
   values were selected larger than the processing times for flights;
   this may be impractical if sensible retransmission intervals and
   processing times differ by orders of magnitude.
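
   A simulator of this kind can be sketched in a few lines of Python.
   The sketch below is not the simulator used to produce the results in
   this section and does not attempt to reproduce them: it models a
   single flight of n fragments, assumes independent random loss for
   every fragment and acknowledgement, ignores timing entirely, and, for
   the bulk strategy without acknowledgements, ignores the loss of the
   next flight that serves as the implicit acknowledgement.  All
   function names are hypothetical, and the loss rate must be below 1.

```python
import random


def bulk_no_ack(n, loss, rng):
    """Resend the whole flight until every fragment has arrived once."""
    received, sent = set(), 0
    while len(received) < n:
        for frag in range(n):
            sent += 1
            if rng.random() >= loss:
                received.add(frag)
    return sent


def stop_and_wait(n, loss, rng):
    """Send one fragment at a time; each needs its own acknowledgement."""
    sent = 0
    for _ in range(n):
        while True:
            sent += 1                  # fragment transmission
            if rng.random() < loss:
                continue               # fragment lost, sender times out
            sent += 1                  # acknowledgement transmission
            if rng.random() >= loss:
                break                  # acknowledgement arrived
    return sent


def bulk_cumulative_ack(n, loss, rng):
    """Resend only fragments not yet covered by a cumulative ack."""
    pending, received, sent = set(range(n)), set(), 0
    while pending:
        arrived = False
        for frag in sorted(pending):
            sent += 1
            if rng.random() >= loss:
                received.add(frag)
                arrived = True
        if arrived:
            sent += 1                  # one cumulative acknowledgement
            if rng.random() >= loss:
                pending -= received
    return sent


def average(strategy, n, loss, runs=2000, seed=1):
    """Average number of transmissions over a number of random runs."""
    rng = random.Random(seed)
    return sum(strategy(n, loss, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    for loss in (0.0, 0.2, 0.5):
        print(loss, [round(average(s, 4, loss), 1)
                     for s in (bulk_no_ack, stop_and_wait,
                               bulk_cumulative_ack)])
```

   Even this simplified model shows the basic trade-off: per-fragment
   acknowledgements double the transmission count on a loss-free link,
   while a cumulative acknowledgement adds only one transmission per
   round and avoids resending fragments that have already arrived.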
611 +-----------+----------+----------+----------+ 612 | Loss rate | Figure 2 | Figure 3 | Figure 4 | 613 +-----------+----------+----------+----------+ 614 | 0 % | 18.0 | 36.0 | 19.0 | 615 | 5 % | 22.2 | 39.7 | 20.5 | 616 | 10 % | 25.9 | 41.8 | 23.8 | 617 | 15 % | 27.6 | 44.7 | 25.1 | 618 | 20 % | 33.3 | 51.6 | 27.1 | 619 | 25 % | 40.0 | 57.2 | 33.3 | 620 | 30 % | 39.2 | 64.0 | 37.4 | 621 | 35 % | 45.6 | 66.4 | 44.0 | 622 | 40 % | 55.4 | 74.7 | 46.2 | 623 | 45 % | 54.4 | 90.0 | 47.9 | 624 | 50 % | 67.2 | 102.2 | 57.2 | 625 | 55 % | 76.8 | 124.3 | 62.3 | 626 | 60 % | 96.9 | 151.3 | 74.4 | 627 | 65 % | 109.4 | 170.5 | 86.4 | 628 | 70 % | 115.8 | 248.2 | 106.8 | 629 | 75 % | 159.1 | 348.5 | 141.5 | 630 | 80 % | 199.6 | 528.6 | 169.9 | 631 | 85 % | 343.4 | 804.4 | 278.0 | 632 +-----------+----------+----------+----------+ 634 Table 5: Average number of transmissions for different strategies in 635 an example ECDHE_ECDSA handshake with Raw Public Key Certificate 637 Sender Recipient 638 | | 639 Fragment 0 +--------->| 640 Fragment 1 +-----X | 641 Fragment 2 +-----X | 642 Fragment 3 +--------->| 643 | | 644 Fragment 0 +-----X | 645 Fragment 1 +--------->| 646 Fragment 2 +--------->| 647 Fragment 3 +--------->| 648 | X-----+ Fragment 0 649 | | 650 Fragment 0 +--------->| 651 Fragment 1 +-----X | 652 Fragment 2 +--------->| 653 Fragment 3 +-----X | 654 |<---------+ Fragment 0 655 | | 657 Figure 2: Bulk transmission without acknowledgements (DTLS) 658 Sender Recipient 659 | | 660 Fragment 0 +--------->| 661 |<---------+ Acknowledge 0 662 | | 663 Fragment 1 +-----X | 664 | | 665 Fragment 1 +-----X | 666 | | 667 Fragment 1 +--------->| 668 |<---------+ Acknowledge 1 669 | | 670 Fragment 2 +--------->| 671 |<---------+ Acknowledge 2 672 | | 673 Fragment 3 +--------->| 674 | X-----+ Acknowledge 3 675 | | 676 Fragment 3 +--------->| 677 |<---------+ Acknowledge 3 678 | | 680 Figure 3: Stop-and-wait transmission with one acknowledgement per 681 fragment 683 Sender Recipient 684 
                       |          |
            Fragment 0 +--------->|
            Fragment 1 +-----X    |
            Fragment 2 +-----X    |
            Fragment 3 +--------->|
                       |<---------+ Acknowledge 0, 3
                       |          |
            Fragment 1 +-----X    |
            Fragment 2 +--------->|
                       |    X-----+ Acknowledge 0, 2, 3
                       |          |
            Fragment 1 +--------->|
            Fragment 2 +--------->|
                       |    X-----+ Acknowledge 0, 1, 2, 3
                       |          |
            Fragment 1 +--------->|
            Fragment 2 +-----X    |
                       |<---------+ Acknowledge 0, 1, 2, 3
                       |          |

      Figure 4: Bulk transmission with one acknowledgement per flight

4.  A Strawman for Stateless Header Compression

   Stateless Header Compression compresses the headers of DTLS 1.2
   records and handshake messages.  The compression is lossless, does
   not increase the record length and is done without explicitly
   building any compression context state.

   The Finished MAC is computed as if each handshake message was sent
   uncompressed.

4.1.  Records

   Records are compressed by specifying the type, version, epoch,
   sequence_number and length fields using a variable number of bytes.
   A prefix is added in front of the structure to indicate the length of
   each field or to specify the value of the field directly.  If the
   value is specified directly, the field itself is elided.  The format
   of the prefix is as follows:

       0                   1
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0| T | V |  E  |1 1 0|  S  | L |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The fields in the prefix are defined as follows:

   T: Describes the type field.

         0 - Content Type 20 (ChangeCipherSpec)
         1 - 8-bit type field
         2 - Content Type 22 (Handshake)
         3 - Content Type 23 (Application Data)

   V: Describes the version field.

         0 - Version 254.255 (DTLS 1.0)
         1 - 16-bit version field
         2 - Version 254.253 (DTLS 1.2)
         3 - Reserved for future use

   E: Describes the epoch field.

         0 - Epoch 0
         1 - Epoch 1
         2 - Epoch 2
         3 - Epoch 3
         4 - Epoch 4
         5 - 8-bit epoch field
         6 - 16-bit epoch field
         7 - Implicit -- same as previous record in the datagram

   S: Describes the sequence_number field.

         0 - Sequence number 0
         1 - 8-bit sequence_number field
         2 - 16-bit sequence_number field
         3 - 24-bit sequence_number field
         4 - 32-bit sequence_number field
         5 - 40-bit sequence_number field
         6 - 48-bit sequence_number field
         7 - Implicit -- number of previous record in the datagram + 1

   L: Describes the length field.

         0 - Length 0
         1 - 8-bit length field
         2 - 16-bit length field
         3 - Implicit -- last record in the datagram

4.2.  Handshake Messages

   Handshake messages are compressed in a similar way.  A prefix is
   added in front of the structure to indicate the length of each field
   or to specify the value of the field directly.  If the value is
   specified directly, the field itself is elided.  The format of the
   prefix is as follows:

       0                   1
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0 0|   T   | L |   S   | O | C |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The fields in the prefix are defined as follows:

   T: Describes the msg_type field.

          0 - 8-bit msg_type field
          1 - Handshake Type 1 (Client Hello)
          2 - Handshake Type 2 (Server Hello)
          3 - Handshake Type 3 (Hello Verify Request)
          4 - Reserved for future use
          5 - Reserved for future use
          6 - Reserved for future use
          7 - Handshake Type 11 (Certificate)
          8 - Handshake Type 12 (Server Key Exchange)
          9 - Handshake Type 13 (Certificate Request)
         10 - Handshake Type 14 (Server Hello Done)
         11 - Handshake Type 15 (Certificate Verify)
         12 - Handshake Type 16 (Client Key Exchange)
         13 - Reserved for future use
         14 - Reserved for future use
         15 - Handshake Type 20 (Finished)

   L: Describes the length field.

         0 - Implicit -- last message in the record
         1 - 8-bit length field
         2 - 16-bit length field
         3 - 24-bit length field

   S: Describes the message_seq field.

          0 - Message sequence number 0
          1 - Message sequence number 1
          2 - Message sequence number 2
          3 - Message sequence number 3
          4 - Message sequence number 4
          5 - Message sequence number 5
          6 - Message sequence number 6
          7 - Message sequence number 7
          8 - Message sequence number 8
          9 - Message sequence number 9
         10 - Message sequence number 10
         11 - Message sequence number 11
         12 - Message sequence number 12
         13 - 8-bit message_seq field
         14 - 16-bit message_seq field
         15 - Implicit -- number of previous message in the record + 1

   O: Describes the fragment_offset field.

         0 - Offset 0
         1 - 8-bit fragment_offset field
         2 - 16-bit fragment_offset field
         3 - 24-bit fragment_offset field

   C: Describes the fragment_length field.

         0 - Implicit -- message length minus fragment_offset
         1 - 8-bit fragment_length field
         2 - 16-bit fragment_length field
         3 - 24-bit fragment_length field

5.  Security Considerations

   Beyond implementation techniques and stateless header compression,
   any changes to the TLS/DTLS protocol need to be performed extremely
   carefully.  No analysis has been done in the present version of this
   draft.

6.  IANA Considerations

   This draft includes no request to IANA.

7.  Acknowledgements

   Olaf Bergmann was an original author of this draft and is
   acknowledged for significant contribution to this document.

   Thanks to Angelo P. Castellani, Stefan Jucker, Shahid Raza, and Silke
   Schaefer for helpful comments and discussions that have shaped the
   document.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5246]  Dierks, T. and E.
              Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

   [RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
              Security Version 1.2", RFC 6347, January 2012.

8.2.  Informative References

   [I-D.aks-crypto-sensors]
              Sethi, M., Arkko, J., Keranen, A., and H. Rissanen,
              "Practical Considerations and Implementation Experiences
              in Securing Smart Object Networks",
              draft-aks-crypto-sensors-02 (work in progress),
              March 2012.

   [I-D.bormann-6lo-ghc]
              Bormann, C., "6LoWPAN Generic Compression of Headers and
              Header-like Payloads", draft-bormann-6lo-ghc-00 (work in
              progress), October 2013.

   [I-D.ietf-core-coap]
              Shelby, Z., Hartke, K., and C. Bormann, "Constrained
              Application Protocol (CoAP)", draft-ietf-core-coap-18
              (work in progress), June 2013.

   [I-D.ietf-dice-profile]
              Hartke, K. and H. Tschofenig, "A DTLS 1.2 Profile for the
              Internet of Things", draft-ietf-dice-profile-00 (work in
              progress), March 2014.

   [I-D.ietf-lwig-guidance]
              Bormann, C., "Guidance for Light-Weight Implementations of
              the Internet Protocol Suite", draft-ietf-lwig-guidance-03
              (work in progress), February 2013.

   [I-D.ietf-lwig-terminology]
              Bormann, C., Ersue, M., and A. Keranen, "Terminology for
              Constrained Node Networks", draft-ietf-lwig-terminology-07
              (work in progress), February 2014.

   [I-D.ietf-tls-cached-info]
              Santesson, S. and H. Tschofenig, "Transport Layer Security
              (TLS) Cached Information Extension",
              draft-ietf-tls-cached-info-16 (work in progress),
              February 2014.

   [I-D.ietf-tls-oob-pubkey]
              Wouters, P., Tschofenig, H., Gilmore, J., Weiler, S., and
              T. Kivinen, "Using Raw Public Keys in Transport Layer
              Security (TLS) and Datagram Transport Layer Security
              (DTLS)", draft-ietf-tls-oob-pubkey-11 (work in progress),
              January 2014.

   [I-D.mcgrew-tls-aes-ccm-ecc]
              McGrew, D., Bailey, D., Campagna, M., and R.
              Dugal, "AES-CCM ECC Cipher Suites for TLS",
              draft-mcgrew-tls-aes-ccm-ecc-08 (work in progress),
              February 2014.

   [I-D.moskowitz-hip-rg-dex]
              Moskowitz, R., "HIP Diet EXchange (DEX)",
              draft-moskowitz-hip-rg-dex-06 (work in progress),
              May 2012.

   [I-D.raza-dice-compressed-dtls]
              Raza, S., Shafagh, H., and O. Dupont, "Compression of
              Record and Handshake Headers for Constrained
              Environments", draft-raza-dice-compressed-dtls-00 (work in
              progress), March 2014.

   [IEEE.802-15-4]
              "Information technology - Telecommunications and
              information exchange between systems - Local and
              metropolitan area networks - Specific requirements - Part
              15.4: Wireless Medium Access Control (MAC) and Physical
              Layer (PHY) Specifications for Low-Rate Wireless Personal
              Area Networks (WPANs)", IEEE Standard 802.15.4,
              September 2006.

   [RFC4302]  Kent, S., "IP Authentication Header", RFC 4302,
              December 2005.

   [RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
              "Transmission of IPv6 Packets over IEEE 802.15.4
              Networks", RFC 4944, September 2007.

   [RFC6256]  Eddy, W. and E. Davies, "Using Self-Delimiting Numeric
              Values in Protocols", RFC 6256, May 2011.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              June 2011.

   [RFC6520]  Seggelmann, R., Tuexen, M., and M. Williams, "Transport
              Layer Security (TLS) and Datagram Transport Layer Security
              (DTLS) Heartbeat Extension", RFC 6520, February 2012.

   [RFC6574]  Tschofenig, H. and J. Arkko, "Report from the Smart Object
              Workshop", RFC 6574, April 2012.

   [RFC6655]  McGrew, D. and D. Bailey, "AES-CCM Cipher Suites for
              Transport Layer Security (TLS)", RFC 6655, July 2012.

   [SEC1]     Brown, D., "Standards for Efficient Cryptography 1 (SEC
              1): Elliptic Curve Cryptography", Version 2.0, May 2009.

   [USENIX01]
              Dean, D. and A.
              Stubblefield, "Using Client Puzzles to
              Protect TLS", 10th USENIX Security Symposium, August 2001.

Appendix A.  Templates

   When elliptic curve cryptography is used, building and parsing the
   bodies of Certificate, ServerKeyExchange and ClientKeyExchange
   messages mainly involves the encoding and decoding of elliptic curve
   points.  The points are encapsulated in a mix of DTLS structures and
   ASN.1 sequences.  For a given elliptic curve, some parts of a message
   body are static, which allows using pre-composed messages instead of
   writing lots of memory-consuming code pertaining to DTLS and ASN.1.

   This appendix provides templates for the SubjectPublicKeyInfo
   structures for the named curves secp256r1, secp384r1 and secp521r1,
   also known as NIST P-256, P-384 and P-521, respectively.  These
   curves are the ones required in [I-D.mcgrew-tls-aes-ccm-ecc].  The
   points are represented in uncompressed point format.

   Note: Previous versions of the document provided templates for
   ServerKeyExchange and ClientKeyExchange messages.  These templates
   were not correct, as the messages are actually variable in length
   depending on the sign of the encoded points.

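   As a rough illustration of how the templates in this appendix might
   be used, the following Python sketch treats the static bytes of each
   template as a fixed prefix and appends the X and Y coordinates of the
   public key as fixed-length big-endian integers.  The names
   SPKI_PREFIX, COORD_LEN, build_spki and parse_spki are hypothetical
   and not part of any specification.  The sketch assumes the caller
   already holds a valid point on the named curve and performs no curve
   validation.

```python
# Static SubjectPublicKeyInfo prefixes, copied from the templates in this
# appendix (everything up to and including the 0x04 "uncompressed" marker).
# The dictionary and function names are illustrative assumptions.
SPKI_PREFIX = {
    "secp256r1": bytes.fromhex(
        "30 59 30 13 06 07 2a 86 48 ce 3d 02 01 06 08 2a"
        " 86 48 ce 3d 03 01 07 03 42 00 04"),
    "secp384r1": bytes.fromhex(
        "30 76 30 10 06 07 2a 86 48 ce 3d 02 01 06 05 2b"
        " 81 04 00 22 03 62 00 04"),
    "secp521r1": bytes.fromhex(
        "30 81 9b 30 10 06 07 2a 86 48 ce 3d 02 01 06 05"
        " 2b 81 04 00 23 03 81 86 00 04"),
}

# Size in bytes of one coordinate (the "__" placeholders hold X then Y).
COORD_LEN = {"secp256r1": 32, "secp384r1": 48, "secp521r1": 66}


def build_spki(curve, x, y):
    """Fill the template for `curve` with the coordinates x and y."""
    n = COORD_LEN[curve]
    # int.to_bytes raises OverflowError for negative or oversized values.
    return SPKI_PREFIX[curve] + x.to_bytes(n, "big") + y.to_bytes(n, "big")


def parse_spki(curve, spki):
    """Check the static prefix and return the coordinates (x, y)."""
    prefix, n = SPKI_PREFIX[curve], COORD_LEN[curve]
    if len(spki) != len(prefix) + 2 * n or not spki.startswith(prefix):
        raise ValueError("structure does not match the template")
    point = spki[len(prefix):]
    return int.from_bytes(point[:n], "big"), int.from_bytes(point[n:], "big")
```

   Because the coordinates are always padded to the full field size, the
   resulting structure has a constant length for each curve (91 bytes
   for secp256r1), which is what makes a pre-composed template possible.
   A constrained implementation would more likely keep the prefix as a
   constant byte array and compare or copy it directly.
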
   SubjectPublicKeyInfo: secp256r1

      30 59 30 13 06 07 2a 86 48 ce 3d 02 01 06 08 2a
      86 48 ce 3d 03 01 07 03 42 00 04 __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __

   SubjectPublicKeyInfo: secp384r1

      30 76 30 10 06 07 2a 86 48 ce 3d 02 01 06 05 2b
      81 04 00 22 03 62 00 04 __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __

   SubjectPublicKeyInfo: secp521r1

      30 81 9b 30 10 06 07 2a 86 48 ce 3d 02 01 06 05
      2b 81 04 00 23 03 81 86 00 04 __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
      __ __ __ __ __ __ __ __ __ __ __ __ __ __

Author's Address

   Klaus Hartke
   Universitaet Bremen TZI
   Postfach 330440
   Bremen  D-28359
   Germany

   Phone: +49-421-218-63905
   Email: hartke@tzi.org