idnits 2.17.1

draft-wood-linkable-identifiers-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 119: '...imple rules that SHOULD be followed by...'

     RFC 2119 keyword, line 330: '...pe changes, endpoints SHOULD use fresh...'

     RFC 2119 keyword, line 333: '... SHOULD be initiated from an endpoin...'

     RFC 2119 keyword, line 351: '...in this document SHOULD NOT be interpr...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 22, 2018) is 1985 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC2508' is defined on line 410, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-04) exists of
     draft-ietf-ntp-data-minimization-03

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-15

  == Outdated reference: A later version (-13) exists of
     draft-ietf-tls-dtls-connection-id-01

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
     RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  ** Obsolete normative reference: RFC 4941 (Obsoleted by RFC 8981)

  ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446)

  ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147)

  ** Obsolete normative reference: RFC 6824 (Obsoleted by RFC 8684)

  ** Obsolete normative reference: RFC 7232 (Obsoleted by RFC 9110)

     Summary: 8 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                            C. Wood
3    Internet-Draft                                                Apple Inc.
4    Intended status: Informational                          October 22, 2018
5    Expires: April 25, 2019

7                              Linkable Identifiers
8                       draft-wood-linkable-identifiers-00

10   Abstract

12      Rotating public identifiers is encouraged as best practice as a means
13      of protecting endpoint privacy. For example, regular MAC address
14      randomization helps mitigate device tracking across time and space.
15      Other protocols beyond those in the link layer also have public
16      identifiers or parameters that should rotate over time, in unison
17      with coupled protocol identifiers, and perhaps with application level
18      identifiers. This document surveys such privacy-related identifiers
19      exposed by common Internet protocols at various layers in a network
20      stack. It provides advice for rotating linked identifiers such that
21      privacy violations do not occur from rotating one identifier while
22      neglecting to rotate coupled identifiers.

24   Status of This Memo

26      This Internet-Draft is submitted in full conformance with the
27      provisions of BCP 78 and BCP 79.

29      Internet-Drafts are working documents of the Internet Engineering
30      Task Force (IETF). Note that other groups may also distribute
31      working documents as Internet-Drafts. The list of current Internet-
32      Drafts is at http://datatracker.ietf.org/drafts/current/.

34      Internet-Drafts are draft documents valid for a maximum of six months
35      and may be updated, replaced, or obsoleted by other documents at any
36      time. It is inappropriate to use Internet-Drafts as reference
37      material or to cite them other than as "work in progress."

39      This Internet-Draft will expire on April 25, 2019.

41   Copyright Notice

43      Copyright (c) 2018 IETF Trust and the persons identified as the
44      document authors. All rights reserved.

46      This document is subject to BCP 78 and the IETF Trust's Legal
47      Provisions Relating to IETF Documents
48      (http://trustee.ietf.org/license-info) in effect on the date of
49      publication of this document. Please review these documents
50      carefully, as they describe your rights and restrictions with respect
51      to this document. Code Components extracted from this document must
52      include Simplified BSD License text as described in Section 4.e of
53      the Trust Legal Provisions and are provided without warranty as
54      described in the Simplified BSD License.

56   Table of Contents

58      1.  Introduction
59      2.  Sticky Protocol Identifiers
60        2.1.  Internet and Link Layer
61        2.2.  Transport and Session Layer
62        2.3.  Application Layer
63      3.  Identifier Scope and Threat Model
64      4.  Limiting Linkable Identifiers
65        4.1.  Time and Path Linkability
66      5.  Timing Considerations
67      6.  IANA Considerations
68      7.  Security Considerations
69      8.  Privacy Considerations
70      9.  Acknowledgments
71      10. Normative References
72      Author's Address

74   1.  Introduction

76      [RFC6973] defines the correlation of information relevant to or
77      associated with a specific user as a significant attack on privacy.
78      Different layers of the network stack use identifiers to uniquely
79      address hosts or information flows. To mitigate the privacy concern,
80      many standards suggest randomizing or otherwise rotating such
81      identifiers on a regular basis. For example, a MAC address may be
82      used to link otherwise unrelated network packets to a single device.
83      Rotating the MAC address prevents this association at the link layer.
84      However, when multiple identifiers are simultaneously present on
85      different layers of the stack, breaking the association at any
86      individual layer might be insufficient to disassociate a host from
87      its network traffic. Linkability can also occur across protocols
88      and/or across layers. For example, TLS connections are commonly
89      preceded by DNS queries for a particular endpoint (host name), e.g.,
90      example.com. Moreover, in the TLS handshake, this same host name is
91      sent in cleartext in the Server Name Indication extension. Thus,
92      observing either the DNS query or TLS SNI reveals information about
93      the other. Similarly, while an IP address of a device may rotate, if
94      web browser cookies do not, then a website can track the various IP
95      addresses associated with a given cookie over time.

97      Huitema et al. [I-D.ietf-dnssd-privacy] say, "it is important that
98      the obfuscation of instance names is performed at the right time, and
99      that the obfuscated names change in synchrony with other identifiers,
100     such as MAC Addresses, IP Addresses or host names." Consider the
101     following example where this advice is not followed, wherein an IP
102     address is changed yet the MAC address is not.

104     +---------------+        +---------------+
105     |      ...      |  ....  |      ...      |
106     +---------------+        +---------------+
107     | IP Address A  <---//---> IP Address B  |
108     +---------------+        +---------------+
109     | MAC Address A <--------> MAC Address A |
110     +---------------+        +---------------+

112     +---------------------------------------->
113                        time

115     A network adversary may trivially link these packets based on their
116     common MAC address and continue to associate traffic with this
117     particular host based on IP address B even if the MAC address
118     eventually changes in the future. In this document, we outline
119     simple rules that SHOULD be followed by protocol implementations to
120     avoid such linkability. We then survey protocols developed inside
121     the IETF and out, and identify their sticky identifiers. Results
122     were obtained by analyzing protocol documentation and specifications,
123     and also scanning packet traces captured from protocols in practice
124     on common systems.

126  2.  Sticky Protocol Identifiers

128     In this section, we survey existing protocols developed inside and
129     out of the IETF, and identify sticky protocol identifiers for each.
130     A sticky identifier is one that persists across logically grouped
131     data exchanges between a client and server. Such identifiers may
132     originate from state generated by servers or, more commonly, from
133     client algorithms, software configuration, or device-specific fields.
134     We categorize surveyed protocols by the OSI layer at which they
135     operate. Specifically, we focus on Link, Internet, Transport,
136     Session, and Application layers. (Our taxonomy may not match
137     traditional OSI models, though we consider it sufficiently
        representative.)

139  2.1.  Internet and Link Layer

141     o  Ethernet, 802.11, and Bluetooth: MAC addresses are fixed to
142        specific devices. Unless frequently rotated, they are sticky
143        identifiers. Simply rotating the MAC address may or may not be
144        sufficient depending on other information sent at the protocol
145        layer with the (rotated) MAC address. For example, in 802.11,
146        frames have an incrementing sequence number and if the sequence
147        number is not reset in unison with a MAC address change, the
148        sequence number can be used to re-correlate randomized MAC
149        addresses.

151     o  IPv4 and IPv6: Static or infrequently rotating addresses are
152        sticky identifiers when exposed on the network. Privacy
153        Extensions for Stateless Address Autoconfiguration [RFC4941]
154        enhance IPv6 client privacy by, e.g., generating new temporary
155        addresses with randomized 64-bit interface identifiers (IIDs)
156        every day. The network-assigned /64 prefix is not rotated.

158     o  IKEv2: Initiator Security Parameter Indexes (SPIs) are used as
159        connection identifiers instead of IP addresses. They are required
160        to rotate for each new SA.

162  2.2.  Transport and Session Layer

164     o  TCP [RFC0793]: TCP source ports may be sticky if reused across
165        connections. Most operating systems allocate
166        ephemeral (short-lived) ports to each new connection. Per IANA
167        allocations, ephemeral ports range from 49152 to 65535 (2^15+2^14
168        to 2^16-1) [http://www.iana.org/assignments/port-numbers].
169        However, this does not prevent an application from re-using a port
170        across connections. Destination ports are also intentionally
171        sticky, since they identify services offered by endpoints.
172        Therefore, reusing a destination port does not lead to increased
173        linkability. Moreover, with TCP Fast Open (TFO) [RFC7413], servers
174        give clients plaintext cookies that must be re-used when resuming
175        a TCP+TFO connection. Clients do not modify these server cookies,
176        so clients that present them can be tracked.

178     o  MPTCP [RFC6824]: Connection tokens or IDs are explicitly used to
179        link MPTCP subflows between IP address pairs. These tokens are
180        only exposed during flow management operations, e.g., when
181        creating new subflows. Normal data transfer uses TCP sequence
182        numbers to bypass middlebox interference and an additional data
183        sequence number (DSN) TCP option to allow receivers to deal with
184        out-of-order subflow packet arrival. The union of packet DSNs
185        across subflows should yield a contiguous packet number sequence.

187     o  TLS [RFC5246] [RFC8446]: Prior to TLS 1.3, significant information
188        is exposed during TLS handshakes, including: session identifiers
189        (or re-used PSK identifiers in TLS 1.3), timestamps, random
190        nonces, supported ciphersuites, certificates, and extensions.
191        Many of these are common across all TLS clients - specifically,
192        ciphersuites, nonces, and timestamps. However, others may persist
193        across active sessions, including: session identifiers (in TLS 1.2
194        and earlier versions) and re-used PSK identifiers (in TLS 1.3).
195        Without rotation, these re-used identifiers are sticky.

197     o  DTLS [RFC6347]: Datagram TLS is a slightly modified variant of TLS
198        aimed to run over datagram protocols such as UDP. In addition to
199        identifiers exposed via TLS, DTLS adds cookie-based denial-of-
200        service countermeasures. Servers issue stateless cookies to
201        clients during a handshake, which must be replayed in cleartext by
202        clients to prove ownership of their IP addresses. (This is similar
203        to TFO cookies described above.) Additionally, DTLS is considering
204        support of a static connection identifier (CID)
205        [I-D.ietf-tls-dtls-connection-id], which permits client address
206        mobility. CIDs are specifically designed to not change across
207        addresses.

209     o  QUIC [I-D.ietf-quic-transport]: QUIC is another secure transport
210        protocol originally developed by Google and now being standardized
211        by the IETF. IETF-QUIC [I-D.ietf-quic-transport] uses TLS 1.3 for
212        its handshake. In addition to identifiers exposed by TLS 1.3,
213        QUIC has its own connection identifier (CID) used to permit
214        address mobility.

216  2.3.  Application Layer

218     o  HTTP [RFC2616]: While HTTP is a stateless protocol, it enables
219        applications to define state-keeping mechanisms in header fields.
220        The fields might carry the state itself or tokens pointing to
221        state kept at the endpoints. The Cookie header field [RFC6265] is
222        the de facto mechanism for web applications to uniquely identify
223        their clients by generating a token and instructing the client to
224        attach it to any future requests. The ETag header field [RFC7232]
225        enables applications to uniquely reference a resource which the
226        client may cache. Applications may return unique reference tokens
227        to distinct clients.

229     o  DNS [RFC1035]: SRV records often contain human-readable
230        information specific to particular devices, clients, or users.
231        For example, printers may advertise their services with SRV
232        records that contain a human-readable instance name. These are
233        often not rotated as services change.

235     o  NTP [RFC5905]: By default, mode 3 for NTP - client to server -
236        sends several source-specific fields in the clear to NTP servers,
237        including: timestamps, poll, and precision. These fields should
238        be left empty or randomized as per
239        [I-D.ietf-ntp-data-minimization]. Other fields that may link to
240        clients include: Stratum, Root Delay, Root Dispersion, Ref ID, Ref
241        Timestamp, Origin Timestamp, and Receive Timestamp.

243  3.  Identifier Scope and Threat Model

245     Not all packet identifiers are visible end-to-end in a client-server
246     interaction. For example, MAC addresses are only visible to those
247     with physical access to the medium - the local subnet for Ethernet
248     and proximity for Wi-Fi; we will consider both of these "on-path" for
249     the sake of this analysis. IP addresses are only visible between
250     endpoints. (In systems such as Tor, source and destination addresses
251     change at each circuit hop.) Thus, identifier linkability depends on
252     the threat model under consideration. Off-path adversaries, e.g.,
253     those without physical access to the medium, are not considered a
254     problem since they do not have access to packets in flight. On-path
255     adversaries may exist at various locations relative to an endpoint
256     (sender or receiver) on a path, e.g., in a local subnet, as an
257     intermediate router or middlebox between two endpoints, or as a TLS
258     terminating reverse proxy. In this document, we categorize these
259     three types of adversaries as follows:

261     1.  Local: An on-path adversary belonging to the same local subnet as
262         an endpoint, e.g., a switch.

264     2.  Intermediate: An on-path adversary that observes datagrams in
265         flight but does not terminate a (TCP or TLS) connection, e.g., a
266         middlebox or performance enhancing proxy (PEP).

268     3.  Terminator: An on-path adversary that terminates a connection,
269         e.g., a TLS-terminating reverse proxy. Note that there can be
270         distinct terminators for individual layers of the network stack,
271         e.g., one for TLS and another for HTTP.

273     The scope of an identifier includes all other protocols and
274     layers observable by the same adversary.

276  4.  Limiting Linkable Identifiers

278     The introductory example illustrating packet linkability using MAC
279     addresses is one of many possible ways in which an attacker may link
280     packets. As another hypothetical example, assume that IP and
281     MAC addresses were properly rotated, whereas TLS session identifiers
282     were reused over time, as shown below.

284     +---------------+        +---------------+
285     | TLS Session X <--------> TLS Session X |
286     +---------------+        +---------------+
287     |      ...      |  ....  |      ...      |
288     +---------------+        +---------------+
289     | IP Address A  <---//---> IP Address B  |
290     +---------------+        +---------------+
291     | MAC Address A <---//---> MAC Address C |
292     +---------------+        +---------------+

294     +---------------------------------------->
295                        time

297     Despite rotating all protocol identifiers beneath TLS, a static
298     session identifier makes packet linkability trivial. Thus, a strict,
299     yet safe rule for removing packet linkability is to rotate all linked
300     identifiers in unison. Unfortunately, this strategy is problematic
301     in practice. It would imply terminating active connections whenever
302     an identifier changes (otherwise, linkability remains trivial). For
303     example, if MAC addresses are rotated on a regular basis, e.g., every
304     15 minutes, then connection lifetimes would be limited to this
305     window.

307     A more sensible policy would be to restrict identifier rotation to
308     layers that are exposed to the same adversary. For example, origin
309     MAC addresses may not be visible to the destination. In this case,
310     rotating IP addresses and TLS session identifiers is not required to
311     prevent packet linkability by an adversary who does not see the
312     origin MAC address. A realistic threat model is one in which IP- to
313     TLS-layer information is exposed to the same on-path adversary.
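The policy of rotating only those identifiers that are exposed to the same adversary can be sketched in code. The following Python is a minimal illustration under stated assumptions, not a definitive implementation: the VISIBLE_TO table, the identifier names, and the identifiers_to_rotate helper are hypothetical names introduced here, and the visibility assignments are a simplified reading of the adversary categories in Section 3.

```python
from enum import Enum


class Adversary(Enum):
    """Adversary categories from Section 3."""
    LOCAL = "local"
    INTERMEDIATE = "intermediate"
    TERMINATOR = "terminator"


# Assumption: a simplified visibility table. Which adversary sees which
# identifier depends on the deployment; these entries are illustrative.
VISIBLE_TO = {
    "mac_address": {Adversary.LOCAL},
    "ip_address": {Adversary.LOCAL, Adversary.INTERMEDIATE,
                   Adversary.TERMINATOR},
    "tcp_source_port": {Adversary.LOCAL, Adversary.INTERMEDIATE,
                        Adversary.TERMINATOR},
    "tls_session_id": {Adversary.LOCAL, Adversary.INTERMEDIATE,
                       Adversary.TERMINATOR},
    "http_cookie": {Adversary.TERMINATOR},
}


def identifiers_to_rotate(changed, threat_model):
    """Return the identifiers to refresh together with `changed`.

    `threat_model` is the set of adversaries under consideration.
    Only identifiers observable by an in-scope adversary that also
    observes `changed` need to rotate with it.
    """
    observers = VISIBLE_TO[changed] & threat_model
    if not observers:
        # No adversary in the threat model sees this identifier, so
        # rotating it does not force any other rotation.
        return {changed}
    return {name for name, seen_by in VISIBLE_TO.items()
            if seen_by & observers}


if __name__ == "__main__":
    model = {Adversary.INTERMEDIATE}
    print(sorted(identifiers_to_rotate("mac_address", model)))
    print(sorted(identifiers_to_rotate("ip_address", model)))
```

Under this table, a threat model limited to intermediate adversaries lets the MAC address rotate alone, while an IP address change pulls in the TCP source port and TLS session identifier. Adding the local adversary to the threat model also pulls in the MAC address.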
314     Identifiers beneath IP are visible to local adversaries, which may
315     not be an issue, and those above TLS are visible to authenticated
316     peers.

318  4.1.  Time and Path Linkability

320     There are multiple dimensions along which identifiers may be linked:
321     (1) time, as identifiers are used and re-used by senders, and (2)
322     space, as identifiers are duplicated across multiple disjoint network
323     paths, possibly by different protocols. We refer to these dimensions
324     as time and path linkability, respectively.

326     Time linkability is arguably simpler to mitigate, since new
327     connections over time may opt to use new identifiers. For example,
328     instead of resuming a TLS session with an existing session ID, a
329     client may initiate a fresh handshake. As a simple rule, if an
330     identifier in the same scope changes, endpoints SHOULD use fresh
331     identifiers for all other protocols in that scope. This means that,
332     for identifiers visible to intermediate adversaries, new TLS sessions
333     SHOULD be initiated from an endpoint with a fresh IP address and TCP
334     source port. Note that clients behind NATs may not need to generate
335     a fresh IP address, as they enjoy some measure of anonymity by
336     design. If local adversaries were considered part of the threat
337     model, then a fresh MAC address may also be needed.

339     In contrast, path linkability is more difficult to mitigate, as doing
340     so requires using fresh identifiers for each protocol field. This may
341     not always be technically feasible. For example, DNS query names are
342     also intentionally used as the TLS SNI. Moreover, protocols such as
343     QUIC explicitly try to enable path linkability via connection-level
344     identifiers (CIDs) to support multihoming or mobile endpoints. This
345     makes path linkability impossible to mitigate. However, as multiple,
346     disjoint paths may be operated by different entities (e.g., ISPs),
347     collusion may be less common.

349  5.  Timing Considerations

351     Advice in this document SHOULD NOT be interpreted as guarantees for
352     preventing linkability. Rather, it aims to make linking identifiers
353     more difficult. It is difficult to prevent path linkability without
354     modifying protocols above the layer at which identifiers rotate. For
355     example, assuming MPTCP subflows were unlinkable across paths, shared
356     transport state controlling the rate of data transmission may be
357     sufficient to link these flows.

359  6.  IANA Considerations

361     This document has no request to IANA.

363  7.  Security Considerations

365     This document does not introduce any new security protocol.

367  8.  Privacy Considerations

369     This document describes considerations and suggestions for improving
370     privacy in the context of many IETF protocols. It does not introduce
371     any new features or protocol behavior that would adversely impact
372     privacy.

374  9.  Acknowledgments

376     The author thanks Martin Thomson and Brian Trammell for comments on
377     earlier versions of this document.

379  10.  Normative References

381     [I-D.ietf-dnssd-privacy]
382                Huitema, C. and D. Kaiser, "Privacy Extensions for DNS-
383                SD", draft-ietf-dnssd-privacy-05 (work in progress),
384                October 2018.

386     [I-D.ietf-ntp-data-minimization]
387                Franke, D. and A. Malhotra, "NTP Client Data
388                Minimization", draft-ietf-ntp-data-minimization-03 (work
389                in progress), September 2018.

391     [I-D.ietf-quic-transport]
392                Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
393                and Secure Transport", draft-ietf-quic-transport-15 (work
394                in progress), October 2018.

396     [I-D.ietf-tls-dtls-connection-id]
397                Rescorla, E., Tschofenig, H., Fossati, T., and T. Gondrom,
398                "The Datagram Transport Layer Security (DTLS) Connection
399                Identifier", draft-ietf-tls-dtls-connection-id-01 (work in
400                progress), July 2018.

402     [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
403                RFC 793, DOI 10.17487/RFC0793, September 1981.

406     [RFC1035]  Mockapetris, P., "Domain names - implementation and
407                specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
408                November 1987.

410     [RFC2508]  Casner, S. and V. Jacobson, "Compressing IP/UDP/RTP
411                Headers for Low-Speed Serial Links", RFC 2508,
412                DOI 10.17487/RFC2508, February 1999.

415     [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
416                Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
417                Transfer Protocol -- HTTP/1.1", RFC 2616,
418                DOI 10.17487/RFC2616, June 1999.

421     [RFC4941]  Narten, T., Draves, R., and S. Krishnan, "Privacy
422                Extensions for Stateless Address Autoconfiguration in
423                IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007.

426     [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
427                (TLS) Protocol Version 1.2", RFC 5246,
428                DOI 10.17487/RFC5246, August 2008.

431     [RFC5905]  Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
432                "Network Time Protocol Version 4: Protocol and Algorithms
433                Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010.

436     [RFC6265]  Barth, A., "HTTP State Management Mechanism", RFC 6265,
437                DOI 10.17487/RFC6265, April 2011.

440     [RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
441                Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347,
442                January 2012.

444     [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
445                "TCP Extensions for Multipath Operation with Multiple
446                Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013.

449     [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
450                Morris, J., Hansen, M., and R. Smith, "Privacy
451                Considerations for Internet Protocols", RFC 6973,
452                DOI 10.17487/RFC6973, July 2013.

455     [RFC7232]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
456                Protocol (HTTP/1.1): Conditional Requests", RFC 7232,
457                DOI 10.17487/RFC7232, June 2014.

460     [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
461                Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014.

464     [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
465                Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018.

468  Author's Address

469     Christopher A. Wood
470     Apple Inc.
471     One Apple Park Way
472     Cupertino, California 95014
473     United States of America

475     Email: cawood@apple.com