INTERNET-DRAFT                                          Thomas Narten
                                                                  IBM
                                                       Richard Draves
                                                   Microsoft Research
                                                   September 19, 2000

   Privacy Extensions for Stateless Address Autoconfiguration in IPv6

              <draft-ietf-ipngwg-addrconf-privacy-03.txt>

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   Nodes use IPv6 stateless address autoconfiguration to generate
   addresses without the necessity of a DHCP server. Addresses are
   formed by combining network prefixes with an interface identifier.
   On an interface that contains an embedded IEEE identifier, the
   interface identifier is typically derived from it. On other
   interface types, the interface identifier is generated through other
   means, for example, via random number generation. This document
   describes an extension to IPv6 stateless address autoconfiguration
   for interfaces whose interface identifier is derived from an IEEE
   identifier. Use of the extension causes nodes to generate
   global-scope addresses from interface identifiers that change over
   time, even in cases where the interface contains an embedded IEEE
   identifier. Changing the interface identifier (and the global-scope
   addresses generated from it) over time makes it more difficult for
   eavesdroppers and other information collectors to identify when
   different addresses used in different transactions actually
   correspond to the same node.

Contents

   Status of this Memo

   1.  Introduction

   2.  Background
      2.1.  Extended Use of the Same Identifier
      2.2.  Not a New Issue
      2.3.  Possible Approaches

   3.  Protocol Description
      3.1.  Assumptions
      3.2.  Generation Of Randomized Interface Identifiers
      3.3.  Generating Anonymous Addresses
      3.4.  Expiration of Anonymous Addresses
      3.5.  Regeneration of Randomized Interface Identifiers

   4.  Implications of Changing Interface Identifiers

   5.  Defined Constants

   6.  Open Issues and Future Work

   7.  Security Considerations

   8.  Acknowledgments

   9.  References

1. Introduction

   Stateless address autoconfiguration [ADDRCONF] defines how an IPv6
   node generates addresses without the need for a DHCP server. Some
   types of network interfaces come with an embedded IEEE identifier
   (i.e., a link-layer MAC address), and in those cases stateless
   address autoconfiguration uses the IEEE identifier to generate a
   64-bit interface identifier [ADDRARCH]. By design, the interface
   identifier is globally unique when generated in this fashion. The
   interface identifier is in turn appended to a prefix to form a
   128-bit IPv6 address.

   All nodes combine interface identifiers (whether derived from an
   IEEE identifier or generated through some other technique) with the
   reserved link-local prefix to generate link-local addresses for
   their attached interfaces.
   Additional addresses, including site-local and global-scope
   addresses, are then created by combining prefixes advertised in
   Router Advertisements via Neighbor Discovery [DISCOVERY] with the
   interface identifier.

   Not all nodes and interfaces contain IEEE identifiers. In such
   cases, an interface identifier is generated through some other means
   (e.g., at random), and the resultant interface identifier is not
   globally unique and may also change over time. The focus of this
   document is on addresses derived from IEEE identifiers, as the
   concern being addressed exists only in those cases where the
   interface identifier is globally unique and non-changing. The rest
   of this document assumes that IEEE identifiers are being used, but
   the techniques described may also apply to interfaces with other
   types of globally unique and persistent identifiers.

   This document discusses concerns associated with the embedding of
   non-changing interface identifiers within IPv6 addresses and
   describes extensions to stateless address autoconfiguration that can
   help mitigate those concerns in environments where such concerns are
   significant. Section 2 provides background information on the issue.
   Section 3 describes a procedure for generating alternate interface
   identifiers and global-scope addresses. Section 4 discusses
   implications of changing interface identifiers.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [KEYWORDS].

2. Background

   This section discusses the problem in more detail, provides context
   for evaluating the significance of the concerns in specific
   environments, and makes comparisons with existing practices.

2.1. Extended Use of the Same Identifier

   The use of a non-changing interface identifier to form addresses is
   a specific instance of the more general case where a constant
   identifier is reused over an extended period of time and in multiple
   independent activities. Any time the same identifier is used in
   multiple contexts, it becomes possible for that identifier to be
   used to correlate seemingly unrelated activity. For example, a
   network sniffer placed strategically on a link that all traffic to
   and from a particular host crosses could keep track of which
   destinations a node communicated with and at what times. Such
   information can in some cases be used to infer things, such as what
   hours an employee was active, when someone is at home, etc.

   One of the requirements for correlating seemingly unrelated
   activities is the use (and reuse) of an identifier that is
   recognizable over time within different contexts. IP addresses
   provide one obvious example, but there are more. Many nodes also
   have DNS names associated with their addresses, in which case the
   DNS name serves as a similar identifier. Although the DNS name
   associated with an address is more work to obtain (it may require a
   DNS query), the information is often readily available. In such
   cases, changing the address on a machine over time would do little
   to address the concern raised in this document, as the DNS name
   would become the correlating identifier.

   The use of a constant identifier within an address is of special
   concern because addresses are a fundamental requirement of
   communication and cannot easily be hidden from eavesdroppers and
   other parties. Even when higher layers encrypt their payloads,
   addresses in packet headers appear in the clear.
   Consequently, if a mobile host (e.g., laptop) accessed the network
   from several different locations, an eavesdropper might be able to
   track the movement of that mobile host from place to place, even if
   the upper layer payloads were encrypted [SERIALNUM].

2.2. Not a New Issue

   Although the topic of this document may at first appear to be an
   issue new to IPv6, similar issues exist in today's Internet already.
   That is, addresses used in today's Internet are often non-changing
   in practice for extended periods of time. In many sites, addresses
   are assigned statically; such addresses typically change
   infrequently. However, many sites are moving away from static
   allocation to dynamic allocation via DHCP [DHCP]. In theory, the
   address a client gets via DHCP can change over time, but in practice
   servers return the same address to the same client (unless addresses
   are in such short supply that they are reused immediately by a
   different node when they become free). Thus, although many sites use
   DHCP, clients end up using the same address for months at a time.

   Nodes that need a (non-changing) DNS name generally have static
   addresses assigned to them to simplify the configuration of DNS
   servers. Although Dynamic DNS [DDNS] can be used to update the DNS
   dynamically, it is not widely deployed today. In addition, changing
   an address but keeping the same DNS name does not really address the
   underlying concern, since the DNS name becomes a non-changing
   identifier. Servers generally require a DNS name (so clients can
   connect to them), and clients often do as well (e.g., some servers
   refuse to speak to a client whose address cannot be mapped into a
   DNS name that also maps back into the same address).

   Many network services require that the client authenticate itself to
   the server before gaining access to a resource.
   The authentication step binds the activity (e.g., TCP connection) to
   a specific entity (e.g., an end user). In such cases, a server
   already has the ability to track usage by an individual, independent
   of the address they happen to use. Indeed, such tracking is an
   important part of accounting.

   Web browsers and servers typically exchange "cookies" with each
   other [COOKIES]. Cookies allow web servers to correlate a current
   activity with a previous activity. One common usage is to send back
   targeted advertising to a user by using the cookie supplied by the
   browser to identify what earlier queries had been made (e.g., for
   what type of information). Based on the earlier queries,
   advertisements can be targeted to match the (assumed) interests of
   the end user.

   The use of non-changing interface identifiers in IPv6 has
   implications in two quite different contexts: stationary devices
   (i.e., those that generally do not move physically, such as desktop
   PCs), and mobile devices (i.e., those that move frequently,
   including laptops, cell phones, etc.).

   In today's Internet, many home users do not have permanent
   connections and indeed are assigned temporary addresses each time
   they connect to their ISP. Consequently, the addresses they use
   change frequently over time and are shared among a number of
   different users. If addresses are generated from an interface
   identifier, however, a home user's address could contain an
   interface identifier that remains the same from one dialup session
   to the next. As PPP is used today, however, PPP servers typically
   inform the client unilaterally of the address it is to use (i.e.,
   the client doesn't generate one on its own). This practice, if
   continued in IPv6, would avoid the concerns that are the focus of
   this document.

   A more interesting case concerns always-on connections (e.g., cable
   modems, ISDN, DSL, etc.)
   that result in a home site using the same address for extended
   periods of time. This is a scenario that is just starting to become
   common in IPv4 and promises to become more of a concern as always-on
   Internet connectivity becomes widely available. The technique
   described later in the document attempts to address this concern by
   changing the interface identifier portion of an address. However, it
   should be noted that in the case of always-on connections, the
   network prefix portion of an address is in effect a constant
   identifier. All nodes at (say) a home would have the same network
   prefix. This has implications for privacy, though not at the same
   granularity (i.e., all nodes within a home would be lumped together
   for the purposes of collecting information). This issue is also
   non-trivial to address, because the routing prefix part of an
   address contains topology information and cannot contain arbitrary
   values.

   Another case concerns mobile devices (e.g., laptops, PDAs, etc.)
   that move topologically within the Internet. Whenever they move (in
   the absence of technology such as mobile IP [MOBILEIP]), they form
   new addresses for their current topological point of attachment.
   This is typified today by the "road warrior" who has Internet
   connectivity both at home and at the office. While the node's
   address changes as it moves, however, the interface identifier
   contained within the address remains the same (when derived from an
   IEEE identifier). In such cases, the interface identifier could (in
   theory) be used to track the movement and usage of a particular
   machine [SERIALNUM]. For example, a server that logs usage
   information together with source addresses is also recording the
   interface identifier, since it is embedded within each address.
   Consequently, any data-mining technique that correlates activity
   based on addresses could easily be extended to do the same using the
   interface identifier. This is of particular concern with the
   expected proliferation of next-generation network-connected devices
   (e.g., PDAs, cell phones, etc.) in which large numbers of devices
   are in practice associated with individual users (i.e., not shared).
   Thus, the interface identifier embedded within an address could be
   used to track activities of an individual, even as they move
   topologically within the Internet.

2.3. Possible Approaches

   One way to avoid some of the problems discussed above is to use DHCP
   for obtaining addresses. With DHCP, the DHCP server could arrange to
   hand out addresses that change over time.

   Another approach, compatible with the stateless address
   autoconfiguration architecture, would be to change the interface
   identifier portion of an address over time and generate new
   addresses from the interface identifier for some address scopes.
   Changing the interface identifier can make it more difficult to look
   at the IP addresses in independent transactions and identify which
   ones actually correspond to the same node, both in the case where
   the routing prefix portion of an address changes and when it does
   not.

   Many machines function as both clients and servers. In such cases,
   the machine would need a DNS name for its use as a server. Whether
   the address stays fixed or changes has little privacy implication
   since the DNS name remains constant and serves as a constant
   identifier. When acting as a client (e.g., initiating
   communication), however, such a machine may want to vary the
   addresses it uses.
   In such environments, one may need multiple addresses: a "public"
   (i.e., non-secret) server address, registered in the DNS, that is
   used to accept incoming connection requests from other machines, and
   (possibly) an "anonymous" address used to shield the identity of the
   client when it initiates communication. These two cases are roughly
   analogous to telephone numbers and caller ID, where a user may list
   their telephone number in the public phone book, but disable the
   display of their number via caller ID when initiating calls.

   To make it difficult to make educated guesses as to whether two
   different interface identifiers belong to the same node, the
   algorithm for generating alternate identifiers must include input
   that has an unpredictable component from the perspective of the
   outside entities that are collecting information. Picking
   identifiers from a pseudo-random sequence suffices, so long as the
   specific sequence cannot be determined by an outsider examining just
   the identifiers that appear in addresses or are otherwise readily
   available (e.g., a node's link-layer address). This document
   proposes the generation of a pseudo-random sequence of interface
   identifiers via an MD5 hash. Periodically, the next interface
   identifier in the sequence is generated, a new set of anonymous
   addresses is created, and the previous anonymous addresses are
   deprecated to discourage their further use. The precise
   pseudo-random sequence depends on both a random component and the
   globally unique interface identifier (when available), to increase
   the likelihood that different nodes generate different sequences.

3. Protocol Description

   The goal of this section is to define procedures that:

   1) Do not result in any changes to the basic behavior of addresses
      generated via stateless address autoconfiguration [ADDRCONF].

   2) Define new procedures that create additional global-scope
      addresses based on a random interface identifier. Such addresses
      would be used to initiate outgoing sessions. These "random" or
      anonymous addresses would be used for a short period of time
      (hours to days) and would then be deprecated. Deprecated
      addresses can continue to be used for already established
      connections, but are not used to initiate new connections. New
      anonymous addresses are generated periodically to replace
      anonymous addresses that expire, with the exact time between
      address generation a matter of local policy.

   3) Produce a sequence of anonymous global-scope addresses from a
      sequence of interface identifiers that appear to be random in the
      sense that it is difficult for an outside observer to predict a
      future address (or identifier) based on a current one and it is
      difficult to determine previous addresses (or identifiers)
      knowing only the present one.

   4) Generate a set of addresses from the same (randomized) interface
      identifier, one address for each prefix for which a global
      address has been generated via stateless address
      autoconfiguration. Using the same interface identifier to
      generate a set of anonymous addresses reduces the number of IP
      multicast groups a host must join. Nodes join the solicited-node
      multicast address for each unicast address they support, and
      solicited-node addresses are dependent only on the low-order bits
      of the corresponding address. This decision was made to address
      the concern that a node that joins a large number of multicast
      groups may be required to put its interface into promiscuous
      mode, resulting in possible reduced performance.

3.1. Assumptions

   The following algorithm assumes that each interface maintains an
   associated randomized interface identifier.
   When anonymous addresses are generated, the current value of the
   associated randomized interface identifier is used. The actual value
   of the identifier changes over time as described below, but the same
   identifier can be used to generate more than one anonymous address.

   The algorithm also assumes that for a given anonymous address, one
   can determine the corresponding public address. When an anonymous
   address is deprecated, a new anonymous address is generated. The
   specific valid and preferred lifetimes for the new address are
   dependent on the corresponding lifetime values in the public
   address.

   Finally, this document assumes that when a node initiates outgoing
   communication, anonymous addresses can be given preference over
   other public addresses. This can mean that all outgoing connections
   use anonymous addresses by default, or that applications
   individually indicate whether they prefer to use anonymous or public
   addresses. Giving preference to anonymous addresses is consistent
   with on-going work that addresses the topic of source address
   selection in the more general case [ADDR_SELECT].

3.2. Generation Of Randomized Interface Identifiers

   We describe two approaches for the maintenance of the randomized
   interface identifier. The first assumes the presence of stable
   storage that can be used to record state history for use as input
   into the next iteration of the algorithm across system restarts. A
   second approach addresses the case where stable storage is
   unavailable and a randomized interface identifier may need to be
   generated at random.

3.2.1. When Stable Storage Is Present

   The following algorithm assumes the presence of a 64-bit "history
   value" that is used as input in generating a randomized interface
   identifier.
   The very first time the system boots (i.e., out-of-the-box), a
   random value should be generated using techniques that help ensure
   the initial value is hard to guess [RANDOM]. Whenever a new
   interface identifier is generated, a value generated by the
   computation is saved as the history value for the next iteration of
   the algorithm.

   A randomized interface identifier is created as follows:

   1) Take the history value from the previous iteration of this
      algorithm (or a random value if there is no previous value) and
      append to it the interface identifier generated as described in
      [ADDRARCH].
   2) Compute the MD5 message digest [MD5] over the quantity created in
      the previous step.
   3) Take the left-most 64 bits of the MD5 digest and set bit 6 (the
      left-most bit is numbered 0) to zero. This creates an interface
      identifier with the universal/local bit indicating local
      significance only. Save the generated identifier as the
      associated randomized interface identifier.
   4) Take the right-most 64 bits of the MD5 digest computed in step 2)
      and save them in stable storage as the history value to be used
      in the next iteration of the algorithm.

   MD5 was chosen for convenience, and because its particular
   properties were adequate to produce the desired level of
   randomization. IPv6 nodes are already required to implement MD5 as
   part of IPsec [IPSEC], thus the code will already be present on IPv6
   machines.

   In theory, generating successive randomized interface identifiers
   using a history scheme as above has no advantages over generating
   them at random. In practice, however, generating truly random
   numbers can be tricky.
   Use of a history value is intended to avoid the particular scenario
   where two nodes generate the same randomized interface identifier,
   both detect the situation via DAD, but then proceed to generate
   identical randomized interface identifiers via the same (flawed)
   random number generation algorithm. The above algorithm avoids this
   problem by having the interface identifier (which will often be
   globally unique) used in the calculation that generates subsequent
   randomized interface identifiers. Thus, if two nodes happen to
   generate the same randomized interface identifier, they should
   generate different ones on the follow-up attempt.

3.2.2. In The Absence of Stable Storage

   In the absence of stable storage, no history value will be available
   across system restarts to generate a pseudo-random sequence of
   interface identifiers. Consequently, the initial history value used
   above will need to be generated at random. A number of techniques
   might be appropriate. Consult [RANDOM] for suggestions on good
   sources for obtaining random numbers. Note that even though machines
   may not have stable storage for storing a history value, they will
   in many cases have configuration information that differs from one
   machine to another (e.g., user identity, security keys, serial
   numbers, etc.). One approach to generating a random initial history
   value in such cases is to use the configuration information to
   generate some data bits (which may remain constant for the life of
   the machine, but will vary from one machine to another), append some
   random data, and compute the MD5 digest as before.

3.3. Generating Anonymous Addresses

   [ADDRCONF] describes the steps for generating a link-local address
   when an interface becomes enabled as well as the steps for
   generating addresses for other scopes. This document extends
   [ADDRCONF] as follows.
   When processing a Router Advertisement with a Prefix Information
   option carrying a global-scope prefix for the purposes of address
   autoconfiguration (i.e., the A bit is set), perform the following
   steps:

   1) Process the Prefix Information Option as defined in [ADDRCONF],
      either creating a public address or adjusting the lifetimes of
      existing addresses, both public and anonymous. When adjusting the
      lifetimes of an existing anonymous address, only lower the
      lifetimes. Implementations MUST NOT increase the lifetimes of an
      existing anonymous address when processing a Prefix Information
      Option.
   2) When a new public address is created as described in [ADDRCONF]
      (because the prefix advertised does not match the prefix of any
      address already assigned to the interface, and the Valid Lifetime
      in the option is not zero), also create a new anonymous address.
   3) When creating an anonymous address, the lifetime values are
      derived from the corresponding public address as follows:

      - Its Valid Lifetime is the lower of the Valid Lifetime of the
        public address or ANON_VALID_LIFETIME.
      - Its Preferred Lifetime is the lower of the Preferred Lifetime
        of the public address or ANON_PREFERRED_LIFETIME.

      An anonymous address is created only if this calculated Preferred
      Lifetime is greater than REGEN_ADVANCE time units. In particular,
      an implementation MUST NOT create an anonymous address with a
      zero Preferred Lifetime.
   4) New anonymous addresses are created by appending the interface's
      current randomized interface identifier to the prefix that was
      used to generate the corresponding public address. If by chance
      the new anonymous address is the same as an address already
      assigned to the interface, generate a new randomized interface
      identifier and repeat this step.
   5) Perform duplicate address detection (DAD) on the generated
      anonymous address.
      If DAD indicates the address is already in use, generate a new
      randomized interface identifier as described in Section 3.2
      above, and repeat the previous steps as appropriate up to 5
      times. If after 5 consecutive attempts no unique address was
      generated, log a system error and give up attempting to generate
      anonymous addresses for that interface.

   Note: because multiple anonymous addresses are generated from the
   same associated randomized interface identifier, there is little
   benefit in running DAD on every anonymous address. This document
   recommends that DAD be run on the first address generated from a
   given randomized identifier, but that DAD be skipped on all
   subsequent addresses generated from the same randomized interface
   identifier.

3.4. Expiration of Anonymous Addresses

   When an anonymous address becomes deprecated, a new one should be
   generated. This is done by repeating the actions described in
   Section 3.3, starting at step 3). Note that, except for the
   transient period when an anonymous address is being regenerated, in
   normal operation at most one anonymous address corresponding to a
   public address should be in a non-deprecated state at any given
   time. Note that if an anonymous address becomes deprecated as a
   result of processing a Prefix Information Option with a zero
   Preferred Lifetime, then a new anonymous address MUST NOT be
   generated. The Prefix Information Option will also deprecate the
   corresponding public address.

   To ensure that a preferred anonymous address is always available, a
   new anonymous address should be regenerated slightly before its
   predecessor is deprecated. This is to allow sufficient time to avoid
   race conditions in the case where generating a new anonymous address
   is not instantaneous, such as when duplicate address detection must
   be run.
   It is recommended that an implementation start the address
   regeneration process REGEN_ADVANCE time units before an anonymous
   address would actually be deprecated.

   As an optional optimization, an implementation may wish to remove a
   deprecated anonymous address that is not in use by applications or
   upper layers. For TCP connections, such information is available in
   control blocks. For UDP-based applications, it may be the case that
   only the applications have knowledge about what addresses are
   actually in use. Consequently, an implementation may need to use
   heuristics in deciding when an address is no longer in use (e.g.,
   the default ANON_VALID_LIFETIME given in Section 5).

3.5. Regeneration of Randomized Interface Identifiers

   The frequency at which anonymous addresses should change depends on
   how a device is being used (e.g., how frequently it initiates new
   communication) and the concerns of the end user. The most egregious
   privacy concerns appear to involve addresses used for long periods of
   time (weeks to months to years). The more frequently an address
   changes, the less feasible collecting or coordinating information
   keyed on interface identifiers becomes. Moreover, the cost of
   collecting information and attempting to correlate it based on
   interface identifiers will only be justified if enough addresses
   contain non-changing identifiers to make it worthwhile. Thus, having
   large numbers of clients change their address on a daily or weekly
   basis is likely to be sufficient to alleviate most privacy concerns.

   There are also client costs associated with having a large number of
   addresses associated with a node (e.g., in doing address lookups, the
   need to join many multicast groups, etc.). Thus, changing addresses
   frequently (e.g., every few minutes) may have performance
   implications.
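
   As an illustration only, the lifetime derivation of Section 3.3,
   step 3), and the regeneration timing recommended in Section 3.4 can
   be sketched in Python as follows. The function names anon_lifetimes
   and regen_start are hypothetical and are not defined by this
   document. The constants use the default values from Section 5, and
   all times are assumed to be expressed in seconds.

```python
# Default values from Section 5, expressed in seconds (assumed unit).
ANON_VALID_LIFETIME = 7 * 24 * 3600      # 1 week
ANON_PREFERRED_LIFETIME = 24 * 3600      # 1 day
REGEN_ADVANCE = 5                        # 5 seconds


def anon_lifetimes(public_valid, public_preferred):
    """Derive (valid, preferred) lifetimes for a new anonymous address.

    Hypothetical helper: returns None when no anonymous address should
    be created, i.e. when the calculated Preferred Lifetime is not
    greater than REGEN_ADVANCE (this also covers a zero lifetime).
    """
    valid = min(public_valid, ANON_VALID_LIFETIME)
    preferred = min(public_preferred, ANON_PREFERRED_LIFETIME)
    if preferred <= REGEN_ADVANCE:
        return None
    return valid, preferred


def regen_start(created_at, preferred):
    """Time at which regeneration of a replacement address should begin.

    Hypothetical helper: REGEN_ADVANCE time units before the address
    created at 'created_at' would become deprecated.
    """
    return created_at + preferred - REGEN_ADVANCE


if __name__ == "__main__":
    # Public address advertised with 30-day valid, 7-day preferred lifetimes.
    lifetimes = anon_lifetimes(30 * 24 * 3600, 7 * 24 * 3600)
    print(lifetimes)
    if lifetimes is not None:
        print(regen_start(0, lifetimes[1]))
```

   With the defaults above, a long-lived public prefix yields an
   anonymous address whose lifetimes are capped at one week and one
   day, and whose replacement is started five seconds before it would
   be deprecated.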

   This document recommends that implementations generate new anonymous
   addresses on a periodic basis. This can be achieved automatically by
   generating a new randomized interface identifier at least once every
   (ANON_PREFERRED_LIFETIME - REGEN_ADVANCE) time units. As described
   above, generating a new anonymous address REGEN_ADVANCE time units
   before an anonymous address becomes deprecated produces addresses
   with a preferred lifetime no larger than ANON_PREFERRED_LIFETIME.
   When the preferred lifetime expires, a new anonymous address is
   generated using the new randomized interface identifier.

   Because the precise frequency at which it is appropriate to generate
   new addresses varies from one environment to another, implementations
   should provide end users with the ability to change the frequency at
   which addresses are regenerated. The default value is given in
   ANON_PREFERRED_LIFETIME and is one day. In addition, the exact time
   at which to invalidate an anonymous address depends on how
   applications are used by end users. Thus the default value given of
   one week (ANON_VALID_LIFETIME) may not be appropriate in all
   environments. Implementations should provide end users with the
   ability to override both of these default values.

4. Implications of Changing Interface Identifiers

   The IPv6 addressing architecture goes to great lengths to ensure that
   interface identifiers are globally unique. During the IPng
   discussions of the GSE proposal [GSE], it was felt that keeping
   interface identifiers globally unique in practice might prove useful
   to future transport protocols. Usage of the algorithms in this
   document would eliminate that future flexibility.

   The desire to protect individual privacy and the desire to
   effectively maintain and debug a network can conflict with each
   other.
   Having clients use addresses that change over time will make
   it more difficult to track down and isolate operational problems. For
   example, when looking at packet traces, it could become more
   difficult to determine whether one is seeing behavior caused by a
   single errant machine, or by a number of them.

   Some servers refuse to grant access to clients for which no DNS name
   exists. That is, they perform a DNS PTR query to determine the DNS
   name, and may then also perform an A query on the returned name to
   verify that the returned DNS name maps back into the address being
   used. Consequently, clients not properly registered in the DNS may be
   unable to access some services. As noted earlier, however, a node's
   DNS name (if non-changing) serves as a constant identifier. If the
   extension described in this document becomes widely deployed, servers
   will likely need to change their behavior so as not to require that
   every address be in the DNS. Another alternative is to register
   anonymous addresses in the DNS using random names (for example, a
   string version of the address itself).

5. Defined Constants

   Constants defined in this document include:

   ANON_VALID_LIFETIME -- Default value: 1 week. Users should be able to
      override the default value.
   ANON_PREFERRED_LIFETIME -- Default value: 1 day. Users should be able
      to override the default value.
   REGEN_ADVANCE -- 5 seconds

6. Open Issues and Future Work

   An implementation might want to keep track of which addresses are
   being used by upper layers so as to be able to remove a deprecated
   anonymous address from internal data structures once no upper layer
   protocols are using it (but not before). This is in contrast to
   current approaches where addresses are removed from an interface when
   they become invalid [ADDRCONF], independent of whether or not upper
   layer protocols are still using them.
   For TCP connections, such
   information is available in control blocks. For UDP-based
   applications, it may be the case that only the applications have
   knowledge about what addresses are actually in use. Consequently, an
   implementation may need to use heuristics in deciding when an address
   is no longer in use (e.g., as is suggested in Section 3.4).

   Use of the extensions defined in this document is likely to make
   debugging and other operational troubleshooting activities more
   difficult. Consequently, it may be site policy that anonymous
   addresses should not be used. Implementations MAY provide a method
   for a trusted administrator to override the use of anonymous
   addresses.

7. Security Considerations

   The motivation for this document stems from privacy concerns for
   individuals. This document does not appear to add any security issues
   beyond those already associated with stateless address
   autoconfiguration [ADDRCONF].

8. Acknowledgments

   The authors would like to acknowledge the contributions of the IPNGWG
   working group and, in particular, Matt Crawford and Steve Deering for
   their detailed comments.

9. References

   [ADDRARCH] Hinden, R. and S. Deering, "IP Version 6 Addressing
      Architecture", RFC 2373, July 1998.

   [ADDRCONF] Thomson, S. and T. Narten, "IPv6 Address
      Autoconfiguration", RFC 2462, December 1998.

   [ADDR_SELECT] Draves, R., "Default Address Selection for IPv6",
      draft-ietf-ipngwg-default-addr-select-00.txt.

   [COOKIES] Kristol, D. and L. Montulli, "HTTP State Management
      Mechanism", draft-ietf-http-state-man-mec-12.txt.

   [DHCP] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131,
      March 1997.

   [DDNS] Vixie et al., "Dynamic Updates in the Domain Name System (DNS
      UPDATE)", RFC 2136, April 1997.

   [DISCOVERY] Narten, T., Nordmark, E. and W. Simpson, "Neighbor
      Discovery for IP Version 6 (IPv6)", RFC 2461, December 1998.

   [GSE] Crawford et
      al., "Separating Identifiers and Locators in Addresses: An
      Analysis of the GSE Proposal for IPv6",
      draft-ietf-ipngwg-esd-analysis-04.txt.

   [IPSEC] Kent, S. and R. Atkinson, "Security Architecture for the
      Internet Protocol", RFC 2401, November 1998.

   [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
      Requirement Levels", RFC 2119, March 1997.

   [MD5] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April
      1992.

   [MOBILEIP] Perkins, C., "IP Mobility Support", RFC 2002, October
      1996.

   [RANDOM] Eastlake 3rd, D., Crocker, S. and J. Schiller, "Randomness
      Recommendations for Security", RFC 1750, December 1994.

   [SERIALNUM] Moore, K., "Privacy Considerations for the Use of
      Hardware Serial Numbers in End-to-End Network Protocols",
      draft-iesg-serno-privacy-00.txt.

10. Authors' Addresses

   Thomas Narten
   IBM Corporation
   P.O. Box 12195
   Research Triangle Park, NC 27709-2195
   USA

   Phone: +1 919 254 7798
   EMail: narten@raleigh.ibm.com

   Richard Draves
   Microsoft Research
   One Microsoft Way
   Redmond, WA 98052

   Phone: +1 425 936 2268
   EMail: richdr@microsoft.com