Network Working Group                                        B. Williams
Internet-Draft                                              Akamai, Inc.
Intended status: Informational                              M. Boucadair
Expires: December 16, 2016                                France Telecom
                                                                 D. Wing
                                                     Cisco Systems, Inc.
                                                           June 14, 2016

           An Experimental TCP Option for Host Identification
                 draft-williams-exp-tcp-host-id-opt-08

Abstract

   Recent RFCs have discussed issues with host identification in IP
   address sharing systems, such as shared address/prefix sharing
   devices and application-layer proxies.
   Potential solutions for revealing a host identifier in shared
   address deployments have also been discussed. This memo describes
   the design, deployment, and privacy considerations for one such
   solution in operational use on the Internet today that uses a TCP
   option to transmit a host identifier.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 16, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Important Use Cases
     1.2.  Document Goals
   2.  Terminology
   3.  Option Format
   4.  Option Use
     4.1.  Option Values
     4.2.  Sending Host Requirements
       4.2.1.  Alternative SYN Cookie Support
       4.2.2.  Persistent TCP Connections
       4.2.3.  Packet Fragmentation
     4.3.  Multiple In-Path HOST_ID Senders
   5.  Option Interpretation
   6.  Interaction with Other TCP Options
     6.1.  Multipath TCP (MPTCP)
     6.2.  Authentication Option (TCP-AO)
     6.3.  TCP Fast Open (TFO)
   7.  Security Considerations
   8.  Privacy Considerations
   9.  Pervasive Monitoring Considerations
   10. IANA Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Change History
     A.1.  Changes from version 07 to 08
     A.2.  Changes from version 06 to 07
     A.3.  Changes from version 05 to 06
     A.4.  Changes from version 04 to 05
     A.5.  Changes from version 03 to 04
     A.6.  Changes from version 02 to 03
     A.7.  Changes from version 01 to 02
     A.8.  Changes from version 00 to 01
   Authors' Addresses

1.  Introduction

   A broad range of issues associated with address sharing have been
   documented in [RFC6269] and [RFC7620]. In addition, [RFC6967]
   provides analysis of various solutions to the problem of revealing
   the sending host's identifier (HOST_ID) information to the receiver,
   indicating that a solution using a TCP [RFC0793] option for this
   purpose is among the possible approaches that could be applied with
   limited performance impact and a high success ratio. The purpose of
   this memo is to describe a TCP HOST_ID option that is currently
   deployed on the public Internet using the TCP experimental option
   codepoint, including discussion of related design, deployment, and
   privacy considerations.

   Multiple Internet Drafts have defined TCP options for the purpose of
   host identification: [I-D.wing-nat-reveal-option],
   [I-D.abdo-hostid-tcpopt-implementation], and
   [I-D.williams-overlaypath-ip-tcp-rfc]. Specification of multiple
   option formats to serve the purpose of host identification increases
   the burden for potential implementers and presents interoperability
   challenges as well, so the authors of those drafts have worked
   together to define a common TCP option that supersedes the formats
   from those three drafts. This memo describes a version of that
   common TCP option format that is currently in use on the public
   Internet.

   The option defined in this memo uses the TCP experimental option
   codepoint sharing mechanism defined in [RFC6994]. One of the earlier
   draft specifications, [I-D.williams-overlaypath-ip-tcp-rfc], is
   associated with unauthorized use of a TCP option kind number, and
   moving to the TCP experimental option code-point has allowed the
   authors of that draft to correct their error.

1.1.  Important Use Cases

   The authors' implementations have primarily focussed on the following
   address-sharing use cases in which currently deployed systems insert
   the HOST_ID option:

   Carrier Grade NAT (CGN): As defined in [RFC6888], [RFC6333], and
      other sources, a CGN allows multiple hosts connected to the public
      Internet to share a single Internet routable IPv4 address. One
      important characteristic of the CGN use case is that it modifies
      IP packets in-path, but does not serve as the end point for the
      associated TCP connections.

   Application Proxy: As defined in [RFC1919], an application proxy
      splits a TCP connection into two segments, serving as an endpoint
      for each of the connections and relaying data flows between the
      connections.

   Overlay Network: An overlay network is an Internet based system
      providing security, optimization, or other services for data flows
      that transit the system. A network-layer overlay will sometimes
      act much like a CGN, in that packets transit the system with NAT
      being applied at the edge of the overlay. A transport-layer or
      application-layer overlay [RFC3135] will typically act much like
      an application proxy, in that the TCP connection will be segmented
      with the overlay network serving as an endpoint for each of the
      TCP connections.

   In this set of sender use cases, the TCP option is either applied to
   an individual TCP packet at the connection endpoint (e.g. an
   application proxy or a transport layer overlay network) or at an
   address-sharing middle box (e.g. a CGN or a network layer overlay
   network). See Section 4 below for additional details about the types
   of devices that add the option to a TCP packet, as well as existing
   limitations on use of the option when it is inserted by an address-
   sharing middlebox, including issues related to packet fragmentation.

   The existing receiver use cases considered by this memo include the
   following:

   o  Differentiating between attack and non-attack traffic when the
      source of the attack is sharing an address with non-attack
      traffic.

   o  Application of per-subscriber policies for resource utilization,
      etc. when multiple subscribers are sharing a common address.

   o  Improving server-side load-balancing decisions by allowing the
      load for multiple clients behind a shared address to be assigned
      to different servers, even when session-affinity is required at
      the application layer.

   In all of the above cases, differentiation between address-sharing
   clients is performed by a network function that does not process the
   application layer protocol (e.g. HTTP) or the security protocol
   (e.g. TLS), because the action needs to be performed prior to
   decryption or parsing the application layer. Due to this, a solution
   implemented within the application layer or security protocol was
   considered unable to fully meet the receiver-side requirements. At
   the same time, as noted in [RFC6967], use of an IP option for this
   purpose has a low success rate. For these reasons, using a TCP
   option to deliver the host identifier was deemed by the authors to be
   an effective way to satisfy these specific use cases. See Section 5
   below for details about receiver-side interpretation of the option.

1.2.  Document Goals

   Publication of this memo is intended to serve multiple purposes.

   First and foremost, the document intends to inform readers about a
   mechanism that is in broad use on the public Internet. The authors
   are each affiliated with companies that have implemented and/or
   deployed systems that use the HOST_ID option on the public Internet.
   Other systems might encounter packets that contain this TCP option,
   and this document is intended to help others understand the nature of
   the TCP option when it is encountered so they can make informed
   decisions about how to handle it.

   The testing effort documented in
   [I-D.abdo-hostid-tcpopt-implementation] indicated that a TCP option
   could be used for host identification purposes without significant
   disruption of TCP connectivity to legacy servers and networks that do
   not support the option. It also showed how mechanisms available in
   existing TCP implementations could make use of such a TCP option for
   diagnostics and/or packet filtering. The authors' use of the TCP
   option on the public Internet has confirmed that it can be used
   effectively for our use cases, but it has also uncovered some
   interoperability issues associated with the option's use on the
   public Internet, especially regarding interactions with other TCP
   options that support new transport capabilities being specified
   within the IETF. Section 6 discusses those interactions and
   limitations and our systems' handling of associated issues.

   Discussions within the IETF have raised privacy concerns about the
   option's use, especially as regards pervasive monitoring risks.
   Existing uses of the option limit the nature of the HOST_ID values
   that are used and the systems that insert them in order to mitigate
   pervasive monitoring risks. Section 8 and Section 9 discuss the
   authors' assessments of the privacy and monitoring impact of this TCP
   option in its current uses and suggest behavior for some external
   systems when the option is encountered. Continued discussion
   following publication of this memo is expected to allow further
   refinement of requirements related to the values used to populate the
   option and how those values can be interpreted by the receiver.
   There is a trade-off between providing the expected functionality to
   the receiver and protecting the privacy of the sender, and continued
   assessment will be necessary in order to find the right balance.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Option Format

   When used for host identification, the TCP experimental option uses
   the experiment identification mechanism described in [RFC6994] and
   has the following format and content.

       0        1        2        3
       01234567 89012345 67890123 45678901
      +--------+--------+--------+--------+
      |  Kind  | Length |      ExID       |
      +--------+--------+--------+--------+
      |  Host ID ...
      +--------+---

   Kind: The option kind value is 253.

   Length: The length of the option is variable, based on the required
      size of the host identifier (e.g. a 2 octet host ID will require a
      length of 6, while a 4 octet host ID will require a length of 8).

   ExID: The experiment ID value is 0x0348 (840).

   Host ID: The host identifier is a value that can be used to
      differentiate among the various hosts sharing a common public IP
      address. See below for further discussion of this value.

4.  Option Use

   This section describes requirements associated with the use of the
   option, including: expected option values, which hosts are allowed to
   include the option, and segments that include the option.

4.1.  Option Values

   The information conveyed in the HOST_ID option is intended to
   uniquely identify the sending host to the best capability of the
   machine that adds the option to the segment, while at the same time
   avoiding inclusion of information that does not assist this purpose.
   In addition, the option is not intended to be used to expose
   information about the sending host that could not be discovered by
   observing segments in transit on some portion of the Internet path
   between the sender and the receiver. Existing use cases have
   different requirements for receiver side functionality, so this
   document attempts to provide a high degree of flexibility for the
   machine that adds the option to TCP segments.

   The HOST_ID option value MUST correlate to IP addresses and/or TCP
   port numbers that were changed by the inserting host/device (i.e.,
   some of the IP address and/or port number bits are used to generate
   the HOST_ID). Example values that satisfy this requirement include
   the following:

   Unique ID: An inserting host/device could maintain a pool of locally
      unique ID values that are dynamically mapped to the unique source
      IP address values in use behind the host/device as a result of
      address sharing. This ID value would be meaningful only within
      the context of a specific shared IP address due to the local
      uniqueness characteristic. Such an ID value could be smaller than
      an IP address (e.g. 16-bits) in order to conserve TCP option
      space. This option is preferred because it does not increase IP
      address visibility on the forward side of the address sharing
      system, and it SHOULD be used in cases where receiver side
      requirements can be met without direct inclusion of the original
      IP address (e.g. some load balancing uses).

   IP Address/Subnet: An inserting host/device could simply populate
      the option value with the IP address value in use behind the host/
      device. In the case of IPv6 addresses, it could be difficult to
      include the full address due to TCP option space constraints, so
      the value would likely need to provide only a portion of the
      address (e.g. the first 64 bits).

   IP Address and TCP Port: Some networks share public IP addresses
      among multiple subscribers with a portion of the TCP port number
      space being assigned to each subscriber [RFC6346]. When such a
      system is behind an address sharing host/device, inclusion of both
      the IP address and the TCP port number will more uniquely identify
      the sending host than just the IP address on its own.

   When multiple host identifiers are necessary (e.g. an IP address and
   a port number), the HOST_ID option is included multiple times within
   the packet, once for each identifier. While this approach
   significantly increases option space utilization when multiple
   identifiers are included, cases where only a single identifier is
   included are expected to be more common and thus it is beneficial to
   optimize for those cases. Note that some middleboxes might reorder
   TCP options, so this method could be problematic if such a middlebox
   is in-path between the address sharing system and the receiver. This
   has not proven to be a problem for existing use cases.

   See Section 8 below for discussion of privacy considerations related
   to selection of HOST_ID values.

4.2.  Sending Host Requirements

   The HOST_ID option MUST only be added by the sending host or any
   device involved in the forwarding path that changes IP addresses and/
   or TCP port numbers (e.g., NAT44 [RFC3022], Layer-2 Aware NAT, DS-
   Lite AFTR [RFC6333], NPTv6 [RFC6296], NAT64 [RFC6146], Dual-Stack
   Extra Lite [RFC6619], TCP Proxy, etc.). The HOST_ID option MUST NOT
   be added or modified en-route by any device that does not modify IP
   addresses and/or TCP port numbers.

   The sending host or intermediary device cannot determine whether the
   option value is used in a stateful manner by the receiver, nor can it
   determine whether SYN cookies are in use by the receiver.
   For this reason, the option MUST be included in all segments, both
   SYN and non-SYN segments, until return segments from the receiver
   positively indicate that the TCP connection is fully established on
   the receiver (e.g. the return segment either includes or acknowledges
   data).

4.2.1.  Alternative SYN Cookie Support

   The authors have also considered an alternative approach to SYN
   cookie support in which the receiving host (i.e. the host that
   accepts the TCP connection) echoes the option back to the sender in
   the SYN/ACK segment when a SYN cookie is being sent. This would
   allow the host sending HOST_ID to determine whether further inclusion
   of the option is necessary. This approach would have the benefit of
   not requiring inclusion of the option in non-SYN segments if SYN
   cookies had not been used. Unfortunately, this approach fails if the
   responding host itself does not support the option, since an
   intermediate node would have no way to determine that SYN cookies had
   been used.

4.2.2.  Persistent TCP Connections

   Some types of middleboxes (e.g. application proxies) open and
   maintain persistent TCP connections to regularly visited destinations
   in order to minimize connection establishment burden. Such
   middleboxes might use a single persistent TCP connection for multiple
   different client hosts over the life of the persistent connection.

   This specification does not attempt to support the use of persistent
   TCP connections for multiple client hosts due to the perceived
   complexity of providing such support. Instead, the HOST_ID option is
   only allowed to be used at connection initiation. An inserting host/
   device that supports both the HOST_ID option and multi-client
   persistent TCP connections MUST NOT apply the HOST_ID option to TCP
   connections that could be used for multiple clients over the life of
   the connection.
   If the HOST_ID option was sent during connection initiation, the
   inserting host/device MUST NOT reuse the connection for data flows
   originating from a client that would require a different HOST_ID
   value.

4.2.3.  Packet Fragmentation

   In order to avoid the overhead associated with in-path IP
   fragmentation, it is desirable for the inserting host/device to avoid
   including the HOST_ID option when IP fragmentation might be required.
   This is not a firm requirement, though, because the HOST_ID option is
   only included in the first few packets of a TCP connection and thus
   associated IP fragmentation will generally have minimal impact. The
   option SHOULD NOT be included in packets if the resulting packet
   would require local fragmentation.

   It can be difficult to determine whether local fragmentation would be
   required. For example, in cases where multiple interfaces with
   different MTUs are in use, a local routing decision has to be made
   before the MTU can be determined and in some systems this decision
   could be made after TCP option handling is complete. Additionally,
   it could be true that inclusion of the option causes the packet to
   violate the path's MTU but that the path's MTU has not been learned
   yet on the sending host/device.

   In existing deployed systems, the impact of IP fragmentation that
   results from use of the option has been minimal.

4.3.  Multiple In-Path HOST_ID Senders

   The possibility exists that there could be multiple in-path hosts/
   devices configured to insert the HOST_ID option. For example, the
   client's TCP packets might first traverse a CGN device on their way
   to the edge of a public Internet overlay network. In order for the
   HOST_ID value to most uniquely identify the sender, it needs to
   represent both the identity observed by the CGN device (the
   subscriber's internal IP address, e.g.
   [RFC6598]) and the identity observed by the overlay network (the
   shared address of the CGN device). The mechanism for handling the
   received HOST_ID value could vary depending upon the nature of the
   new HOST_ID value to be inserted, as described below.

   The problem of multiple in-path HOST_ID senders has not been observed
   in existing deployed systems. For this reason, existing
   implementations do not consistently support this scenario. Some
   systems do not propagate forward the received HOST_ID option value in
   any way, while other systems follow the guidance described below.

   An inserting host/device that uses the received packet's source IP
   address as the HOST_ID value (possibly along with the port) MUST
   propagate forward the HOST_ID value(s) from the received packet,
   since the source IP address and port only represent the previous in-
   path address sharing device and do not represent the original sender.
   In the CGN-plus-overlay example, this means that the overlay will
   include both the CGN's HOST_ID value(s) and a HOST_ID with the source
   IP address received by the overlay.

   An inserting host/device that sends a unique ID (as described in
   Section 4.1) has two options for how to handle the HOST_ID value(s)
   from the received packet.

   1.  A host/device that sends a unique ID MAY strip the received
       HOST_ID option and insert its own option, provided that it uses
       the received HOST_ID value as a differentiator for selecting the
       unique ID. What this means in the CGN-plus-overlay example above
       is that the overlay is allowed to drop the HOST_ID value inserted
       by the CGN provided that the HOST_ID value selected by the
       overlay represents both the CGN itself and the HOST_ID value
       inserted by the CGN.

   2.  A host/device that sends a unique ID MAY instead select a unique
       ID that represents only the previous in-path address-sharing
       host/device and propagate forward the HOST_ID value inserted by
       the previous host/device. In the CGN-plus-overlay example, this
       means that the overlay would include both the CGN's HOST_ID value
       and a HOST_ID with a unique ID of its own that was selected to
       represent the CGN's shared address.

   An inserting host/device that sends a unique ID MUST use one of the
   above two mechanisms.

5.  Option Interpretation

   Due to the variable nature of the option value, it is not possible
   for the receiving machine to reliably determine the value type from
   the option itself. For this reason, a receiving host/device SHOULD
   interpret the option value as an opaque identifier.

   This specification allows the inserting host/device to provide
   multiple HOST_ID options. The order of appearance of TCP options
   could be modified by some middleboxes, so receivers SHOULD NOT rely
   on option order to provide additional meaning to the individual
   options. Instead, when multiple HOST_ID options are present, their
   values SHOULD be concatenated together in the order in which they
   appear in the packet and treated as a single large identifier.

   For both of the receiver requirements discussed above, this
   specification uses SHOULD rather than MUST because reliable
   interpretation and ordering of options could be possible if the
   inserting host and the interpreting host are under common
   administrative control and the communication between them is
   integrity protected. Mechanisms for signaling the value type(s) and
   integrity protection are not provided by this specification, and in
   their absence the receiving host/device MUST interpret the option
   value(s) as a single opaque identifier.

6.  Interaction with Other TCP Options

   This section details how the HOST_ID option functions in conjunction
   with other TCP options.

6.1.  Multipath TCP (MPTCP)

   TCP provides for a maximum of 40 octets for TCP options. As
   discussed in Appendix A of MPTCP [RFC6824], a typical SYN from
   modern, popular operating systems contains several TCP options (MSS,
   window scale, SACK permitted, and timestamp) which consume 19-24
   octets depending on word alignment of the options. The initial SYN
   from a multipath TCP client would consume an additional 16 octets.

   HOST_ID needs at least 6 octets to be useful, so 9-21 octets are
   sufficient for many scenarios that benefit from HOST_ID. However, 4
   octets are not enough space for the HOST_ID option. Thus, a TCP SYN
   containing all the typical TCP options (MSS, window scale, SACK
   permitted, timestamp), and also containing multipath capable or
   multipath join, and also being word aligned, has insufficient space
   to accommodate HOST_ID. This means something has to give. The
   choices are either to avoid word alignment in that case (freeing 5
   octets) or avoid adding the HOST_ID option. Each of these approaches
   is used in existing implementations and has been deemed acceptable
   for the associated use case.

6.2.  Authentication Option (TCP-AO)

   The TCP-AO option [RFC5925] is incompatible with address sharing due
   to the fact that it provides integrity protection of the source IP
   address. For this reason, the only use cases where it makes sense to
   combine TCP-AO and HOST_ID are those where the TCP-AO-NAT extension
   [RFC6978] is in use. Injecting a HOST_ID TCP option does not
   interfere with the use of TCP-AO-NAT because the TCP options are not
   included in the MAC calculation.

6.3.  TCP Fast Open (TFO)

   The TFO option [RFC7413] uses a zero length cookie (total option
   length 2 bytes) to request a TFO cookie for use on future
   connections.
   The server-generated TFO cookie is required to be at least 4 bytes
   long and allowed to be as long as 16 bytes (total option length 6 to
   18 bytes). The cookie request form of the option leaves enough room
   available in a SYN packet with the most commonly used options to
   accommodate the HOST_ID option, but a valid TFO cookie length of any
   longer than 13 bytes would prevent even the minimal 6 byte HOST_ID
   option from being included in the header.

   There are multiple possibilities for allowing TFO and HOST_ID to be
   supported for the same connection, including:

   o  If the TFO implementation allows the cookie size to be
      configurable, the configured cookie size can be specifically
      selected to leave enough option space available in a typical TFO
      SYN packet to allow inclusion of the HOST_ID option.

   o  If the TFO implementation provides explicit support for the
      HOST_ID option, it can be designed to use a shorter cookie length
      when the HOST_ID option is present in the TFO cookie request SYN.

   Reducing the TFO cookie size in order to include the HOST_ID option
   could have unacceptable security implications, and so existing
   deployed systems that use the HOST_ID option consider TFO and HOST_ID
   to be mutually exclusive and do not support the use of both options
   on the same TCP connection.

   It should also be noted that the presence of data in a TFO SYN
   increases the likelihood that there will be no space available in the
   SYN packet to support inclusion of the HOST_ID option without IP
   fragmentation, even if there is enough room in the TCP option space.
   This is an additional reason existing systems consider TFO and
   HOST_ID to be mutually exclusive.

7.  Security Considerations

   Security (including privacy) considerations common to all HOST_ID
   solutions are discussed in [RFC6967].

   The content of the HOST_ID option SHOULD NOT be used for purposes
   that require a trust relationship between the sender and the receiver
   (e.g. billing and/or subscriber policy enforcement). This
   requirement uses SHOULD rather than MUST because reliable
   interpretation of options could be possible if the inserting host and
   the interpreting host are under common administrative control and the
   communication between them is integrity protected. Mechanisms for
   signaling the value type(s) and integrity protection are not provided
   by this specification, and in their absence the receiving host/device
   MUST NOT use the HOST_ID value for purposes that require a trust
   relationship.

   Note that the above trust requirement applies equally to HOST_ID
   option values propagated forward from a previous in-path host as
   described in Section 4.3. In other words, if the trust mechanism
   does not apply to all option values in the packet, then none of the
   HOST_ID values can be considered trusted and the receiving host/
   device MUST NOT use any of the HOST_ID values for purposes that
   require a trust relationship. An inserting host/device that has such
   a trust relationship MUST NOT propagate forward an untrusted HOST_ID
   in such a way as to allow it to be considered trusted.

   When the receiving network uses the values provided by the option in
   a way that does not require trust (e.g. maintaining session affinity
   in a load-balancing system), then use of a mechanism to enforce the
   trust relationship is OPTIONAL.

8.  Privacy Considerations

   Sending a TCP SYN across the public Internet necessarily discloses
   the public IP address of the sending host.
   When an intermediate address sharing device is deployed on the
   public Internet, anonymity of the hosts using the device will be
   increased, with hosts represented by multiple source IP addresses on
   the ingress side of the device using a single source IP address on
   the egress side. The HOST_ID TCP option removes that increased
   anonymity, taking information that was already visible in TCP packets
   on the public Internet on the ingress side of the address sharing
   device and making it available on the egress side of the device as
   well. In some cases, an explicit purpose of the address sharing
   device is anonymity, in which case use of the HOST_ID TCP option
   would be incompatible with the purpose of the device.

   A NAT device used to provide interoperability between a local area
   network (LAN) using private [RFC1918] IP addresses and the public
   Internet is sometimes specifically intended to provide anonymity for
   the LAN clients as described in the above paragraph. For this
   reason, address sharing devices at the border between a private LAN
   and the public Internet MUST NOT insert the HOST_ID option.

   The HOST_ID option MUST NOT be used to provide client geographic or
   network location information that was not publicly visible in IP
   packets for the TCP flows processed by the inserting host. For
   example, the client's IP address MAY be used as the HOST_ID option
   value, but any geographic or network location information derived
   from the client's IP address MUST NOT be used as the HOST_ID value.

   The HOST_ID option MAY provide differentiating information that is
   locally unique such that individual TCP flows processed by the
   inserting host can be reliably identified.
   The HOST_ID option MUST NOT provide client identification
   information that was not publicly visible in IP packets for the TCP
   flows processed by the inserting host, such as subscriber information
   linked to the IP address.

   The HOST_ID value MUST be changed whenever the subscriber IP address
   changes. This requirement ensures that the HOST_ID option does not
   introduce a new globally unique identifier that persists across
   subscriber IP address changes.

   The HOST_ID option MUST be stripped from IP packets traversing
   middleboxes that provide network-based anonymity services.

9. Pervasive Monitoring Considerations

   [RFC7258] provides the following guidance: "those developing IETF
   specifications need to be able to describe how they have considered
   Pervasive Monitoring, and, if the attack is relevant to the work to
   be published, be able to justify related design decisions."
   Legitimate concerns about host identification have been raised within
   the IETF. The authors of this memo have attempted to address those
   concerns by providing details about the nature of the HOST_ID values
   and the types of middleboxes that should and should not be including
   the HOST_ID option in TCP headers. These details describe
   limitations already imposed by existing deployed systems. This
   section is intended to highlight some particularly important aspects
   of this design and the related guidance/limitations that are relevant
   to the pervasive monitoring discussion.

   When a generated identifier is used, this document prohibits the
   address sharing device from using globally unique or permanent
   identifiers. Only locally unique identifiers are allowed. As with
   persistent IP addresses, persistent HOST_ID values could facilitate
   user tracking and are therefore prohibited. The specific
   requirements for permissible HOST_ID values are discussed in
   Section 8 and Section 4.1.
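
   The constraints on generated identifiers described above (locally
   unique, not permanent, and changed whenever the subscriber IP
   address changes) can be sketched as follows. This is a minimal
   illustration under stated assumptions, not a description of any
   deployed system: the class name, method names, and the 2-byte
   identifier width are all hypothetical choices made for this example.

```python
# Sketch only: locally unique, non-persistent HOST_ID assignment.
# ASSUMPTION: the class, its methods, and the 2-byte identifier width
# are hypothetical; they are not defined by this memo.
import secrets


class HostIdAllocator:
    def __init__(self, id_bytes: int = 2) -> None:
        self._id_bytes = id_bytes
        self._by_addr = {}    # subscriber IP address -> HOST_ID value
        self._in_use = set()  # values currently bound on this device

    def _fresh_id(self) -> bytes:
        # Random values carry no subscriber information and are only
        # unique among the bindings held by this one device.
        if len(self._in_use) >= 256 ** self._id_bytes:
            raise RuntimeError("identifier space exhausted")
        while True:
            candidate = secrets.token_bytes(self._id_bytes)
            if candidate not in self._in_use:
                return candidate

    def host_id_for(self, subscriber_ip: str) -> bytes:
        """Return the HOST_ID value bound to this subscriber address."""
        if subscriber_ip not in self._by_addr:
            value = self._fresh_id()
            self._by_addr[subscriber_ip] = value
            self._in_use.add(value)
        return self._by_addr[subscriber_ip]

    def address_changed(self, old_ip: str, new_ip: str) -> bytes:
        """Discard the old binding so the new address gets a new value."""
        old_value = self._by_addr.pop(old_ip, None)
        # The old value is still marked in use here, so the value chosen
        # for the new address cannot repeat it.
        new_value = self.host_id_for(new_ip)
        if old_value is not None:
            self._in_use.discard(old_value)
        return new_value
```

   In this sketch nothing is written to stable storage, so a restart of
   the address sharing function also discards every binding. A real
   device would additionally need to expire idle bindings.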

   This specification does not target exposing a host beyond what the
   original packet, issued from that host, would have already exposed on
   the public Internet without introduction of the option. The option
   is intended only to carry forward information that was conveyed to
   the address-sharing device in the original packet, and HOST_ID option
   values that do not match this description are prohibited by
   requirements discussed in Section 8. This design does not allow the
   HOST_ID option to carry personally identifiable information,
   geographic location identifiers, or any other information that is not
   available in the wire format of the associated TCP/IP headers.

   This document's guidance on option values is followed in existing
   deployed systems. Thus, the volatility of the information conveyed
   in a HOST_ID option is similar to that of the public, subscriber IP
   address. A distinct HOST_ID is used by the address-sharing function
   when the host reboots or gets a new public IP address from the
   subscriber network.

   The described TCP option allows network identification to a similar
   level as the first 64 bits of an IPv6 address. That is, the server
   can use the bits of the TCP option to help identify a host behind an
   address-sharing device, in much the same way the server would use the
   host's IPv6 network address if the client and server were using IPv6
   end-to-end.

   Some address-sharing middleboxes on the public Internet have the
   express intention of providing originator anonymity. Publication of
   this document can help such middleboxes recognize the associated risk
   and take action to mitigate it (e.g. by stripping or modifying the
   option value).

10. IANA Considerations

   This document specifies a new TCP option that uses the shared
   experimental options format [RFC6994], with ExID=0x0348 (840) in
   network-standard byte order.
   This ExID has already been registered with IANA.

11. Acknowledgements

   Many thanks to W. Eddy, Y. Nishida, T. Reddy, M. Scharf, J. Touch,
   A. Zimmermann, and A. Falk for their comments.

12. References

12.1. Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, DOI 10.17487/RFC0793, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC6994]  Touch, J., "Shared Use of Experimental TCP Options",
              RFC 6994, DOI 10.17487/RFC6994, August 2013.

12.2. Informative References

   [I-D.abdo-hostid-tcpopt-implementation]
              Abdo, E., Boucadair, M., and J. Queiroz, "HOST_ID TCP
              Options: Implementation & Preliminary Test Results",
              draft-abdo-hostid-tcpopt-implementation-03 (work in
              progress), July 2012.

   [I-D.williams-overlaypath-ip-tcp-rfc]
              Williams, B., "Overlay Path Option for IP and TCP",
              draft-williams-overlaypath-ip-tcp-rfc-04 (work in
              progress), June 2013.

   [I-D.wing-nat-reveal-option]
              Yourtchenko, A. and D. Wing, "Revealing hosts sharing an
              IP address using TCP option",
              draft-wing-nat-reveal-option-03 (work in progress),
              December 2011.

   [RFC1918]  Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G.,
              and E. Lear, "Address Allocation for Private Internets",
              BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996.

   [RFC1919]  Chatel, M., "Classical versus Transparent IP Proxies",
              RFC 1919, DOI 10.17487/RFC1919, March 1996.

   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
              Address Translator (Traditional NAT)", RFC 3022,
              DOI 10.17487/RFC3022, January 2001.

   [RFC3135]  Border, J., Kojo, M., Griner, J., Montenegro, G., and Z.
              Shelby, "Performance Enhancing Proxies Intended to
              Mitigate Link-Related Degradations", RFC 3135,
              DOI 10.17487/RFC3135, June 2001.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
              June 2010.

   [RFC6146]  Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful
              NAT64: Network Address and Protocol Translation from IPv6
              Clients to IPv4 Servers", RFC 6146, DOI 10.17487/RFC6146,
              April 2011.

   [RFC6269]  Ford, M., Ed., Boucadair, M., Durand, A., Levis, P., and
              P. Roberts, "Issues with IP Address Sharing", RFC 6269,
              DOI 10.17487/RFC6269, June 2011.

   [RFC6296]  Wasserman, M. and F. Baker, "IPv6-to-IPv6 Network Prefix
              Translation", RFC 6296, DOI 10.17487/RFC6296, June 2011.

   [RFC6333]  Durand, A., Droms, R., Woodyatt, J., and Y. Lee,
              "Dual-Stack Lite Broadband Deployments Following IPv4
              Exhaustion", RFC 6333, DOI 10.17487/RFC6333, August 2011.

   [RFC6346]  Bush, R., Ed., "The Address plus Port (A+P) Approach to
              the IPv4 Address Shortage", RFC 6346,
              DOI 10.17487/RFC6346, August 2011.

   [RFC6598]  Weil, J., Kuarsingh, V., Donley, C., Liljenstolpe, C., and
              M. Azinger, "IANA-Reserved IPv4 Prefix for Shared Address
              Space", BCP 153, RFC 6598, DOI 10.17487/RFC6598, April
              2012.

   [RFC6619]  Arkko, J., Eggert, L., and M. Townsley, "Scalable
              Operation of Address Translators with Per-Interface
              Bindings", RFC 6619, DOI 10.17487/RFC6619, June 2012.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
              "TCP Extensions for Multipath Operation with Multiple
              Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013.

   [RFC6888]  Perreault, S., Ed., Yamagata, I., Miyakawa, S., Nakagawa,
              A., and H. Ashida, "Common Requirements for Carrier-Grade
              NATs (CGNs)", BCP 127, RFC 6888, DOI 10.17487/RFC6888,
              April 2013.

   [RFC6967]  Boucadair, M., Touch, J., Levis, P., and R. Penno,
              "Analysis of Potential Solutions for Revealing a Host
              Identifier (HOST_ID) in Shared Address Deployments",
              RFC 6967, DOI 10.17487/RFC6967, June 2013.

   [RFC6978]  Touch, J., "A TCP Authentication Option Extension for NAT
              Traversal", RFC 6978, DOI 10.17487/RFC6978, July 2013.

   [RFC7258]  Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
              Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
              2014.

   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014.

   [RFC7620]  Boucadair, M., Ed., Chatras, B., Reddy, T., Williams, B.,
              and B. Sarikaya, "Scenarios with Host Identification
              Complications", RFC 7620, DOI 10.17487/RFC7620, August
              2015.

Appendix A. Change History

   [Note to RFC Editor: Please remove this section prior to
   publication.]

A.1. Changes from version 07 to 08

   Changed document category from experimental to informational.

   Updated text throughout the document to further document that the
   option is in use on the public Internet and highlighted specifics of
   how the option is used in existing implementations, especially when
   those implementations deviate from the document's recommendations.

   Added text to further clarify that the document does not represent
   IETF consensus, especially due to concerns about privacy and
   pervasive monitoring.

A.2. Changes from version 06 to 07

   Clarified pervasive monitoring considerations and added back-pointers
   to where the requirements are more clearly called out.

A.3. Changes from version 05 to 06

   Rewrite the introduction to clarify that this document describes a
   practice that is in use on the public Internet today, and that the
   purpose of the document is to publish design, deployment, and privacy
   considerations related to its use.

   Correct wording in the abstract to clarify that the IETF has not
   indicated support for host identification, but rather that proposals
   discussed within the IETF have done so.

   Add a section that summarizes the authors' understanding of the
   impact on pervasive monitoring to reinforce the importance of
   following the document's related guidance.

A.4. Changes from version 04 to 05

   Make this document self-contained, rather than referring readers to
   use-cases and requirements contained in other I-Ds that were never
   published as RFCs.

   Add discussion of TCP Fast Open.

   Correct some discussion of TCP-AO and TCP-AO-NAT.

   Clarify exactly what the identifier is identifying.

   Improve discussion on interpretation of multiple instances of the
   option, including order of interpretation and set interpretation.

   Evaluated whether use of multiple identifiers should be constrained.
   This is unclear, and so left for the experiment to determine.

   Discuss the possibility of the option value changing over the life of
   the connection (spec now prohibits this).

   Clarify use cases related to stripping and replacing the option.

   Add discussion of non-local fragmentation.

   Evaluate the reliability of attempts to exclude the option when local
   fragmentation would be required.

   Clarify the security requirements re: trust relationship.
   Specifically call out that common admin control and authentication
   can allow additional uses.

   Clarify privacy considerations regarding NATs that separate private
   and public networks.

   Remove restatement of requirements from other documents.

   Justify use of SHOULD rather than MUST throughout.

A.5. Changes from version 03 to 04

   Improve discussion of RFC6967.

   Don't use "message" to describe TCP segments.

   Add reference to RFC6994 to section 3.

   Clarify that this specification supersedes earlier documents.

   Improve discussion of SYN cookie handling.

   Remove lower case uses of keywords (e.g. must, should, etc.)
   throughout the document.

   Some stronger privacy guidance, replacing SHOULD with MUST.

   Add an experiment goal related to optimal option value.

   Add text related to the identification goals of the option value
   (still needs more work).

A.6. Changes from version 02 to 03

   Clarification of arguments in favor of this approach.

   Add discussion of important use cases.

   Clarification of experiment goals and earlier test results.

A.7. Changes from version 01 to 02

   Add note re: order of appearance.

A.8. Changes from version 00 to 01

   Add discussion of experiment goals.

   Limit external references to the earlier specifications.

   Add guidance to limit the types of device that add the option.

   Improve/correct discussion of TCP-AO and security.

Authors' Addresses

   Brandon Williams
   Akamai, Inc.
   8 Cambridge Center
   Cambridge, MA 02142
   USA

   Email: brandon.williams@akamai.com

   Mohamed Boucadair
   France Telecom
   Rennes, 35000
   France

   Email: mohamed.boucadair@orange.com

   Dan Wing
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Email: dwing@cisco.com