idnits 2.17.1 

draft-williams-exp-tcp-host-id-opt-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (June 14, 2016) is 2872 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 6824
     (Obsoleted by RFC 8684)


     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                        B. Williams
3	Internet-Draft                                              Akamai, Inc.
4	Intended status: Informational                              M. Boucadair
5	Expires: December 16, 2016                                France Telecom
6	                                                                 D. Wing
7	                                                     Cisco Systems, Inc.
8	                                                           June 14, 2016

10	           An Experimental TCP Option for Host Identification
11	                 draft-williams-exp-tcp-host-id-opt-08

13	Abstract

15	   Recent RFCs have discussed issues with host identification in IP
16	   address sharing systems, such as shared address/prefix sharing
17	   devices and application-layer proxies.  Potential solutions for
18	   revealing a host identifier in shared address deployments have also
19	   been discussed.  This memo describes the design, deployment, and
20	   privacy considerations for one such solution in operational use on
21	   the Internet today that uses a TCP option to transmit a host
22	   identifier.

24	Status of This Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at http://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on December 16, 2016.

41	Copyright Notice

43	   Copyright (c) 2016 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents
48	   (http://trustee.ietf.org/license-info) in effect on the date of
49	   publication of this document.  Please review these documents
50	   carefully, as they describe your rights and restrictions with respect
51	   to this document.  Code Components extracted from this document must
52	   include Simplified BSD License text as described in Section 4.e of
53	   the Trust Legal Provisions and are provided without warranty as
54	   described in the Simplified BSD License.

56	Table of Contents

58	   1.  Introduction
59	     1.1.  Important Use Cases
60	     1.2.  Document Goals
61	   2.  Terminology
62	   3.  Option Format
63	   4.  Option Use
64	     4.1.  Option Values
65	     4.2.  Sending Host Requirements
66	       4.2.1.  Alternative SYN Cookie Support
67	       4.2.2.  Persistent TCP Connections
68	       4.2.3.  Packet Fragmentation
69	     4.3.  Multiple In-Path HOST_ID Senders
70	   5.  Option Interpretation
71	   6.  Interaction with Other TCP Options
72	     6.1.  Multipath TCP (MPTCP)
73	     6.2.  Authentication Option (TCP-AO)
74	     6.3.  TCP Fast Open (TFO)
75	   7.  Security Considerations
76	   8.  Privacy Considerations
77	   9.  Pervasive Monitoring Considerations
78	   10. IANA Considerations
79	   11. Acknowledgements
80	   12. References
81	     12.1.  Normative References
82	     12.2.  Informative References
83	   Appendix A.  Change History
84	     A.1.  Changes from version 07 to 08
85	     A.2.  Changes from version 06 to 07
86	     A.3.  Changes from version 05 to 06
87	     A.4.  Changes from version 04 to 05
88	     A.5.  Changes from version 03 to 04
89	     A.6.  Changes from version 02 to 03
90	     A.7.  Changes from version 01 to 02
91	     A.8.  Changes from version 00 to 01
92	   Authors' Addresses

94	1.  Introduction

96	   A broad range of issues associated with address sharing have been
97	   documented in [RFC6269] and [RFC7620].  In addition, [RFC6967]
98	   provides analysis of various solutions to the problem of revealing
99	   the sending host's identifier (HOST_ID) information to the receiver,
100	   indicating that a solution using a TCP [RFC0793] option for this
101	   purpose is among the possible approaches that could be applied with
102	   limited performance impact and a high success ratio.  The purpose of
103	   this memo is to describe a TCP HOST_ID option that is currently
104	   deployed on the public Internet using the TCP experimental option
105	   codepoint, including discussion of related design, deployment, and
106	   privacy considerations.

108	   Multiple Internet Drafts have defined TCP options for the purpose of
109	   host identification: [I-D.wing-nat-reveal-option],
110	   [I-D.abdo-hostid-tcpopt-implementation], and
111	   [I-D.williams-overlaypath-ip-tcp-rfc].  Specification of multiple
112	   option formats to serve the purpose of host identification increases
113	   the burden for potential implementers and presents interoperability
114	   challenges as well, so the authors of those drafts have worked
115	   together to define a common TCP option that supersedes the formats
116	   from those three drafts.  This memo describes a version of that
117	   common TCP option format that is currently in use on the public
118	   Internet.

120	   The option defined in this memo uses the TCP experimental option
121	   codepoint sharing mechanism defined in [RFC6994].  One of the earlier
122	   draft specifications, [I-D.williams-overlaypath-ip-tcp-rfc], is
123	   associated with unauthorized use of a TCP option kind number, and
124	   moving to the TCP experimental option code-point has allowed the
125	   authors of that draft to correct their error.

127	1.1.  Important Use Cases

129	   The authors' implementations have primarily focussed on the following
130	   address-sharing use cases in which currently deployed systems insert
131	   the HOST_ID option:

133	   Carrier Grade NAT (CGN):  As defined in [RFC6888], [RFC6333], and
134	      other sources, a CGN allows multiple hosts connected to the public
135	      Internet to share a single Internet routable IPv4 address.  One
136	      important characteristic of the CGN use case is that it modifies
137	      IP packets in-path, but does not serve as the end point for the
138	      associated TCP connections.

140	   Application Proxy:  As defined in [RFC1919], an application proxy
141	      splits a TCP connection into two segments, serving as an endpoint
142	      for each of the connections and relaying data flows between the
143	      connections.

145	   Overlay Network:  An overlay network is an Internet based system
146	      providing security, optimization, or other services for data flows
147	      that transit the system.  A network-layer overlay will sometimes
148	      act much like a CGN, in that packets transit the system with NAT
149	      being applied at the edge of the overlay.  A transport-layer or
150	      application-layer overlay [RFC3135] will typically act much like
151	      an application proxy, in that the TCP connection will be segmented
152	      with the overlay network serving as an endpoint for each of the
153	      TCP connections.

155	   In this set of sender use cases, the TCP option is either applied to
156	   an individual TCP packet at the connection endpoint (e.g. an
157	   application proxy or a transport layer overlay network) or at an
158	   address-sharing middle box (e.g. a CGN or a network layer overlay
159	   network).  See Section 4 below for additional details about the types
160	   of devices that add the option to a TCP packet, as well as existing
161	   limitations on use of the option when it is inserted by an address-
162	   sharing middlebox, including issues related to packet fragmentation.

164	   The existing receiver use cases considered by this memo include the
165	   following:

167	   o  Differentiating between attack and non-attack traffic when the
168	      source of the attack is sharing an address with non-attack
169	      traffic.

171	   o  Application of per-subscriber policies for resource utilization,
172	      etc. when multiple subscribers are sharing a common address.

174	   o  Improving server-side load-balancing decisions by allowing the
175	      load for multiple clients behind a shared address to be assigned
176	      to different servers, even when session-affinity is required at
177	      the application layer.

179	   In all of the above cases, differentiation between address-sharing
180	   clients is performed by a network function that does not process the
181	   application layer protocol (e.g.  HTTP) or the security protocol
182	   (e.g.  TLS), because the action needs to be performed prior to
183	   decryption or parsing the application layer.  Due to this, a solution
184	   implemented within the application layer or security protocol was
185	   considered unable to fully meet the receiver-side requirements.  At
186	   the same time, as noted in [RFC6967], use of an IP option for this
187	   purpose has a low success rate.  For these reasons, using a TCP
188	   option to deliver the host identifier was deemed by the authors to be
189	   an effective way to satisfy these specific use cases.  See Section 5
190	   below for details about receiver-side interpretation of the option.

192	1.2.  Document Goals

194	   Publication of this memo is intended to serve multiple purposes.

196	   First and foremost, the document intends to inform readers about a
197	   mechanism that is in broad use on the public Internet.  The authors
198	   are each affiliated with companies that have implemented and/or
199	   deployed systems that use the HOST_ID option on the public Internet.
200	   Other systems might encounter packets that contain this TCP option,
201	   and this document is intended to help others understand the nature of
202	   the TCP option when it is encountered so they can make informed
203	   decisions about how to handle it.

205	   The testing effort documented in
206	   [I-D.abdo-hostid-tcpopt-implementation] indicated that a TCP option
207	   could be used for host identification purposes without significant
208	   disruption of TCP connectivity to legacy servers and networks that do
209	   not support the option.  It also showed how mechanisms available in
210	   existing TCP implementations could make use of such a TCP option for
211	   diagnostics and/or packet filtering.  The authors' use of the TCP
212	   option on the public Internet has confirmed that it can be used
213	   effectively for our use cases, but it has also uncovered some
214	   interoperability issues associated with the option's use on the
215	   public Internet, especially regarding interactions with other TCP
216	   options that support new transport capability being specified within
217	   the IETF.  Section 6 discusses those interactions and limitations and
218	   our systems' handling of associated issues.

220	   Discussions within the IETF have raised privacy concerns about the
221	   option's use, especially as regards pervasive monitoring risks.
222	   Existing uses of the option limit the nature of the HOST_ID values
223	   that are used and the systems that insert them in order to mitigate
224	   pervasive monitoring risks.  Section 8 and Section 9 discuss the
225	   authors' assessments of the privacy and monitoring impact of this TCP
226	   option in its current uses and suggest behavior for some external
227	   systems when the option is encountered.  Continued discussion
228	   following publication of this memo is expected to allow further
229	   refinement of requirements related to the values used to populate the
230	   option and how those values can be interpreted by the receiver.
231	   There is a trade-off between providing the expected functionality to
232	   the receiver and protecting the privacy of the sender, and continued
233	   assessment will be necessary in order to find the right balance.

235	2.  Terminology

237	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
238	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
239	   document are to be interpreted as described in [RFC2119].

241	3.  Option Format

243	   When used for host identification, the TCP experimental option uses
244	   the experiment identification mechanism described in [RFC6994] and
245	   has the following format and content.

247	    0          1          2          3
248	    01234567 89012345 67890123 45678901
249	   +--------+--------+--------+--------+
250	   |  Kind  | Length |       ExID      |
251	   +--------+--------+--------+--------+
252	   |  Host ID ...
253	   +--------+---

255	   Kind:  The option kind value is 253

257	   Length:  The length of the option is variable, based on the required
258	      size of the host identifier (e.g. a 2 octet host ID will require a
259	      length of 6, while a 4 octet host ID will require a length of 8).

261	   ExID:  The experiment ID value is 0x0348 (840).

263	   Host ID:  The host identifier is a value that can be used to
264	      differentiate among the various hosts sharing a common public IP
265	      address.  See below for further discussion of this value.
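
   As a non-normative illustration of the layout above, the following
   Python sketch builds and parses a single HOST_ID option.  The
   function names are hypothetical, and rejecting an empty host ID is an
   assumption of the sketch rather than a requirement stated in this
   memo.

```python
import struct

KIND_EXPERIMENTAL = 253   # TCP experimental option kind used by this memo
EXID_HOST_ID = 0x0348     # experiment ID (ExID) given in this memo
TCP_OPTION_SPACE = 40     # maximum octets available for all TCP options


def encode_host_id(host_id: bytes) -> bytes:
    """Build one HOST_ID option: Kind, Length, ExID, then the host ID."""
    # Length covers Kind (1) + Length (1) + ExID (2) + the host ID itself.
    length = 4 + len(host_id)
    if not host_id or length > TCP_OPTION_SPACE:
        # Assumption of this sketch: an empty host ID is not useful.
        raise ValueError("host ID must be 1 to 36 octets long")
    return struct.pack("!BBH", KIND_EXPERIMENTAL, length, EXID_HOST_ID) + host_id


def decode_host_id(option: bytes) -> bytes:
    """Return the host ID carried by a single, complete HOST_ID option."""
    if len(option) < 5:
        raise ValueError("option too short to carry a host ID")
    kind, length, exid = struct.unpack("!BBH", option[:4])
    if kind != KIND_EXPERIMENTAL or exid != EXID_HOST_ID or length != len(option):
        raise ValueError("not a well-formed HOST_ID option")
    return option[4:]
```

   For example, encode_host_id(b"\x12\x34") yields the six octets
   FD 06 03 48 12 34, which matches the length of 6 given above for a
   2 octet host ID.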

267	4.  Option Use

269	   This section describes requirements associated with the use of the
270	   option, including: expected option values, which hosts are allowed to
271	   include the option, and segments that include the option.

273	4.1.  Option Values

275	   The information conveyed in the HOST_ID option is intended to
276	   uniquely identify the sending host to the best capability of the
277	   machine that adds the option to the segment, while at the same time
278	   avoiding inclusion of information that does not assist this purpose.
279	   In addition, the option is not intended to be used to expose
280	   information about the sending host that could not be discovered by
281	   observing segments in transit on some portion of the Internet path
282	   between the sender and the receiver.  Existing use cases have
283	   different requirements for receiver side functionality, so this
284	   document attempts to provide a high degree of flexibility for the
285	   machine that adds the option to TCP segments.

287	   The HOST_ID option value MUST correlate to IP addresses and/or TCP
288	   port numbers that were changed by the inserting host/device (i.e.,
289	   some of the IP address and/or port number bits are used to generate
290	   the HOST_ID).  Example values that satisfy this requirement include
291	   the following:

293	   Unique ID:  An inserting host/device could maintain a pool of locally
294	      unique ID values that are dynamically mapped to the unique source
295	      IP address values in use behind the host/device as a result of
296	      address sharing.  This ID value would be meaningful only within
297	      the context of a specific shared IP address due to the local
298	      uniqueness characteristic.  Such an ID value could be smaller than
299	      an IP address (e.g. 16-bits) in order to conserve TCP option
300	      space.  This option is preferred because it does not increase IP
301	      address visibility on the forward side of the address sharing
302	      system, and it SHOULD be used in cases where receiver side
303	      requirements can be met without direct inclusion of the original
304	      IP address (e.g. some load balancing uses).

306	   IP Address/Subnet:  An inserting host/device could simply populate
307	      the option value with the IP address value in use behind the host/
308	      device.  In the case of IPv6 addresses, it could be difficult to
309	      include the full address due to TCP option space constraints, so
310	      the value would likely need to provide only a portion of the
311	      address (e.g. the first 64 bits).

313	   IP Address and TCP Port:  Some networks share public IP addresses
314	      among multiple subscribers with a portion of the TCP port number
315	      space being assigned to each subscriber [RFC6346].  When such a
316	      system is behind an address sharing host/device, inclusion of both
317	      the IP address and the TCP port number will more uniquely identify
318	      the sending host than just the IP address on its own.
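
   The first two example value types can be sketched as follows.  This
   is a minimal illustration and not a description of any deployed
   implementation: the class and function names are hypothetical, the
   unique IDs are fixed at 16 bits, and expiry or reuse of IDs is
   omitted.

```python
import ipaddress


class UniqueIdPool:
    """Maps internal source addresses to locally unique 16-bit IDs."""

    def __init__(self) -> None:
        self._by_addr = {}
        self._next_id = 1

    def id_for(self, internal_addr: str) -> bytes:
        """Return the 2-octet ID for an address, allocating one if needed."""
        addr = ipaddress.ip_address(internal_addr)
        if addr not in self._by_addr:
            if self._next_id > 0xFFFF:
                raise RuntimeError("unique ID pool exhausted")
            self._by_addr[addr] = self._next_id
            self._next_id += 1
        return self._by_addr[addr].to_bytes(2, "big")


def address_value(internal_addr: str) -> bytes:
    """Return the full IPv4 address, or the first 64 bits of an IPv6 one."""
    addr = ipaddress.ip_address(internal_addr)
    if addr.version == 6:
        return addr.packed[:8]   # a portion of the address, to save space
    return addr.packed
```

   The unique ID variant reveals nothing about the internal address to
   the receiver, which is why the text above prefers it when the
   receiver does not need the original address.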

320	   When multiple host identifiers are necessary (e.g.  an IP address and
321	   a port number), the HOST_ID option is included multiple times within
322	   the packet, once for each identifier.  While this approach
323	   significantly increases option space utilization when multiple
324	   identifiers are included, cases where only a single identifier is
325	   included are expected to be more common and thus it is beneficial to
326	   optimize for those cases.  Note that some middleboxes might reorder
327	   TCP options, so this method could be problematic if such a middlebox
328	   is in-path between the address sharing system and the receiver.  This
329	   has not proven to be a problem for existing use cases.
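
   To make the option space cost concrete, the sketch below emits one
   HOST_ID option per identifier for the "IP Address and TCP Port" case.
   The helper names are hypothetical, and placing the address option
   before the port option is an arbitrary choice of this sketch.

```python
import ipaddress
import struct


def host_id_option(value: bytes) -> bytes:
    """One HOST_ID option: Kind 253, Length, ExID 0x0348, then the value."""
    return struct.pack("!BBH", 253, 4 + len(value), 0x0348) + value


def address_and_port_options(addr: str, port: int) -> bytes:
    """Encode an address and a port as two separate HOST_ID options."""
    addr_opt = host_id_option(ipaddress.ip_address(addr).packed)
    port_opt = host_id_option(struct.pack("!H", port))
    return addr_opt + port_opt
```

   With an IPv4 address this consumes 8 + 6 = 14 octets of option space,
   of which 8 octets are the repeated Kind, Length, and ExID fields.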

331	   See Section 8 below for discussion of privacy considerations related
332	   to selection of HOST_ID values.

334	4.2.  Sending Host Requirements

336	   The HOST_ID option MUST only be added by the sending host or any
337	   device involved in the forwarding path that changes IP addresses and/
338	   or TCP port numbers (e.g., NAT44 [RFC3022], Layer-2 Aware NAT, DS-
339	   Lite AFTR [RFC6333], NPTv6 [RFC6296], NAT64 [RFC6146], Dual-Stack
340	   Extra Lite [RFC6619], TCP Proxy, etc.).  The HOST_ID option MUST NOT
341	   be added or modified en-route by any device that does not modify IP
342	   addresses and/or TCP port numbers.

344	   The sending host or intermediary device cannot determine whether the
345	   option value is used in a stateful manner by the receiver, nor can it
346	   determine whether SYN cookies are in use by the receiver.  For this
347	   reason, the option MUST be included in all segments, both SYN and
348	   non-SYN segments, until return segments from the receiver positively
349	   indicate that the TCP connection is fully established on the receiver
350	   (e.g. the return segment either includes or acknowledges data).
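
   One way to track this rule per connection is sketched below.  The
   class is hypothetical, it considers only the two example signals
   named above (a return segment that carries data or that acknowledges
   data), and it compares sequence numbers modulo 2^32.

```python
class HostIdSender:
    """Tracks whether HOST_ID still has to be added to outgoing segments."""

    def __init__(self, iss: int) -> None:
        # Sequence number of the first data octet (the SYN consumes one).
        self.first_data_seq = (iss + 1) % 2**32
        self.established_at_receiver = False

    def on_return_segment(self, ack_flag: bool, ack: int, payload_len: int) -> None:
        """Update state from a segment sent by the receiver."""
        # A SYN/ACK acknowledges only the SYN (ack == first_data_seq), so
        # it does not count; anything beyond that acknowledges data.
        advance = (ack - self.first_data_seq) % 2**32
        acked_data = ack_flag and (0 < advance < 2**31)
        if payload_len > 0 or acked_data:
            self.established_at_receiver = True

    def include_option(self) -> bool:
        """True while outgoing segments must still carry HOST_ID."""
        return not self.established_at_receiver
```

   A SYN/ACK alone leaves include_option() true, because a receiver
   using SYN cookies holds no connection state at that point.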

352	4.2.1.  Alternative SYN Cookie Support

354	   The authors have also considered an alternative approach to SYN
355	   cookie support in which the receiving host (i.e.  the host that
356	   accepts the TCP connection) echoes the option back to the sender in
357	   the SYN/ACK segment when a SYN cookie is being sent.  This would
358	   allow the host sending HOST_ID to determine whether further inclusion
359	   of the option is necessary.  This approach would have the benefit of
360	   not requiring inclusion of the option in non-SYN segments if SYN
361	   cookies had not been used.  Unfortunately, this approach fails if the
362	   responding host itself does not support the option, since an
363	   intermediate node would have no way to determine that SYN cookies had
364	   been used.

366	4.2.2.  Persistent TCP Connections

368	   Some types of middleboxes (e.g. application proxy) open and maintain
369	   persistent TCP connections to regularly visited destinations in order
370	   to minimize connection establishment burden.  Such middleboxes might
371	   use a single persistent TCP connection for multiple different client
372	   hosts over the life of the persistent connection.

374	   This specification does not attempt to support the use of persistent
375	   TCP connections for multiple client hosts due to the perceived
376	   complexity of providing such support.  Instead, the HOST_ID option is
377	   only allowed to be used at connection initiation.  An inserting host/
378	   device that supports both the HOST_ID option and multi-client
379	   persistent TCP connections MUST NOT apply the HOST_ID option to TCP
380	   connections that could be used for multiple clients over the life of
381	   the connection.  If the HOST_ID option was sent during connection
382	   initiation, the inserting host/device MUST NOT reuse the connection
383	   for data flows originating from a client that would require a
384	   different HOST_ID value.
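
   The reuse rule can be expressed as a small check in a connection
   pool.  The types and names below are hypothetical; the sketch assumes
   the pool records which HOST_ID value, if any, was sent when each
   upstream connection was opened.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UpstreamConnection:
    # HOST_ID value sent at connection initiation, or None if the
    # connection was opened without the option.
    host_id_sent: Optional[bytes]


def may_reuse(conn: UpstreamConnection, client_host_id: bytes) -> bool:
    """True if conn may carry a data flow from the given client."""
    if conn.host_id_sent is None:
        # Multi-client connections are opened without HOST_ID, so no
        # identifier can be misattributed to this client.
        return True
    # A connection that announced a HOST_ID is reserved for that value.
    return conn.host_id_sent == client_host_id
```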

386	4.2.3.  Packet Fragmentation

388	   In order to avoid the overhead associated with in-path IP
389	   fragmentation, it is desirable for the inserting host/device to avoid
390	   including the HOST_ID option when IP fragmentation might be required.
391	   This is not a firm requirement, though, because the HOST_ID option is
392	   only included in the first few packets of a TCP connection and thus
393	   associated IP fragmentation will generally have minimal impact.  The
394	   option SHOULD NOT be included in packets if the resulting packet
395	   would require local fragmentation.

397	   It can be difficult to determine whether local fragmentation would be
398	   required.  For example, in cases where multiple interfaces with
399	   different MTUs are in use, a local routing decision has to be made
400	   before the MTU can be determined and in some systems this decision
401	   could be made after TCP option handling is complete.  Additionally,
402	   it could be true that inclusion of the option causes the packet to
403	   violate the path's MTU but that the path's MTU has not been learned
404	   yet on the sending host/device.

406	   In existing deployed systems, the impact of IP fragmentation that
407	   results from use of the option has been minimal.
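
   A simple local check along these lines is sketched below.  It is an
   illustration only: the function names are hypothetical, the growth of
   the packet is estimated as the option length rounded up to a 4 octet
   boundary, and an unknown MTU is treated as "include the option",
   which is a policy choice of this sketch.

```python
from typing import Optional


def padded_option_len(option_len: int) -> int:
    """Worst-case header growth: TCP options are padded to 4 octets."""
    return (option_len + 3) // 4 * 4


def should_include_host_id(packet_len: int, option_len: int,
                           mtu: Optional[int]) -> bool:
    """Decide whether adding HOST_ID avoids local fragmentation.

    packet_len is the IP packet length before the option is added, and
    mtu is the outgoing interface MTU, or None if it is not yet known.
    """
    if mtu is None:
        # The MTU may not be known when options are processed.
        return True
    return packet_len + padded_option_len(option_len) <= mtu
```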

409	4.3.  Multiple In-Path HOST_ID Senders

411	   The possibility exists that there could be multiple in-path hosts/
412	   devices configured to insert the HOST_ID option.  For example, the
413	   client's TCP packets might first traverse a CGN device on their way
414	   to the edge of a public Internet overlay network.  In order for the
415	   HOST_ID value to most uniquely identify the sender, it needs to
416	   represent both the identity observed by the CGN device (the
417	   subscriber's internal IP address, e.g.  [RFC6598]) and the identity
418	   observed by the overlay network (the shared address of the CGN
419	   device).  The mechanism for handling the received HOST_ID value could
420	   vary depending upon the nature of the new HOST_ID value to be
421	   inserted, as described below.

423	   The problem of multiple in-path HOST_ID senders has not been observed
424	   in existing deployed systems.  For this reason, existing
425	   implementations do not consistently support this scenario.  Some
426	   systems do not propagate forward the received HOST_ID option value in
427	   any way, while other systems follow the guidance described below.

429	   An inserting host/device that uses the received packet's source IP
430	   address as the HOST_ID value (possibly along with the port) MUST
431	   propagate forward the HOST_ID value(s) from the received packet,
432	   since the source IP address and port only represent the previous in-
433	   path address sharing device and do not represent the original sender.
434	   In the CGN-plus-overlay example, this means that the overlay will
435	   include both the CGN's HOST_ID value(s) and a HOST_ID with the source
436	   IP address received by the overlay.

438	   An inserting host/device that sends a unique ID (as described in
439	   Section 4.1) has two options for how to handle the HOST_ID value(s)
440	   from the received packet.

442	   1.  A host/device that sends a unique ID MAY strip the received
443	       HOST_ID option and insert its own option, provided that it uses
444	       the received HOST_ID value as a differentiator for selecting the
445	       unique ID.  What this means in the CGN-plus-overlay example above
446	       is that the overlay is allowed to drop the HOST_ID value inserted
447	       by the CGN provided that the HOST_ID value selected by the
448	       overlay represents both the CGN itself and the HOST_ID value
449	       inserted by the CGN.

451	   2.  A host/device that sends a unique ID MAY instead select a unique
452	       ID that represents only the previous in-path address-sharing
453	       host/device and propagate forward the HOST_ID value inserted by
454	       the previous host/device.  In the CGN-plus-overlay example, this
455	       means that the overlay would include both the CGN's HOST_ID value
456	       and a HOST_ID with a unique ID of its own that was selected to
457	       represent the CGN's shared address.

459	   An inserting host/device that sends a unique ID MUST use one of the
460	   above two mechanisms.
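
   The three behaviors described in this section can be compared side
   by side in the following sketch.  All names are hypothetical, the
   unique IDs are 16 bits wide with no expiry, and appending the local
   value after the received values is an assumption, since this memo
   does not define an order.

```python
from typing import Dict, Hashable, List, Sequence


class IdAllocator:
    """Hands out locally unique 2-octet IDs, one per distinct key."""

    def __init__(self) -> None:
        self._ids: Dict[Hashable, bytes] = {}

    def id_for(self, key: Hashable) -> bytes:
        if key not in self._ids:
            self._ids[key] = (len(self._ids) + 1).to_bytes(2, "big")
        return self._ids[key]


def propagate_with_source_address(received: Sequence[bytes],
                                  src_addr: bytes) -> List[bytes]:
    """Source address as HOST_ID: received values MUST be kept."""
    return [*received, src_addr]


def strip_and_replace(alloc: IdAllocator, received: Sequence[bytes],
                      src_addr: bytes) -> List[bytes]:
    """Mechanism 1: one ID that represents both the previous device
    and the HOST_ID value(s) it inserted."""
    return [alloc.id_for((src_addr, tuple(received)))]


def propagate_with_unique_id(alloc: IdAllocator, received: Sequence[bytes],
                             src_addr: bytes) -> List[bytes]:
    """Mechanism 2: keep received values, add an ID for the previous
    device only."""
    return [*received, alloc.id_for(src_addr)]
```

   In the CGN-plus-overlay example, strip_and_replace gives two
   subscribers behind the same CGN different IDs because the CGN's
   HOST_ID value is part of the allocation key.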

462	5.  Option Interpretation

464	   Due to the variable nature of the option value, it is not possible
465	   for the receiving machine to reliably determine the value type from
466	   the option itself.  For this reason, a receiving host/device SHOULD
467	   interpret the option value as an opaque identifier.

469	   This specification allows the inserting host/device to provide
470	   multiple HOST_ID options.  The order of appearance of TCP options
471	   could be modified by some middleboxes, so receivers SHOULD NOT rely
472	   on option order to provide additional meaning to the individual
473	   options.  Instead, when multiple HOST_ID options are present, their
474	   values SHOULD be concatenated together in the order in which they
475	   appear in the packet and treated as a single large identifier.

477	   For both of the receiver requirements discussed above, this
478	   specification uses SHOULD rather than MUST because reliable
479	   interpretation and ordering of options could be possible if the
480	   inserting host and the interpreting host are under common
481	   administrative control and integrity-protect communication between
482	   the inserting host and the interpreting host.  Mechanisms for
483	   signaling the value type(s) and integrity protection are not provided
484	   by this specification, and in their absence the receiving host/device
485	   MUST interpret the option value(s) as a single opaque identifier.
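
   A receiver following this guidance could extract the identifier as
   sketched below.  The function name is hypothetical, and stopping at
   the first malformed option is an assumption of the sketch.  The input
   is the raw options area of a TCP header.

```python
from typing import Optional


def opaque_host_id(options: bytes) -> Optional[bytes]:
    """Concatenate all HOST_ID values, in order of appearance."""
    values = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # No-Operation, a single octet
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated: no Length octet
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                # malformed Length
        # Kind 253 with ExID 0x0348 and at least one octet of host ID.
        if kind == 253 and length > 4 and options[i + 2:i + 4] == b"\x03\x48":
            values.append(options[i + 4:i + length])
        i += length
    # The result is treated as a single opaque identifier.
    return b"".join(values) or None
```

   The function deliberately returns one byte string rather than a list,
   so that callers are not tempted to assign meaning to the individual
   options.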

487	6.  Interaction with Other TCP Options

489	   This section details how the HOST_ID option functions in conjunction
490	   with other TCP options.

492	6.1.  Multipath TCP (MPTCP)

494	   TCP provides for a maximum of 40 octets for TCP options.  As
495	   discussed in Appendix A of MPTCP [RFC6824], a typical SYN from
496	   modern, popular operating systems contains several TCP options (MSS,
497	   window scale, SACK permitted, and timestamp) which consume 19-24
498	   octets depending on word alignment of the options.  The initial SYN
499	   from a multipath TCP client would consume an additional 12 octets.

501	   HOST_ID needs at least 6 octets to be useful, so 9-21 octets are
502	   sufficient for many scenarios that benefit from HOST_ID.  However, 4
503	   octets are not enough space for the HOST_ID option.  Thus, a TCP SYN
504	   containing all the typical TCP options (MSS, window scale, SACK
505	   permitted, timestamp), and also containing multipath capable or
506	   multipath join, and also being word aligned, has insufficient space
507	   to accommodate HOST_ID.  This means something has to give.  The
508	   choices are either to avoid word alignment in that case (freeing 5
509	   octets) or avoid adding the HOST_ID option.  Each of these approaches
510	   is used in existing implementations and has been deemed acceptable
511	   for the associated use case.
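
   The option space arithmetic can be checked with the short sketch
   below.  It takes the 19 and 24 octet figures from the text above and
   assumes a 12 octet MP_CAPABLE option in the SYN, which is the length
   specified in [RFC6824] for that segment; the function names are
   hypothetical.

```python
TCP_OPTION_SPACE = 40      # maximum octets of TCP options
HOST_ID_MIN_LEN = 6        # Kind + Length + ExID + 2-octet host ID
MP_CAPABLE_SYN_LEN = 12    # assumption: MP_CAPABLE length in a SYN


def remaining_space(typical_len: int, mptcp: bool = False) -> int:
    """Octets left after the typical options and, optionally, MPTCP."""
    used = typical_len + (MP_CAPABLE_SYN_LEN if mptcp else 0)
    return TCP_OPTION_SPACE - used


def host_id_fits(typical_len: int, mptcp: bool = False) -> bool:
    """True if a minimal HOST_ID option fits in the remaining space."""
    return remaining_space(typical_len, mptcp) >= HOST_ID_MIN_LEN


if __name__ == "__main__":
    for typical_len in (19, 24):        # unaligned, word aligned
        for mptcp in (False, True):
            print(typical_len, mptcp,
                  remaining_space(typical_len, mptcp),
                  host_id_fits(typical_len, mptcp))
```

   Under these assumptions the remaining space is 21, 9, 16, and 4
   octets respectively, and only the word-aligned MPTCP case leaves too
   little room for HOST_ID.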

513	6.2.  Authentication Option (TCP-AO)

515	   The TCP-AO option [RFC5925] is incompatible with address sharing due
516	   to the fact that it provides integrity protection of the source IP
517	   address.  For this reason, the only use cases where it makes sense to
518	   combine TCP-AO and HOST_ID are those where the TCP-AO-NAT extension
519	   [RFC6978] is in use.  Injecting a HOST_ID TCP option does not
520	   interfere with the use of TCP-AO-NAT because the TCP options are not
521	   included in the MAC calculation.

523	6.3.  TCP Fast Open (TFO)

525	   The TFO option [RFC7413] uses a zero length cookie (total option
526	   length 2 bytes) to request a TFO cookie for use on future
527	   connections.  The server-generated TFO cookie is required to be at
528	   least 4 bytes long and allowed to be as long as 16 bytes (total
529	   option length 6 to 18 bytes).  The cookie request form of the option
530	   leaves enough room available in a SYN packet with the most commonly
531	   used options to accommodate the HOST_ID option, but a valid TFO
532	   cookie length of any longer than 13 bytes would prevent even the
533	   minimal 6 byte HOST_ID option from being included in the header.
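
   The 13 byte limit follows from the option space budget, as the sketch
   below shows.  It assumes the 19 octet unaligned figure for the most
   commonly used options from Section 6.1 and a TFO option length of 2
   bytes plus the cookie; the function names are hypothetical.

```python
TCP_OPTION_SPACE = 40   # maximum octets of TCP options
TFO_OVERHEAD = 2        # Kind and Length octets of the TFO option


def host_id_fits_with_tfo(cookie_len: int, typical_len: int = 19,
                          host_id_len: int = 6) -> bool:
    """True if HOST_ID fits next to a TFO option with this cookie."""
    used = typical_len + TFO_OVERHEAD + cookie_len
    return TCP_OPTION_SPACE - used >= host_id_len


def max_tfo_cookie_len(typical_len: int = 19, host_id_len: int = 6) -> int:
    """Largest cookie length that still leaves room for HOST_ID."""
    return TCP_OPTION_SPACE - typical_len - TFO_OVERHEAD - host_id_len
```

   With the defaults, max_tfo_cookie_len() returns 13, and a cookie
   request (cookie length 0) leaves 19 octets free.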

535	   There are multiple possibilities for allowing TFO and HOST_ID to be
536	   supported for the same connection, including:

538	   o  If the TFO implementation allows the cookie size to be
539	      configurable, the configured cookie size can be specifically
540	      selected to leave enough option space available in a typical TFO
541	      SYN packet to allow inclusion of the HOST_ID option.

543	   o  If the TFO implementation provides explicit support for the
544	      HOST_ID option, it can be designed to use a shorter cookie length
545	      when the HOST_ID option is present in the TFO cookie request SYN.

547	   Reducing the TFO cookie size in order to include the HOST_ID option
548	   could have unacceptable security implications, and so existing
549	   deployed systems that use the HOST_ID option consider TFO and HOST_ID
550	   to be mutually exclusive and do not support the use of both options
551	   on the same TCP connection.

553	   It should also be noted that the presence of data in a TFO SYN
554	   increases the likelihood that there will be no space available in the
555	   SYN packet to support inclusion of the HOST_ID option without IP
556	   fragmentation, even if there is enough room in the TCP option space.
557	   This is an additional reason existing systems consider TFO and HOST_ID
558	   to be mutually exclusive.

560	7.  Security Considerations

562	   Security (including privacy) considerations common to all HOST_ID
563	   solutions are discussed in [RFC6967].

565	   The content of the HOST_ID option SHOULD NOT be used for purposes
566	   that require a trust relationship between the sender and the receiver
567	   (e.g. billing and/or subscriber policy enforcement).  This
568	   requirement uses SHOULD rather than MUST because reliable
569	   interpretation of options could be possible if the inserting host and
570	   the interpreting host are under common administrative control and
571	   the communication between them is integrity protected.
572	   Mechanisms for signaling the value type(s) and
573	   integrity protection are not provided by this specification, and in
574	   their absence the receiving host/device MUST NOT use the HOST_ID
575	   value for purposes that require a trust relationship.

577	   Note that the above trust requirement applies equally to HOST_ID
578	   option values propagated forward from a previous in-path host as
579	   described in Section 4.3.  In other words, if the trust mechanism
580	   does not apply to all option values in the packet, then none of the
581	   HOST_ID values can be considered trusted and the receiving host/
582	   device MUST NOT use any of the HOST_ID values for purposes that
583	   require a trust relationship.  An inserting host/device that has such
584	   a trust relationship MUST NOT propagate forward an untrusted HOST_ID
585	   in such a way as to allow it to be considered trusted.

587	   When the receiving network uses the values provided by the option in
588	   a way that does not require trust (e.g. maintaining session affinity
589	   in a load-balancing system), then use of a mechanism to enforce the
590	   trust relationship is OPTIONAL.
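   The all-or-nothing trust rule in this section can be sketched as a
   receiver-side policy check.  This is a minimal illustration, not a
   defined mechanism: the HostIdValue type and its
   covered_by_trust_mechanism flag are hypothetical, standing in for
   whatever out-of-band integrity protection the operator deploys (none
   is provided by this specification).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HostIdValue:
    """One HOST_ID option value taken from a received SYN (hypothetical type)."""
    value: bytes
    # Hypothetical flag: True only if an operator-provided mechanism,
    # outside this specification, vouches for this value.
    covered_by_trust_mechanism: bool


def usable_host_ids(values: List[HostIdValue],
                    purpose_requires_trust: bool) -> List[bytes]:
    """Return the HOST_ID values that may be used for the given purpose."""
    if not purpose_requires_trust:
        # For example session affinity: no trust enforcement is needed.
        return [v.value for v in values]
    # Trust-requiring purposes: if any value in the packet is not covered
    # by the trust mechanism, none of the values may be used.
    if values and all(v.covered_by_trust_mechanism for v in values):
        return [v.value for v in values]
    return []
```

   For example, a packet carrying one covered value and one propagated,
   uncovered value yields an empty list for billing but both values for
   load balancing.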

592	8.  Privacy Considerations

594	   Sending a TCP SYN across the public Internet necessarily discloses
595	   the public IP address of the sending host.  When an intermediate
596	   address sharing device is deployed on the public Internet, anonymity
597	   of the hosts using the device will be increased, with hosts
598	   represented by multiple source IP addresses on the ingress side of
599	   the device using a single source IP address on the egress side.  The
600	   HOST_ID TCP option removes that increased anonymity, taking
601	   information that was already visible in TCP packets on the public
602	   Internet on the ingress side of the address sharing device and making
603	   it available on the egress side of the device as well.  In some
604	   cases, an explicit purpose of the address sharing device is
605	   anonymity, in which case use of the HOST_ID TCP option would be
606	   incompatible with the purpose of the device.

608	   A NAT device used to provide interoperability between a local area
609	   network (LAN) using private [RFC1918] IP addresses and the public
610	   Internet is sometimes specifically intended to provide anonymity for
611	   the LAN clients as described in the above paragraph.  For this
612	   reason, address sharing devices at the border between a private LAN
613	   and the public Internet MUST NOT insert the HOST_ID option.

615	   The HOST_ID option MUST NOT be used to provide client geographic or
616	   network location information that was not publicly visible in IP
617	   packets for the TCP flows processed by the inserting host.  For
618	   example, the client's IP address MAY be used as the HOST_ID option
619	   value, but any geographic or network location information derived
620	   from the client's IP address MUST NOT be used as the HOST_ID value.

622	   The HOST_ID option MAY provide differentiating information that is
623	   locally unique such that individual TCP flows processed by the
624	   inserting host can be reliably identified.  The HOST_ID option MUST
625	   NOT provide client identification information that was not publicly
626	   visible in IP packets for the TCP flows processed by the inserting
627	   host, such as subscriber information linked to the IP address.

629	   The HOST_ID value MUST be changed whenever the subscriber IP address
630	   changes.  This requirement ensures that the HOST_ID option does not
631	   introduce a new globally unique identifier that persists across
632	   subscriber IP address changes.

634	   The HOST_ID option MUST be stripped from IP packets traversing middle
635	   boxes that provide network-based anonymity services.
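   A middlebox that provides anonymity could strip the option along the
   following lines.  This is a sketch under stated assumptions: it
   operates on the raw TCP options bytes only, recognizes HOST_ID by the
   experimental option kinds 253 and 254 from [RFC6994] together with the
   ExID 0x0348 given in Section 10, and overwrites the option with NOPs
   so that the header length is unchanged.  The function name is
   hypothetical, and a real implementation would also have to update the
   TCP checksum.

```python
EXID_HOST_ID = 0x0348            # ExID given in Section 10
EXPERIMENTAL_KINDS = (253, 254)  # shared experimental option kinds [RFC6994]
KIND_EOL, KIND_NOP = 0, 1


def strip_host_id(options: bytes) -> bytes:
    """Replace every HOST_ID option in a TCP options field with NOPs.

    The returned field has the same length as the input, so the data
    offset of the TCP header does not change.  The checksum is NOT
    updated here.
    """
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == KIND_EOL:
            break              # end of option list
        if kind == KIND_NOP:
            i += 1             # single-octet option
            continue
        if i + 1 >= len(out):
            break              # truncated option; leave the rest untouched
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break              # malformed length; leave the rest untouched
        is_host_id = (
            kind in EXPERIMENTAL_KINDS
            and length >= 4
            and int.from_bytes(out[i + 2:i + 4], "big") == EXID_HOST_ID
        )
        if is_host_id:
            out[i:i + length] = bytes([KIND_NOP]) * length
        i += length
    return bytes(out)
```

   Overwriting with NOPs rather than removing the bytes is a design
   choice made here for simplicity; it avoids recomputing the data offset
   and shifting the remaining options.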

637	9.  Pervasive Monitoring Considerations

639	   [RFC7258] provides the following guidance: "those developing IETF
640	   specifications need to be able to describe how they have considered
641	   Pervasive Monitoring, and, if the attack is relevant to the work to
642	   be published, be able to justify related design decisions."
643	   Legitimate concerns about host identification have been raised within
644	   the IETF.  The authors of this memo have attempted to address those
645	   concerns by providing details about the nature of the HOST_ID values
646	   and the types of middleboxes that should and should not be including
647	   the HOST_ID option in TCP headers; this guidance reflects limitations
648	   already imposed by existing deployed systems.  This section is
649	   intended to highlight some particularly important aspects of this
650	   design and the related guidance/limitations that are relevant to the
651	   pervasive monitoring discussion.

653	   When a generated identifier is used, this document prohibits the
654	   address sharing device from using globally unique or permanent
655	   identifiers.  Only locally unique identifiers are allowed.  As with
656	   persistent IP addresses, persistent HOST_ID values could facilitate
657	   user tracking and are therefore prohibited.  The specific
658	   requirements for permissible HOST_ID values are discussed in
659	   Section 8 and Section 4.1.

661	   This specification does not target exposing a host beyond what the
662	   original packet, issued from that host, would have already exposed on
663	   the public Internet without introduction of the option.  The option
664	   is intended only to carry forward information that was conveyed to
665	   the address-sharing device in the original packet, and HOST_ID option
666	   values that do not match this description are prohibited by
667	   requirements discussed in Section 8.  This design does not allow the
668	   HOST_ID option to carry personally identifiable information,
669	   geographic location identifiers, or any other information that is not
670	   available in the wire format of the associated TCP/IP headers.

672	   This document's guidance on option values is followed in existing
673	   deployed systems.  Thus, the volatility of the information conveyed in
674	   a HOST_ID option is similar to that of the public, subscriber IP
675	   address.  A distinct HOST_ID is used by the address-sharing function
676	   when the host reboots or gets a new public IP address from the
677	   subscriber network.

679	   The described TCP option allows network identification to a similar
680	   level as the first 64 bits of an IPv6 address.  That is, the server
681	   can use the bits of the TCP option to help identify a host behind an
682	   address-sharing device, in much the same way the server would use the
683	   host's IPv6 network address if the client and server were using IPv6
684	   end-to-end.

686	   Some address-sharing middleboxes on the public Internet have the
687	   express intention of providing originator anonymity.  Publication of
688	   this document can help such middleboxes recognize the associated risk
689	   and take action to mitigate it (e.g. by stripping or modifying the
690	   option value).

692	10.  IANA Considerations

694	   This document specifies a new TCP option that uses the shared
695	   experimental options format [RFC6994], with ExID=0x0348 (840) in
696	   network-standard byte order.  This ExID has already been registered
697	   with IANA.
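   As an illustration of the shared experimental format, the option
   header can be serialized as shown below.  This is a minimal sketch: it
   assumes the [RFC6994] layout (Kind, Length, 16-bit ExID in network
   byte order) and treats the HOST_ID value as opaque bytes whose format
   is defined elsewhere in this document.  The function name is
   hypothetical.

```python
import struct

EXID_HOST_ID = 0x0348            # 840, as registered with IANA
EXPERIMENTAL_KINDS = (253, 254)  # shared experimental option kinds [RFC6994]
TCP_OPTION_SPACE = 40            # maximum size of the TCP options field


def build_host_id_option(value: bytes, kind: int = 253) -> bytes:
    """Serialize a HOST_ID option: Kind, Length, ExID, then the opaque value."""
    if kind not in EXPERIMENTAL_KINDS:
        raise ValueError("kind must be 253 or 254")
    length = 4 + len(value)  # kind + length + 2-byte ExID + value
    if length > TCP_OPTION_SPACE:
        raise ValueError("option does not fit in the TCP option space")
    # "!" selects network (big-endian) byte order for the 16-bit ExID.
    return struct.pack("!BBH", kind, length, EXID_HOST_ID) + value


if __name__ == "__main__":
    print(build_host_id_option(b"\x12\x34").hex())
```

   With a 2-byte value this produces the minimal 6-byte option mentioned
   in Section 6.3.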

699	11.  Acknowledgements

701	   Many thanks to W. Eddy, Y. Nishida, T. Reddy, M. Scharf, J. Touch,
702	   A. Zimmermann, and A. Falk for their comments.

704	12.  References

706	12.1.  Normative References

708	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
709	              RFC 793, DOI 10.17487/RFC0793, September 1981,
710	              <http://www.rfc-editor.org/info/rfc793>.

712	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
713	              Requirement Levels", BCP 14, RFC 2119,
714	              DOI 10.17487/RFC2119, March 1997,
715	              <http://www.rfc-editor.org/info/rfc2119>.

717	   [RFC6994]  Touch, J., "Shared Use of Experimental TCP Options",
718	              RFC 6994, DOI 10.17487/RFC6994, August 2013,
719	              <http://www.rfc-editor.org/info/rfc6994>.

721	12.2.  Informative References

723	   [I-D.abdo-hostid-tcpopt-implementation]
724	              Abdo, E., Boucadair, M., and J. Queiroz, "HOST_ID TCP
725	              Options: Implementation & Preliminary Test Results",
726	              draft-abdo-hostid-tcpopt-implementation-03 (work in
727	              progress), July 2012.

729	   [I-D.williams-overlaypath-ip-tcp-rfc]
730	              Williams, B., "Overlay Path Option for IP and TCP", draft-
731	              williams-overlaypath-ip-tcp-rfc-04 (work in progress),
732	              June 2013.

734	   [I-D.wing-nat-reveal-option]
735	              Yourtchenko, A. and D. Wing, "Revealing hosts sharing an
736	              IP address using TCP option", draft-wing-nat-reveal-
737	              option-03 (work in progress), December 2011.

739	   [RFC1918]  Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G.,
740	              and E. Lear, "Address Allocation for Private Internets",
741	              BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996,
742	              <http://www.rfc-editor.org/info/rfc1918>.

744	   [RFC1919]  Chatel, M., "Classical versus Transparent IP Proxies",
745	              RFC 1919, DOI 10.17487/RFC1919, March 1996,
746	              <http://www.rfc-editor.org/info/rfc1919>.

748	   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
749	              Address Translator (Traditional NAT)", RFC 3022,
750	              DOI 10.17487/RFC3022, January 2001,
751	              <http://www.rfc-editor.org/info/rfc3022>.

753	   [RFC3135]  Border, J., Kojo, M., Griner, J., Montenegro, G., and Z.
754	              Shelby, "Performance Enhancing Proxies Intended to
755	              Mitigate Link-Related Degradations", RFC 3135,
756	              DOI 10.17487/RFC3135, June 2001,
757	              <http://www.rfc-editor.org/info/rfc3135>.

759	   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
760	              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
761	              June 2010, <http://www.rfc-editor.org/info/rfc5925>.

763	   [RFC6146]  Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful
764	              NAT64: Network Address and Protocol Translation from IPv6
765	              Clients to IPv4 Servers", RFC 6146, DOI 10.17487/RFC6146,
766	              April 2011, <http://www.rfc-editor.org/info/rfc6146>.

768	   [RFC6269]  Ford, M., Ed., Boucadair, M., Durand, A., Levis, P., and
769	              P. Roberts, "Issues with IP Address Sharing", RFC 6269,
770	              DOI 10.17487/RFC6269, June 2011,
771	              <http://www.rfc-editor.org/info/rfc6269>.

773	   [RFC6296]  Wasserman, M. and F. Baker, "IPv6-to-IPv6 Network Prefix
774	              Translation", RFC 6296, DOI 10.17487/RFC6296, June 2011,
775	              <http://www.rfc-editor.org/info/rfc6296>.

777	   [RFC6333]  Durand, A., Droms, R., Woodyatt, J., and Y. Lee, "Dual-
778	              Stack Lite Broadband Deployments Following IPv4
779	              Exhaustion", RFC 6333, DOI 10.17487/RFC6333, August 2011,
780	              <http://www.rfc-editor.org/info/rfc6333>.

782	   [RFC6346]  Bush, R., Ed., "The Address plus Port (A+P) Approach to
783	              the IPv4 Address Shortage", RFC 6346,
784	              DOI 10.17487/RFC6346, August 2011,
785	              <http://www.rfc-editor.org/info/rfc6346>.

787	   [RFC6598]  Weil, J., Kuarsingh, V., Donley, C., Liljenstolpe, C., and
788	              M. Azinger, "IANA-Reserved IPv4 Prefix for Shared Address
789	              Space", BCP 153, RFC 6598, DOI 10.17487/RFC6598, April
790	              2012, <http://www.rfc-editor.org/info/rfc6598>.

792	   [RFC6619]  Arkko, J., Eggert, L., and M. Townsley, "Scalable
793	              Operation of Address Translators with Per-Interface
794	              Bindings", RFC 6619, DOI 10.17487/RFC6619, June 2012,
795	              <http://www.rfc-editor.org/info/rfc6619>.

797	   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
798	              "TCP Extensions for Multipath Operation with Multiple
799	              Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013,
800	              <http://www.rfc-editor.org/info/rfc6824>.

802	   [RFC6888]  Perreault, S., Ed., Yamagata, I., Miyakawa, S., Nakagawa,
803	              A., and H. Ashida, "Common Requirements for Carrier-Grade
804	              NATs (CGNs)", BCP 127, RFC 6888, DOI 10.17487/RFC6888,
805	              April 2013, <http://www.rfc-editor.org/info/rfc6888>.

807	   [RFC6967]  Boucadair, M., Touch, J., Levis, P., and R. Penno,
808	              "Analysis of Potential Solutions for Revealing a Host
809	              Identifier (HOST_ID) in Shared Address Deployments",
810	              RFC 6967, DOI 10.17487/RFC6967, June 2013,
811	              <http://www.rfc-editor.org/info/rfc6967>.

813	   [RFC6978]  Touch, J., "A TCP Authentication Option Extension for NAT
814	              Traversal", RFC 6978, DOI 10.17487/RFC6978, July 2013,
815	              <http://www.rfc-editor.org/info/rfc6978>.

817	   [RFC7258]  Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
818	              Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
819	              2014, <http://www.rfc-editor.org/info/rfc7258>.

821	   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
822	              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
823	              <http://www.rfc-editor.org/info/rfc7413>.

825	   [RFC7620]  Boucadair, M., Ed., Chatras, B., Reddy, T., Williams, B.,
826	              and B. Sarikaya, "Scenarios with Host Identification
827	              Complications", RFC 7620, DOI 10.17487/RFC7620, August
828	              2015, <http://www.rfc-editor.org/info/rfc7620>.

830	Appendix A.  Change History

832	   [Note to RFC Editor: Please remove this section prior to
833	   publication.]

835	A.1.  Changes from version 07 to 08

837	   Changed document category from experimental to informational.

839	   Updated text throughout the document to further document that the
840	   option is in use on the public Internet and highlighted specifics of
841	   how the option is used in existing implementations, especially when
842	   those implementations deviate from the document's recommendations.

844	   Added text to further clarify that the document does not represent
845	   IETF consensus, especially due to concerns about privacy and
846	   pervasive monitoring.

848	A.2.  Changes from version 06 to 07

850	   Clarified pervasive monitoring considerations and added back-pointers
851	   to where the requirements are more clearly called out.

853	A.3.  Changes from version 05 to 06

855	   Re-write the introduction to clarify that this document describes a
856	   practice that is in use on the public Internet today, and that the
857	   purpose of the document is to publish design, deployment, and privacy
858	   considerations related to its use.

860	   Correct wording in the abstract to clarify that the IETF has not
861	   indicated support for host identification, but rather that proposals
862	   discussed within the IETF have done so.

864	   Add a section that summarizes the authors' understanding of the
865	   impact on pervasive monitoring to reinforce the importance of
866	   following the document's related guidance.

868	A.4.  Changes from version 04 to 05

870	   Make this document self-contained, rather than referring readers to
871	   use-cases and requirements contained in other I.D.s that were never
872	   published as RFCs.

874	   Add discussion of TCP Fast Open.

876	   Correct some discussion of TCP-AO and TCP-AO-NAT.

878	   Clarify exactly what the identifier is identifying.

880	   Improve discussion on interpretation of multiple instances of the
881	   option, including order of interpretation and set interpretation.

883	   Evaluated whether use of multiple identifiers should be constrained.
884	   This is unclear, and so left for the experiment to determine.

886	   Discuss the possibility of the option value changing over the life of
887	   the connection (spec now prohibits this).

889	   Clarify use cases related to stripping and replacing the option.

891	   Add discussion of non-local fragmentation.

893	   Evaluate the reliability of attempts to exclude the option when local
894	   fragmentation would be required.

896	   Clarify the security requirements re: trust relationship.
897	   Specifically calls out that common admin control and authentication
898	   can allow additional uses.

900	   Clarify privacy considerations regarding NATs that separate private
901	   and public networks.

903	   Remove restatement of requirements from other documents.

905	   Justify use of SHOULD rather than MUST throughout.

907	A.5.  Changes from version 03 to 04

909	   Improve discussion of RFC6967.

911	   Don't use "message" to describe TCP segments.

913	   Add reference to RFC6994 to section 3.

915	   Clarify that this specification supersedes earlier documents.

917	   Improve discussion of SYN cookie handling.

919	   Remove lower case uses of keywords (e.g. must, should, etc.)
920	   throughout the document.

922	   Some stronger privacy guidance, replacing SHOULD with MUST.

924	   Add an experiment goal related to optimal option value.

926	   Add text related to the identification goals of the option value
927	   (still needs more work).

929	A.6.  Changes from version 02 to 03

931	   Clarification of arguments in favor of this approach.

933	   Add discussion of important use cases.

935	   Clarification of experiment goals and earlier test results.

937	A.7.  Changes from version 01 to 02

939	   Add note re: order of appearance.

941	A.8.  Changes from version 00 to 01

943	   Add discussion of experiment goals.

945	   Limit external references to the earlier specifications.

947	   Add guidance to limit the types of device that add the option.

949	   Improve/correct discussion of TCP-AO and security.

951	Authors' Addresses

953	   Brandon Williams
954	   Akamai, Inc.
955	   8 Cambridge Center
956	   Cambridge, MA  02142
957	   USA

959	   Email: brandon.williams@akamai.com

961	   Mohamed Boucadair
962	   France Telecom
963	   Rennes, 35000
964	   France

966	   Email: mohamed.boucadair@orange.com

968	   Dan Wing
969	   Cisco Systems, Inc.
970	   170 West Tasman Drive
971	   San Jose, CA  95134
972	   USA

974	   Email: dwing@cisco.com