idnits 2.17.1

draft-ietf-intarea-nat-reveal-analysis-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 24, 2013) is 4020 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-01) exists of
     draft-boucadair-pcp-nat-reveal-00

  == Outdated reference: A later version (-09) exists of
     draft-donley-behave-deterministic-cgn-05

  == Outdated reference: A later version (-09) exists of
     draft-iab-privacy-considerations-03

  == Outdated reference: A later version (-13) exists of
     draft-ietf-behave-ipfix-nat-logging-00

  == Outdated reference: A later version (-06) exists of
     draft-ietf-behave-syslog-nat-logging-00

  == Outdated reference: A later version (-10) exists of
     draft-ietf-tcpm-fastopen-03

  -- Obsolete informational reference (is this intentional?): RFC 5201
     (Obsoleted by RFC 7401)

     Summary: 0 errors (**), 0 flaws (~~), 7 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

INTAREA WG                                                  M. Boucadair
Internet-Draft                                            France Telecom
Intended status: Informational                                  J. Touch
Expires: October 26, 2013                                        USC/ISI
                                                                P. Levis
                                                          France Telecom
                                                                R. Penno
                                                                   Cisco
                                                          April 24, 2013

Analysis of Solution Candidates to Reveal a Host Identifier (HOST_ID) in
                       Shared Address Deployments
               draft-ietf-intarea-nat-reveal-analysis-10

Abstract

   This document is a collection of solutions to reveal a host
   identifier (denoted as HOST_ID) when a Carrier Grade NAT (CGN) or
   application proxies are involved in the path.  This host identifier
   could be used by a remote server to sort out the packets by sending
   host.  The host identifier must be unique to each host under the same
   shared IP address.

   This document analyzes a set of solution candidates to reveal a host
   identifier; no recommendation is sketched in the document.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 26, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  On HOST_ID
   3.  HOST_ID and Privacy
   4.  Detailed Solutions Analysis
     4.1.  Use the Identification Field of IPv4 Header (IP-ID)
       4.1.1.  Description
       4.1.2.  Analysis
     4.2.  Define an IP Option
       4.2.1.  Description
       4.2.2.  Analysis
     4.3.  Define a TCP Option
       4.3.1.  Description
       4.3.2.  Analysis
     4.4.  Inject Application Protocol Message Headers
       4.4.1.  Description
       4.4.2.  Analysis
     4.5.  PROXY Protocol
       4.5.1.  Description
       4.5.2.  Analysis
     4.6.  Assign Port Sets
       4.6.1.  Description
       4.6.2.  Analysis
     4.7.  Host Identity Protocol (HIP)
       4.7.1.  Description
       4.7.2.  Analysis
     4.8.  Use of a Notification Channel (e.g., ICMP)
       4.8.1.  Description
       4.8.2.  Analysis
     4.9.  Use Out-of-Band Mechanisms (e.g., IDENT)
       4.9.1.  Description
       4.9.2.  Analysis
   5.  Solutions Analysis: Synthesis
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgments
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   As reported in [RFC6269], several issues are encountered when an IP
   address is shared among several subscribers.  These issues are
   encountered in various deployment contexts: e.g., Carrier Grade NAT
   (CGN), application proxies, or A+P [RFC6346].  Examples of such
   issues are: implicit identification (Section 13.2 of [RFC6269]), spam
   (Section 13.3 of [RFC6269]), blacklisting a misbehaving host
   (Section 13.1 of [RFC6269]), or redirecting users with infected
   machines to a dedicated portal (Section 5.1 of [RFC6269]).

   In particular, some servers use the source IPv4 address as an
   identifier to treat some incoming connections differently.  Due to
   the deployment of CGNs (e.g., NAT44 [RFC3022], NAT64 [RFC6146]), that
   address will be shared.
   In particular, when a server receives
   packets from the same source address, because this address is shared,
   the server does not know which host is the sending host [RFC6269].
   The sole use of the IPv4 address is not sufficient to uniquely
   distinguish a host.  As a mitigation, it is tempting to investigate
   means which would help in disclosing information to be used by the
   remote server as a means to uniquely disambiguate packets of hosts
   using the same IPv4 address.

   The risks of not mitigating these issues include: OPEX (Operational
   Expenditure) increase for IP connectivity service providers (costs
   induced by calls to a hotline), revenue loss for content providers
   (loss of user audience) and customers' dissatisfaction (low quality
   of experience, service segregation, etc.).

   The purpose of this document is to analyze a set of alternative
   channels to convey a host identifier and to assess to what extent
   they solve the problem described in Section 2.  The evaluation is
   intended to be comprehensive regardless of the maturity or validity
   of any currently known or proposed solution.  Below are listed the
   alternatives analyzed in the document:

   o  Use the Identification field of IP header (denoted as IP-ID,
      Section 4.1).
   o  Define a new IP option (Section 4.2).
   o  Define a new TCP Option (Section 4.3).
   o  Inject application headers (Section 4.4).
   o  Enable Proxy Protocol (Section 4.5).
   o  Assign port sets (Section 4.6).
   o  Activate HIP (Host Identity Protocol, Section 4.7).
   o  Use a notification channel (Section 4.8).
   o  Use an out-of-band mechanism (Section 4.9).

   A synthesis is provided in Section 5 while the detailed analysis is
   elaborated in Section 4.

   Section 3 discusses privacy issues common to all candidate solutions.
   It is out of scope of this document to elaborate on privacy issues
   specific to each solution.

   This document does not include any recommendation because the working
   group felt it was premature to include one.

2.  On HOST_ID

   Policies relying on source IP address which are enforced by some
   servers will be applied to all hosts sharing the same IP address.
   For example, blacklisting the IP address of a spammer host will
   result in all other hosts sharing that address having their access to
   the requested service restricted.  [RFC6269] describes the issues in
   detail.  Therefore, due to address sharing, servers need extra
   information beyond the source IP address to differentiate the sending
   host.  We call this information the HOST_ID.

   HOST_ID identifies a host under a shared IP address.  Privacy-related
   considerations are discussed in Section 3.

   Within this document, a host can be any computer located behind a
   Home Gateway or directly connected to an address-sharing function
   located in the network provider's domain (typically this would be the
   Home Gateway itself).

   Because HOST_ID is used by a remote server to sort out the packets by
   sending host, HOST_ID must be unique to each host under the same
   shared IP address, where possible.  In the case where only the Home
   Gateway is revealed to the operator side of the translation function,
   HOST_ID need only be unique to the Home Gateway.  HOST_ID does not
   need to be globally unique.  Of course, the combination of the
   (public) IP source address and the identifier (i.e., HOST_ID) ends up
   being unique.

   If the HOST_ID is conveyed at the IP level, all packets will have to
   bear the identifier.  If it is conveyed at a higher connection-
   oriented level, the identifier is only needed once in the session
   establishment phase (for instance, the TCP three-way handshake); all
   packets received in this session are then attributed to the HOST_ID
   designated during the session opening.
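
   The per-session attribution described above can be sketched as
   follows.  This is only an illustration under stated assumptions: the
   class name SessionTable, its methods, and the sample addresses are
   hypothetical and are not defined by this document or by any HOST_ID
   proposal.  The sketch assumes the server learns the HOST_ID once, at
   session establishment, and keys its policy on the pair (public source
   IP address, HOST_ID), which is unique even though the HOST_ID alone
   is only locally unique.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# (source IP, source port, destination IP, destination port) of a session
FlowKey = Tuple[str, int, str, int]
# (public source IP, HOST_ID): unique although HOST_ID is only locally unique
HostKey = Tuple[str, int]


@dataclass
class SessionTable:
    """Hypothetical server-side state, for illustration only."""
    sessions: Dict[FlowKey, HostKey] = field(default_factory=dict)
    blocked: Set[HostKey] = field(default_factory=set)

    def on_session_open(self, flow: FlowKey, host_id: int) -> HostKey:
        # The HOST_ID is seen once (e.g., during the TCP handshake);
        # bind it to the flow so later packets need not carry it.
        host = (flow[0], host_id)
        self.sessions[flow] = host
        return host

    def attribute(self, flow: FlowKey) -> Optional[HostKey]:
        # Later packets of the session are attributed by flow lookup.
        return self.sessions.get(flow)

    def is_blocked(self, flow: FlowKey) -> bool:
        # Policy applies to one host, not to the whole shared address.
        host = self.attribute(flow)
        return host is not None and host in self.blocked


# Two hosts behind the same shared address (documentation prefixes).
table = SessionTable()
flow_a = ("198.51.100.7", 40001, "203.0.113.10", 443)
flow_b = ("198.51.100.7", 40002, "203.0.113.10", 443)
table.on_session_open(flow_a, host_id=1)
table.on_session_open(flow_b, host_id=2)
table.blocked.add(("198.51.100.7", 1))
```

   In this sketch, blocking the host with HOST_ID 1 leaves the second
   host behind the same address unaffected, which is the differentiation
   that a policy based on the source IP address alone cannot provide.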

   Within this document, we assume the operator-side address-sharing
   function injects the HOST_ID.  Another deployment option to avoid
   potential performance degradation is to let the host or Home Gateway
   inject its HOST_ID, but the address-sharing function will check its
   content (just like an IP anti-spoofing function).  For some
   proposals, the HOST_ID is retrieved using an out-of-band mechanism or
   signaled in a dedicated notification channel.

   For A+P [RFC6346] and its variants, port set announcements may be
   needed as discussed in Section 4.6.

   Security considerations are common to all analyzed solutions (see
   Section 7).  Privacy-related aspects are discussed in Section 3.

   HOST_ID can be ambiguous for hosts with multiple interfaces, or
   multiple addresses assigned to a single interface.  HOST_IDs that are
   the same may be used to imply or infer the same end system, but
   HOST_IDs that are different should not be used to imply or infer
   whether the end systems are the same or different.

3.  HOST_ID and Privacy

   IP address sharing is motivated by a number of different factors.
   For years, many network operators have conserved the use of public
   IPv4 addresses by making use of Customer Premises Equipment (CPE)
   that assigns a single public IPv4 address to all hosts within the
   customer's local area network and uses NAT [RFC3022] to translate
   between locally unique private IPv4 addresses and the CPE's public
   address.  With the exhaustion of IPv4 address space, address sharing
   between customers on a much larger scale is likely to become much
   more prevalent.
   While many individual users are unaware of and
   uninvolved in decisions about whether their unique IPv4 addresses get
   revealed when they send data via IP, some users realize privacy
   benefits associated with IP address sharing, and some may even take
   steps to ensure that NAT functionality sits between them and the
   public Internet.  IP address sharing makes the actions of all users
   behind the NAT function unattributable to any single host, creating
   room for abuse but also providing some identity protection for non-
   abusive users who wish to transmit data with reduced risk of being
   uniquely identified.

   The proposals considered in this document add a measure of
   identifiability back to hosts that share a public IP address.  The
   extent of that identifiability depends on what information is
   included in the HOST_ID.

   The volatility of the HOST_ID information is similar to that of the
   internal IP address: a distinct HOST_ID may be used by the address-
   sharing function when the host reboots or gets a new internal IP
   address.  As with persistent IP addresses, persistent HOST_IDs
   facilitate user tracking over time.

   As a general matter, the HOST_ID proposals do not seek to make hosts
   any more identifiable than they would be if they were using a public,
   non-shared IP address.  However, depending on the solution proposal,
   the addition of HOST_ID information may allow a device to be
   fingerprinted more easily than it otherwise would be.  To prevent
   this, the following design considerations are to be taken into
   account:

   o  It is recommended that HOST_IDs be limited to providing local
      uniqueness rather than global uniqueness.

   o  Address-sharing functions should not use permanent HOST_ID values.

   Should multiple solutions be combined (e.g., TCP Option and Forwarded
   header) that include different pieces of information in the HOST_ID,
   fingerprinting may become even easier.
   To prevent this, an address-
   sharing function, able to inject HOST_IDs in several layers, should
   reveal the same subsets of information at each layer.  For example,
   if one references the lower 16 bits of an IPv4 address, the other
   should reference these 16 bits too.

   A HOST_ID can be spoofed, just as an IP address can.  Furthermore,
   users of network-based anonymity services (like Tor) may be capable
   of stripping HOST_ID information before it reaches its destination.

   In order to control the information revealed to external parties, an
   address-sharing function should be able to strip, rewrite and add
   HOST_ID fields.

   An address-sharing function may be configured to enforce different
   end-user preferences with regard to HOST_ID injection.  For example,
   HOST_ID injection can be disabled for some users.  This feature is
   policy-based and deployment-specific.

   HOST_ID specification document(s) should explain the privacy impact
   of the solutions they specify, including the extent of HOST_ID
   uniqueness and persistence, assumptions made about the lifetime of
   the HOST_ID, whether and how the HOST_ID can be obfuscated or
   recycled, whether location information can be exposed, and the impact
   of the use of the HOST_ID on device or implementation fingerprinting.
   [I-D.iab-privacy-considerations] provides further guidance.

   For more discussion about privacy, refer to [RFC6462].

4.  Detailed Solutions Analysis

4.1.  Use the Identification Field of IPv4 Header (IP-ID)

4.1.1.  Description

   The IPv4 ID (Identification field of IP header, i.e., IP-ID) can be
   used to insert information which uniquely distinguishes a host among
   those sharing the same IPv4 address.  The use of IP-ID as a channel
   to convey HOST_ID is a theoretical construct (i.e., it is an
   undocumented proposal).

   An address-sharing function can re-write the IP-ID field to insert a
   value unique to the host (16 bits are sufficient to uniquely
   disambiguate hosts sharing the same IP address).  The address-sharing
   function injecting the HOST_ID must follow the rules defined in
   [RFC6864]; in particular the same HOST_ID is not re-assigned to
   another host sharing the same IP address during a given time
   interval.

   A variant of this approach relies upon the format of certain packets,
   such as TCP SYN, where the IP-ID can be modified to contain a 16-bit
   HOST_ID.

   Address-sharing devices using this solution would be required to
   indicate that they do so, possibly using a special DNS record.

4.1.2.  Analysis

   This usage is not consistent with the fragment reassembly use of the
   Identification field [RFC0791] or the updated handling rules for the
   Identification field [RFC6864].

   Complications may arise if the packet is fragmented before reaching
   the device injecting the HOST_ID.  To appropriately handle those
   packet fragments, the address-sharing function will need to maintain
   a lot of state.

   Another complication to be encountered is where translation is
   balanced among several NATs; setting the appropriate HOST_ID by a
   given NAT would alter the coordination between those NATs.  Of
   course, one can argue this coordinated NAT scenario is not a typical
   deployment scenario; regardless, using IP-ID as a channel to convey a
   HOST_ID is ill-advised.

4.2.  Define an IP Option

4.2.1.  Description

   A solution alternative to convey the HOST_ID is to define an IP
   option [RFC0791].  A HOST_ID IP option can be inserted by the
   address-sharing function to uniquely distinguish a host among those
   sharing the same IP address.  An example of such an option is
   documented in [I-D.chen-intarea-v4-uid-header-option].
   This IP option allows
   the conveyance of an IPv4 address, an IPv6 prefix, a GRE (Generic
   Routing Encapsulation) key, an IPv6 Flow Label, etc.

   Another way of using an IP option has been described in Section 4.6
   of [RFC3022].

4.2.2.  Analysis

   This proposal can apply to any transport protocol.  Nevertheless, it
   is widely known that routers and other middleboxes filter IP options
   (e.g., drop IP packets with unknown IP options, strip unknown IP
   options, etc.).

   Injecting the HOST_ID IP Option introduces some implementation
   complexity in the following cases:

   o  If the packet is at or close to the MTU size.

   o  The options space is exhausted.

   Previous studies demonstrated that "IP Options are not an option"
   (refer to [Not_An_Option], [Options]).

   In conclusion, using an IP option to convey a HOST_ID is not viable.

4.3.  Define a TCP Option

4.3.1.  Description

   HOST_ID may be conveyed in a dedicated TCP Option.  An example is
   specified in [I-D.wing-nat-reveal-option].  This option encloses the
   TCP client's identifier (e.g., the lower 16 bits of its IPv4 address,
   its VLAN ID, VRF ID, or subscriber ID).  The address-sharing device
   inserts this TCP Option into the TCP SYN packet.

4.3.2.  Analysis

   Using a new TCP Option to convey the HOST_ID does not require any
   modification to the applications but it is applicable only for TCP-
   based applications.  Applications relying on other transport
   protocols are therefore not covered.

   [I-D.wing-nat-reveal-option] discusses the interference with other
   TCP Options.

   The risk of experiencing session failures due to handling a new TCP
   Option is low as measured in [Options].
   [I-D.abdo-hostid-tcpopt-implementation] provides a detailed
   implementation and experimentation report of a HOST_ID TCP Option.
   This document investigated in depth the impact of activating HOST_ID
   on the host, the address-sharing function, and the enforcement of
   policies at the server side.  It also reports a failure ratio of
   0.103% among the top 100,000 websites.

   Some downsides have been raised against defining a TCP Option to
   reveal a host identity:

   o  Conveying an IP address in a TCP Option may be seen as a violation
      of OSI layers but since IP addresses are already used for the
      checksum computation, this is not seen as a blocking point.
      Moreover, the updated version of [I-D.wing-nat-reveal-option] no
      longer allows conveyance of a full IP address as the HOST_ID is
      encoded in 16 bits.

   o  TCP Option space is limited and might be consumed by the TCP
      client.  [I-D.abdo-hostid-tcpopt-implementation] discusses two
      approaches to sending the HOST_ID: sending the HOST_ID in the TCP
      SYN (which consumes more bytes in the TCP header of the TCP SYN)
      and sending the HOST_ID in a TCP ACK (which consumes only two
      bytes in the TCP SYN).

   o  Content providers may find it more desirable to receive the
      HOST_ID in the TCP SYN, as that more closely preserves the HOST_ID
      received in the source IP address as per current practices.
      Moreover, sending the HOST_ID in the TCP SYN does not interfere
      with [I-D.ietf-tcpm-fastopen].  In the ACK mode, if the server is
      configured to deliver different data based on HOST_ID, then it
      would have to wait for the ACK before transmitting data.

   o  HOST_ID mechanisms need to be aware of E2E (End-to-End) issues and
      avoid interfering with them.  One example of such interference
      would be injecting or removing TCP options of transited packets;
      another such interference involves terminating and re-originating
      TCP connections not belonging to the transit device.  A HOST_ID
      TCP option handled by the source node avoids this issue.

   o  Injecting the HOST_ID TCP Option introduces some implementation
      complexity if the options space is exhausted.  Specification
      document(s) should specify in detail the behavior of the address-
      sharing function in such case.

   o  It is more complicated to implement sending the HOST_ID in a TCP
      ACK as it can introduce MTU issues if the ACK packet also contains
      TCP data, or a TCP segment is lost.  Note, MTU complications can
      be experienced also if user data is included in a SYN packet
      (e.g., [I-D.ietf-tcpm-fastopen]).

   o  When there are several NATs in the path, the original HOST_ID may
      be lost.  The loss of the original HOST_ID may not be a problem as
      the target usage is between proxies or a CGN and server.  Only the
      information leaked in the last communication leg (i.e., between
      the last address-sharing function and the server) is likely to be
      useful.

   o  Interference with usages such as Forwarded HTTP header (see
      Section 4.4) should be elaborated to specify the behavior of
      servers when both options are used; in particular, specify which
      information to use: the content of the TCP Option or what is
      conveyed in the application headers.

   o  When load-balancers or proxies are in the path, this option does
      not allow the preservation of the original source IP address and
      source port.  Preserving such information is required for logging
      purposes for instance (e.g., [RFC6302]).
      [I-D.abdo-hostid-tcpopt-implementation] defines a TCP Option which
      allows revealing various combinations of source information (e.g.,
      source port, source port and source IP address, source IPv6
      prefix, etc.).

   More discussion about issues raised when extending TCP can be found
   at [ExtendTCP].

4.4.  Inject Application Protocol Message Headers

4.4.1.  Description

   Another option is to require no change at the transport or IP level,
   and instead to convey in the application payload the required
   information that will be used to disambiguate hosts.  The format of
   the conveyed information and the related semantics depend on its
   application (e.g., HTTP, SIP, SMTP, etc.).

   Related mechanisms could be developed for other application-layer
   protocols, but the discussion in this document is limited to HTTP and
   similar protocols.

   For HTTP, the Forwarded header ([I-D.ietf-appsawg-http-forwarded])
   can be used to convey the original IP address when an address-sharing
   device is involved.  Service Providers operating address-sharing
   devices can enable the feature of injecting the Forwarded header
   which will enclose the original IPv4 address or the IPv6 prefix part
   (see the example shown in Figure 1).  The address-sharing device has
   to strip all included Forwarded headers before injecting its own.
   Servers may rely on the contents of this field to enforce some
   policies such as blacklisting misbehaving users.

   Note that the X-Forwarded-For (XFF) header is obsoleted by
   [I-D.ietf-appsawg-http-forwarded].

           Forwarded: for=192.0.2.1,for=[2001:db8::1]
           Forwarded: proto=https;by=192.0.2.15

                 Figure 1: Example of Forwarded Header

4.4.2.  Analysis

   Not all applications impacted by address sharing can support the
   ability to disclose the original IP address.  Only a subset of
   protocols (e.g., HTTP) can rely on this solution.

   For the HTTP case, to prevent users injecting invalid HOST_IDs, an
   initiative has been launched by Wikipedia to maintain a list of
   trusted ISPs (Internet Service Providers) using XFF (see the list
   available at [Trusted_ISPs]).
   If an address-sharing device is on the
   trusted XFF ISPs list, users editing Wikipedia located behind the
   address-sharing device will appear to be editing from their
   "original" IP address and not from the NATed IP address.  If an
   offending activity is detected, individual hosts can be blacklisted
   instead of all hosts sharing the same IP address.

   XFF header injection is a common practice of load balancers.  When a
   load balancer is in the path, the original content of any included
   XFF header should not be stripped.  Otherwise the information about
   the "origin" IP address will be lost.

   When several address-sharing devices are crossed, the Forwarded
   header can convey the list of IP addresses (e.g., Figure 1).  The
   origin HOST_ID can be exposed to the target server.

   Injecting the Forwarded header also introduces some implementation
   complexity if the HTTP message is at or close to the MTU size.

   It has been reported that "poor" HTTP proxy implementations may
   encounter parsing issues when injecting an XFF header.

   Injecting the Forwarded header for all HTTPS traffic is infeasible.
   This may be problematic given the current HTTPS usage trends.

4.5.  PROXY Protocol

4.5.1.  Description

   The solution, referred to as Proxy Protocol [Proxy], does not require
   any application-specific knowledge.  The rationale behind this
   solution (Proxy Protocol Version 1) is to insert identification data
   directly into the application data stream prior to the actual
   protocol data being sent, regardless of the protocol.  Every
   application protocol would begin with a textual string of "PROXY",
   followed by some textual identification data, ending with a CRLF, and
   only then the application data would be inserted.  Figure 2 shows an
   example of a line of data used for this, in this case for a TCP over
   IPv4 connection received from 192.0.2.1:56324 and destined to
   192.0.2.15:443.

           PROXY TCP4 192.0.2.1 192.0.2.15 56324 443\r\n

              Figure 2: Example of PROXY connection report

   Upon receipt of a message conveying this line, the server removes the
   line.  The line is parsed to retrieve the transported protocol.  The
   content of this line is recorded in logs and used to enforce
   policies.

   Proxy Protocol Version 2 is designed to accommodate IPv4/IPv6 and
   also non-TCP protocols (see [Proxy] for more details).

4.5.2.  Analysis

   This solution can be deployed in a controlled environment but it
   cannot be deployed to all access services available in the Internet.
   If the remote server does not support the Proxy Protocol, the session
   will fail.  Other complications will arise due to the presence of
   firewalls, for instance.

   As a consequence, this solution is infeasible and cannot be
   recommended.

4.6.  Assign Port Sets

4.6.1.  Description

   This solution does not require any action from the address-sharing
   function to disclose a host identifier.  Instead of assuming all
   transport ports are associated with one single host, each host under
   the same external IP address is assigned a restricted port set.
   These port sets are then advertised to remote servers using off-line
   means.  This announcement is not required for the delivery of
   internal services (i.e., offered by the service provider deploying
   the address-sharing function) relying on implicit identification.

   Port sets assigned to hosts may be static or dynamic.

   Port set announcements to remote servers are not required to reveal
   the identity of individual hosts but only to advertise the enforced
   policy to generate non-overlapping port sets (e.g., the transport
   space associated with an IP address is fragmented to contiguous
   blocks of 2048 port numbers).

   Examples of such an option are documented in [RFC6346] and
   [I-D.donley-behave-deterministic-cgn].

4.6.2.  Analysis

   The solution does not require defining new fields nor options; it is
   policy-based.

   The solution may conflict with port randomization ([RFC6056]) as
   identified in [RFC6269].  A mitigation would be to avoid assigning
   static port sets to individual hosts.

   The method is convenient for the delivery of services offered by the
   service provider also offering the Internet access service.

4.7.  Host Identity Protocol (HIP)

4.7.1.  Description

   [RFC5201] specifies an architecture which introduces a new namespace
   to convey identity information.

4.7.2.  Analysis

   This solution requires both the client and the server to support HIP
   [RFC5201].  Additional architectural considerations are to be taken
   into account such as the key exchanges, etc.

   An alternative deployment model, which does not require the client to
   be HIP-enabled, is having the address-sharing function behave as a
   UDP/TCP-HIP relay.  This model is also not viable as it assumes all
   servers are HIP-enabled.

   This solution is a theoretical construct (i.e., the proposal is not
   documented).

4.8.  Use of a Notification Channel (e.g., ICMP)

4.8.1.  Description

   Another alternative is to convey the HOST_ID using a notification
   channel separate from the packets issued to invoke the service.

   An implementation example is defined in
   [I-D.yourtchenko-nat-reveal-ping].  This solution relies on a
   mechanism where the address-sharing function encapsulates the
   necessary host-identifying information into an ICMP Echo Request
   packet that it sends in parallel with the initial session creation
   (e.g., SYN).  The information included in the ICMP Request Data
   portion describes the five-tuples as seen on both sides of the
   address-sharing function.

4.8.2.  Analysis

   o  This ICMP proposal is valid for any transport protocol that uses a
      port number.
      The address-sharing function may be configured with
      the transport protocols which will trigger issuing those ICMP
      messages.
   o  A hint should be provided to the ultimate server (or intermediate
      nodes) that the ICMP Echo Request conveys a HOST_ID.  This may be
      implemented using magic numbers.
   o  Even if ICMP packets are blocked in the communication path, the
      user connection does not have to be impacted.
   o  Implementations requiring delay of the establishment of a session
      until receipt of the companion ICMP Echo Request may lead to some
      user experience degradation.
   o  Because of the presence of load-balancers in the path, the
      ultimate server receiving the SYN packet may not be the one which
      receives the ICMP message conveying the HOST_ID.
   o  Because of the presence of load-balancers in the path, the port
      number assigned by the address-sharing function may be lost.
      Therefore, the mapping information conveyed in the ICMP message
      may not be sufficient to associate a SYN packet with a received
      ICMP message.
   o  The proposal is not compatible with the presence of cascaded NAT.
      The main reason is that each NAT in the path will generate an ICMP
      message to reveal the internal host identifier.  Because these
      messages will be translated by the downstream address-sharing
      devices, the remote server will receive multiple ICMP messages and
      will need to decide which host identifier to use.
   o  The ICMP proposal will add traffic overhead for both the server
      and the address-sharing device.
   o  The ICMP proposal is similar to other mechanisms (e.g., Syslog
      [I-D.ietf-behave-syslog-nat-logging], IPFIX
      [I-D.ietf-behave-ipfix-nat-logging]) for reporting dynamic
      mappings to a mediation platform (mainly for legal traceability
      purposes).
      Performance degradation is likely to be experienced by
      address-sharing functions because ICMP messages are sent for each
      newly instantiated mapping (and even if the mapping already
      exists).
   o  In some scenarios (e.g., Section 3 of
      [I-D.boucadair-pcp-nat-reveal]), HOST_ID should be interpreted by
      intermediate devices which embed Policy Enforcement Points (PEP,
      [RFC2753]) responsible for granting access to some services.
      These PEPs need to inspect all received packets in order to find
      the companion (traffic) messages to be correlated with ICMP
      messages conveying HOST_IDs.  This adds complexity to these
      intermediate devices.

4.9.  Use Out-of-Band Mechanisms (e.g., IDENT)

4.9.1.  Description

   Another alternative is to retrieve the HOST_ID using a dedicated
   query channel.

   An implementation example may rely on the Identification Protocol
   (IDENT, [RFC1413]).  This solution assumes the address-sharing
   function implements the server part of IDENT, while remote servers
   implement the client part of the protocol.  IDENT needs to be updated
   (see [IDENT_NAT]) to be able to return a host identifier instead of
   the user-id as defined in [RFC1413].  The IDENT response syntax uses
   the same USERID field described in [RFC1413] but rather than
   returning a username, a host identifier (e.g., a 16-bit value) is
   returned [IDENT_NAT].  For any new incoming connection, the server
   contacts the IDENT server to retrieve the associated identifier.
   During that phase, the connection may be delayed.

4.9.2.  Analysis

   o  IDENT is specific to TCP.  Alternative out-of-band mechanisms may
      be designed to cover other transport protocols such as UDP.
   o  This solution requires the address-sharing function to embed an
      IDENT server.
   o  A hint should be provided to the ultimate server (or intermediate
      nodes) that the address-sharing function implements the IDENT
      protocol.
      One example solution is to publish this capability using
      DNS; other solutions can be envisaged.
   o  An out-of-band mechanism may require some administrative setup
      (e.g., contract agreement) between the entity managing the
      address-sharing function and the entity managing the remote
      server.  Such a deployment is not feasible in the Internet at
      large because establishing and maintaining agreements between ISPs
      and all service actors is burdensome and not scalable.
   o  Implementations requiring delay of the establishment of a session
      until receipt of the companion IDENT response may lead to some
      user experience degradation.
   o  The IDENT proposal will add traffic overhead for both the server
      and the address-sharing device.
   o  Performance degradation is likely to be experienced by
      address-sharing functions embedding the IDENT server.  This is
      further exacerbated if the address-sharing function has to handle
      an IDENT query for each newly instantiated mapping (and even if
      the mapping already exists).
   o  In some scenarios (e.g., Section 3 of
      [I-D.boucadair-pcp-nat-reveal]), HOST_ID should be interpreted by
      intermediate devices which embed Policy Enforcement Points (PEP,
      [RFC2753]) responsible for granting access to some services.
      These PEPs need to inspect all received packets in order to
      generate the companion IDENT queries.  This may add complexity to
      these intermediate devices.
   o  IDENT queries may be generated by illegitimate TCP servers.  This
      would require the address-sharing function to enforce some
      policies (e.g., rate limit queries, filter based on the source IP
      address, etc.).

5.  Solutions Analysis: Synthesis

   Table 1 summarizes the approaches analyzed in this document.

   o  "Encrypted Traffic" refers to TLS.  The use of IPsec and its
      complications to traverse NATs are discussed in Section 2.2 of
      [I-D.ietf-behave-64-analysis].
      Similar to what is suggested in
      Section 13.5 of [RFC6269], HOST_ID specification document(s)
      should analyze in detail the compatibility of each IPsec mode.
   o  "Success ratio" indicates the ratio of successful communications
      with remote servers when the HOST_ID is injected using a candidate
      solution.  More details are provided below to explain how the
      success ratio is computed for each candidate solution.
   o  "Possible Perf Impact" indicates the level of expected performance
      degradation.  The rationale behind the indicated potential
      performance degradation is whether the injection requires some
      treatment at the IP level or not.
   o  "OS TCP/IP Modif" indicates whether a modification of the OS
      TCP/IP stack is required at the server side.
   o  "Deployable today" indicates if the solution can be generalized
      without any constraint on current architectures and practices.

             +-----+------+------+------+-----+-----+-----+-----+-----+
             |IP-ID| IP   | TCP  |HTTP  |PROXY|Port | HIP |ICMP |IDENT|
             |     |Option|Option|Header|     | Set |     |     |     |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+
   UDP       | Yes | Yes  | No   | No   | No  | Yes |     | Yes | No  |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+
   TCP       | Yes | Yes  | Yes  | No   | Yes | Yes |     | Yes | Yes |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+
   HTTP      | Yes | Yes  | Yes  | Yes  | Yes | Yes |     | Yes | Yes |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+
   Encrypted | Yes | Yes  | Yes  | No   | Yes | Yes |     | Yes | Yes |
   Traffic   |     |      |      |      |     |     |     |     |     |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+
   Success   | High| Low  | High | High | Low | 100%|Low  |High |High |
   Ratio     |     |      |      |      |     |     |     |     |     |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+
   Possible  | Low | High | Low  | Med  | High| No  | N/A | High|High |
   Perf      | to  |      | to   | to   |     |     |     |     |     |
   Impact    | Med |      | Med  | High |     |     |     |     |     |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+
   OS TCP/IP | Yes | Yes  | Yes  | No   | No  | No  |     | Yes | Yes |
   Modif     |     |      |      |      |     |     |     |     |     |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+
   Deployable| Yes | Yes  | Yes  | Yes  | No  | Yes | No  | Yes | Yes |
   Today     |     |      |      |      |     |     |     |     |     |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+
   Notes     | (1) | (8)  | (8)  | (2)  | (8) | (1) | (4) | (6) | (1) |
             | (7) |      |      |      |     | (3) | (7) | (8) | (6) |
             |     |      |      |      |     |     |     |     | (8) |
   ----------+-----+------+------+------+-----+-----+-----+-----+-----+

   Notes:
   (1)  Requires mechanism to advertise NAT is participating in this
        scheme (e.g., DNS PTR record).
   (2)  This solution is widely deployed (e.g., HTTP Servers,
        Load-Balancers, etc.).
   (3)  When the port set is not advertised, the solution is less
        efficient for third-party services.
   (4)  Requires the client and the server to be HIP-compliant and HIP
        infrastructure to be deployed.  If the client and the server are
        HIP-enabled, the address-sharing function does not need to
        insert an identifier.  If the client is not HIP-enabled,
        designing the device that performs address sharing to act
        as a UDP/TCP-HIP relay is not viable.
   (6)  The solution is inefficient in some scenarios (see Section 5).
   (7)  The solution is a theoretical construct (i.e., the solution
        is not documented).
   (8)  The solution is a documented proposal.

                  Table 1: Summary of analyzed solutions.

   The success ratio figures provided for TCP and IP options are based
   on the results documented in [Options] and
   [I-D.abdo-hostid-tcpopt-implementation].

   The provided success ratio for IP-ID is theoretical; it assumes the
   address-sharing function follows the rules in [RFC6864] to re-write
   the IP Identification field.
   Since PROXY and HIP are not widely deployed, the success ratio for
   establishing a communication with remote servers using these
   protocols is low.

   The success ratio for the ICMP-based solution is implementation-
   specific but it is likely to be close to 100%.  The success ratio
   depends on how efficiently the solution is implemented on the server
   side.  A remote server which does not support the ICMP-based solution
   will ignore received companion ICMP messages.  An upgraded server
   will need to delay accepting a session until receiving the companion
   ICMP message.

   The success ratio for the IDENT solution is implementation-specific
   but it is likely to be close to 100%.  The success ratio depends on
   how efficiently the solution is implemented on the server side.  A
   remote server which does not support IDENT will accept a session
   establishment request following its normal operation.  An upgraded
   server will need to delay accepting a session until receipt of the
   response to the IDENT request it will send to the host.

6.  IANA Considerations

   This document does not require any action from IANA.

7.  Security Considerations

   The same security concerns apply to the injection of an IP option, a
   TCP option, and application-related content (e.g., Forwarded HTTP
   header) by the address-sharing device.  If the server trusts the
   content of the HOST_ID field, a third-party user can be impacted by a
   misbehaving user who reveals a "faked" HOST_ID (e.g., original IP
   address).

   HOST_ID may be used to leak information about the internal structure
   of a network behind an address-sharing function.  If this behavior is
   undesired for the network administrator, the address-sharing function
   can be configured to strip any existing HOST_ID in received packets
   from internal hosts.
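   As a hedged illustration of the stripping behavior described above,
   the following minimal Python sketch handles the HTTP case only.  It
   removes any HOST_ID-bearing header supplied by an internal host and
   then adds the identifier observed by the address-sharing function
   itself.  The names HOST_ID_HEADERS, strip_host_id, and
   inject_host_id are hypothetical, as is the choice to treat
   X-Forwarded-For as a HOST_ID carrier alongside Forwarded.  The
   sketch assumes an IPv4 literal and the "for=" parameter of the
   Forwarded header [I-D.ietf-appsawg-http-forwarded]; it is not taken
   from any of the analyzed proposals.

```python
from typing import List, Tuple

Header = Tuple[str, str]

# Header names treated as HOST_ID carriers in this sketch. This set is an
# assumption; a real deployment would make it configurable.
HOST_ID_HEADERS = {"forwarded", "x-forwarded-for"}


def strip_host_id(headers: List[Header]) -> List[Header]:
    """Drop every HOST_ID-bearing header received from an internal host."""
    return [(name, value) for name, value in headers
            if name.lower() not in HOST_ID_HEADERS]


def inject_host_id(headers: List[Header], internal_ip: str) -> List[Header]:
    """Strip untrusted HOST_ID headers, then add the locally observed one.

    internal_ip is assumed to be an IPv4 literal; IPv6 literals would need
    the quoting rules of the Forwarded header, which this sketch omits.
    """
    cleaned = strip_host_id(headers)
    cleaned.append(("Forwarded", f"for={internal_ip}"))
    return cleaned
```

   Calling strip_host_id alone corresponds to the case where the
   administrator does not want any HOST_ID to leave the network, while
   inject_host_id corresponds to the case where a client-supplied
   ("faked") value is replaced with the one seen by the address-sharing
   function.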
   HOST_ID specification documents should elaborate further on threats
   inherent to each individual solution used to convey the HOST_ID
   (e.g., use of the IP-ID field to count hosts behind a NAT [Count]).

   For more discussion of privacy issues related to HOST_ID, see
   Section 3.

8.  Acknowledgments

   Many thanks to D. Wing, C. Jacquenet, J. Halpern, B. Haberman,
   and P. Yee for their review, comments and inputs.

   Thanks also to P. McCann, T. Tsou, Z. Dong, B. Briscoe, T.
   Taylor, M. Blanchet, D. Wing, and A. Yourtchenko for the
   discussions in Prague.

   Some of the issues related to defining a new TCP Option have been
   raised by L. Eggert.

   The privacy text was provided by A. Cooper.

9.  References

9.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791, September
              1981.

   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
              Address Translator (Traditional NAT)", RFC 3022, January
              2001.

   [RFC6056]  Larsen, M. and F. Gont, "Recommendations for Transport-
              Protocol Port Randomization", BCP 156, RFC 6056, January
              2011.

9.2.  Informative References

   [Count]    "A technique for counting NATted hosts".

   [ExtendTCP]
              Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A.,
              Handley, M., and H. Tokuda, "Is it still possible to
              extend TCP?", November 2011.

   [I-D.abdo-hostid-tcpopt-implementation]
              Abdo, E., Boucadair, M., and J. Queiroz, "HOST_ID TCP
              Options: Implementation & Preliminary Test Results",
              draft-abdo-hostid-tcpopt-implementation-03 (work in
              progress), July 2012.

   [I-D.boucadair-pcp-nat-reveal]
              Boucadair, M., Reddy, T., Patil, P., and D. Wing, "Using
              PCP to Reveal a Host behind NAT", draft-boucadair-pcp-nat-
              reveal-00 (work in progress), November 2012.

   [I-D.chen-intarea-v4-uid-header-option]
              Wu, Y., Ji, H., Chen, Q., and T.
              ZOU), "IPv4 Header Option
              For User Identification In CGN Scenario", draft-chen-
              intarea-v4-uid-header-option-00 (work in progress), March
              2011.

   [I-D.donley-behave-deterministic-cgn]
              Donley, C., Grundemann, C., Sarawat, V., Sundaresan, K.,
              and O. Vautrin, "Deterministic Address Mapping to Reduce
              Logging in Carrier Grade NAT Deployments", draft-donley-
              behave-deterministic-cgn-05 (work in progress), January
              2013.

   [I-D.iab-privacy-considerations]
              Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R. Smith, "Privacy
              Considerations for Internet Protocols", draft-iab-privacy-
              considerations-03 (work in progress), July 2012.

   [I-D.ietf-appsawg-http-forwarded]
              Petersson, A. and M. Nilsson, "Forwarded HTTP Extension",
              draft-ietf-appsawg-http-forwarded-10 (work in progress),
              October 2012.

   [I-D.ietf-behave-64-analysis]
              Penno, R., Saxena, T., Boucadair, M., and S. Sivakumar,
              "Analysis of Stateful 64 Translation", draft-ietf-
              behave-64-analysis-07 (work in progress), March 2012.

   [I-D.ietf-behave-ipfix-nat-logging]
              Sivakumar, S. and R. Penno, "IPFIX Information Elements
              for logging NAT Events", draft-ietf-behave-ipfix-nat-
              logging-00 (work in progress), March 2013.

   [I-D.ietf-behave-syslog-nat-logging]
              Chen, Z., Zhou, C., Tsou, T., and T. Taylor, "Syslog
              Format for NAT Logging", draft-ietf-behave-syslog-nat-
              logging-00 (work in progress), February 2013.

   [I-D.ietf-tcpm-fastopen]
              Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", draft-ietf-tcpm-fastopen-03 (work in
              progress), February 2013.

   [I-D.wing-nat-reveal-option]
              Yourtchenko, A. and D. Wing, "Revealing hosts sharing an
              IP address using TCP option", draft-wing-nat-reveal-
              option-03 (work in progress), December 2011.
   [I-D.yourtchenko-nat-reveal-ping]
              Yourtchenko, A., "Revealing hosts sharing an IP address
              using ICMP Echo Request", draft-yourtchenko-nat-reveal-
              ping-00 (work in progress), March 2012.

   [IDENT_NAT]
              Wing, D., "Using the Identification Protocol with an
              Address Sharing Device", August 2012.

   [Not_An_Option]
              Fonseca, R., Porter, G., Katz, R., Shenker, S., and I.
              Stoica, "IP options are not an option", 2005.

   [Options]  Medina, A., Allman, M., and S. Floyd, "Measuring
              Interactions Between Transport Protocols and Middleboxes",
              2005.

   [Proxy]    Tarreau, W., "The PROXY protocol", November 2010.

   [RFC1413]  St. Johns, M.C., "Identification Protocol", RFC 1413,
              February 1993.

   [RFC2753]  Yavatkar, R., Pendarakis, D., and R. Guerin, "A Framework
              for Policy-based Admission Control", RFC 2753, January
              2000.

   [RFC5201]  Moskowitz, R., Nikander, P., Jokela, P., and T. Henderson,
              "Host Identity Protocol", RFC 5201, April 2008.

   [RFC6146]  Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful
              NAT64: Network Address and Protocol Translation from IPv6
              Clients to IPv4 Servers", RFC 6146, April 2011.

   [RFC6269]  Ford, M., Boucadair, M., Durand, A., Levis, P., and P.
              Roberts, "Issues with IP Address Sharing", RFC 6269, June
              2011.

   [RFC6302]  Durand, A., Gashinsky, I., Lee, D., and S. Sheppard,
              "Logging Recommendations for Internet-Facing Servers", BCP
              162, RFC 6302, June 2011.

   [RFC6346]  Bush, R., "The Address plus Port (A+P) Approach to the
              IPv4 Address Shortage", RFC 6346, August 2011.

   [RFC6462]  Cooper, A., "Report from the Internet Privacy Workshop",
              RFC 6462, January 2012.

   [RFC6864]  Touch, J., "Updated Specification of the IPv4 ID Field",
              RFC 6864, February 2013.

   [Trusted_ISPs]
              "Trusted XFF list".
Authors' Addresses

   Mohamed Boucadair
   France Telecom
   Rennes  35000
   France

   Email: mohamed.boucadair@orange.com

   Joe Touch
   USC/ISI

   Email: touch@isi.edu

   Pierre Levis
   France Telecom
   Caen  14000
   France

   Email: pierre.levis@orange.com

   Reinaldo Penno
   Cisco
   USA

   Email: repenno@cisco.com