Internet Engineering Task Force                              D. O'Reilly
Internet-Draft                                            April 11, 2018
Intended status: Informational
Expires: October 13, 2018

   Approaches to Address the Availability of Information in Criminal
  Investigations Involving Large-Scale IP Address Sharing Technologies
                      draft-daveor-cgn-logging-04

Abstract

   The use of large-scale IP address sharing technologies (commonly
   known as "Carrier-Grade NAT" and "A+P") presents a challenge for law
   enforcement agencies due to the fact that incoming source port
   information is not routinely logged by Internet-facing servers. The
   absence of this information means that it is becoming increasingly
   difficult for law enforcement agencies to identify suspects in
   criminal activity online. This document considers the reasons why
   source port information is not routinely logged by Internet-facing
   servers and makes recommendations to help improve the situation. A
   deployment maturity model has been developed and a study of the
   support for logging incoming source port information in common server
   software is also presented.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 13, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Scope
   3.  Centralised Connection Logging
   4.  Challenges to Capturing Source Port
     4.1.  Lack of Awareness
     4.2.  Lack of Support for Logging Source Port
     4.3.  Additional Storage Requirements
     4.4.  Default Log Formats
     4.5.  Breaking Existing Tooling
     4.6.  Accuracy of Recorded Time
     4.7.  Translation of Source Port by Endpoint Infrastructure
   5.  Comparison Model
   6.  Support for Logging Source Port
   7.  Recommendations
     7.1.  Raise Awareness of the Importance of Logging Source Port
     7.2.  Increase Support for Logging Source Port
     7.3.  Update Default Log Formats
     7.4.  Adequate Timestamp Accuracy in Logs
     7.5.  Source Port Translation in Endpoint Infrastructure
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Informative References
     11.2.  Normative References
   Appendix A.  Support for Source Port Logging in Various Server
                Software
   Author's Address

1. Introduction

   Large-scale IP address sharing technologies (such as "Carrier-Grade
   NAT" [RFC6888]) are a helpful tool for extending the life of IPv4
   addresses by allowing multiple endpoints to share a small number of
   IPv4 addresses. A related category of technologies, known as
   "Address plus Port", or "A+P" [RFC6346], is also used for
   large-scale IP address sharing, achieved in these cases by using
   some of the port number bits for addressing purposes. A number of
   such technologies have been discussed and deployed, such as
   Dual-Stack Lite [RFC6333], NAT64 [RFC6146], NAT444
   [I-D.shirasaki-nat444], Lightweight 4over6 [RFC7596], MAP-E
   [RFC7597] and MAP-T [RFC7599].

   All of these technologies involve extending the space of available
   IPv4 addresses by mapping communication from multiple endpoints to a
   single shared address, or a small number of shared addresses,
   through the use of port numbers. The detail of how this is achieved
   in each technology varies, but the principle remains the same in all
   cases.

   From the perspective of a server on the Internet, endpoint traffic
   that has passed through IP address sharing infrastructure appears to
   be originating from the IP address of the address sharing appliance.
   Common practice at the present time is for servers to log the
   connection time and source IP address of incoming connections.
   However, the IP address of the address sharing appliance is not
   sufficient to identify the true source of the traffic because
   potentially hundreds or thousands of individual endpoints were using
   that IP address at the same time. If the need arises during a
   criminal investigation to identify the source of a specific
   connection, the source port and exact connection time will also be
   required. Without this additional information it is highly unlikely
   that it will be possible for law enforcement authorities to progress
   their investigations.

   Information is required from at least two sources to establish the
   link from the logs of an Internet-facing server to a specific
   subscriber endpoint:

   1.  The administrator of the Internet-facing server must have logged
       enough information to enable the operator of the IP address
       sharing infrastructure to isolate a specific subscriber endpoint.

   2.  The operator of the IP address sharing infrastructure must have
       logged sufficient information (for a sufficient length of time)
       to be able, when provided with adequate data by a law enforcement
       agency, to isolate the relevant subscriber endpoint.

   The operators of large-scale IP address sharing infrastructure,
   typically Internet Service Providers, are usually required by law to
   maintain records of which endpoint was using a particular IP address
   and port at a particular time. The period of time for which these
   records must be retained is defined by national legislation.
   Irrespective of whether (and for how long) these records are
   available, a starting point is needed to indicate to an investigating
   law enforcement agency that a particular endpoint was involved in a
   suspected criminal activity under investigation.
   Without such a starting point, it would be very difficult to
   progress the investigation even as far as engagement with the
   operator of the address sharing infrastructure. The records of
   Internet-facing servers are often a crucial source of this type of
   evidence.

   It has been recognised for some time that IP address sharing presents
   a challenge to the ability to trace network use and abuse [RFC7620].
   Further, it has also been recognised that this challenge is likely to
   become more severe and widespread with the increased use of
   large-scale address sharing [RFC6269]. More recently, Europol has
   highlighted the issue of large-scale IP address sharing as a threat
   to Internet governance [EUROPOL_IOCTA]. It is reported that the
   problem of crime attribution related to the use of carrier-grade NAT
   technologies is regularly encountered by 90% of respondents to a
   survey on the topic.

   Address sharing, including large-scale address sharing, is required
   as long as the use of IPv4 continues. Full deployment of IPv6 has
   the potential to ultimately eliminate the current attribution issues
   arising from the use of large-scale address sharing technologies,
   although presumably new attribution challenges will arise in that
   scenario. Since it is impossible to anticipate if or when full
   migration to IPv6 will take place, it is prudent to consider the
   implications of the transitionary technologies until the need for
   them has been eliminated.

2. Scope

   Previous work has already suggested as best practice the logging by
   Internet-facing servers of source IP address, source port and exact
   connection time [RFC6302]. However, this continues to be
   exceptional, rather than routine, logging practice.
   The purpose of this document is to consider in more detail how it
   might be possible to bring about routine logging by Internet-facing
   servers of the information needed to re-establish the ability to
   trace network abuse for criminal investigative purposes. This
   document specifically does not address or consider the logging
   requirements of operators of large-scale address sharing
   infrastructure. Instead, the focus is on the logging considerations
   of operators of Internet-facing servers. The main contributions of
   this document are:

   1.  To consider the reasons why source port logging is not routinely
       carried out.

   2.  To identify some possible solutions and workarounds for the
       reasons that source port logging is not routinely carried out.

   3.  To examine the feasibility of source port logging from the
       perspective of software support for this feature.

   Clearly no single solution will address the problem of crime
   attribution on the Internet. Load balancers, proxies and other
   network infrastructure may also, intentionally or as a side-effect,
   obfuscate the true source of Internet traffic and these problems will
   continue to exist with or without the presence of large-scale address
   sharing technologies (like Carrier-Grade NAT and A+P). Nevertheless,
   at the time of writing large-scale address sharing technologies
   present a significant challenge to crime attribution, as highlighted
   by Europol [EUROPOL_IOCTA], and this document attempts to consider
   the challenges specifically presented by that category of
   technologies.

   The discussion begins by considering whether centralised connection
   logging is a viable solution to the problem of subscriber
   identification in criminal investigations. This is followed by an
   examination of the reasons why source port logging is not currently
   routinely carried out.
   A model has been developed for the comparison of the maturity of
   various server deployments to log source port and a study of common
   server software has been performed to assess the status of support
   for this functionality. Many, but not all, enterprise server
   solutions that were examined made the logging of source port either
   "Possible" or "Feasible", as defined in the maturity model. Only one
   type of server software examined made the logging of source port
   "Default".

3. Centralised Connection Logging

   When large-scale IP address sharing technologies are used, source IP
   address is no longer a sufficient identifier of an individual
   subscriber. At a minimum, source port and accurate timestamp
   information are also required to distinguish between the potentially
   large number of individual users of a specific IP address at a
   particular time. [RFC6269] points out that there are two solutions
   to the question of how adequate information can be recorded to
   identify the parties to a particular connection. They are:

   1.  Operators of IP address sharing infrastructure log mappings
       between (source IP address, source port) combinations and their
       subscribers. Server operators log the IP address and source port
       of incoming connections. This is referred to as source port
       logging.

   2.  Instead of relying on server operators to log the source port of
       incoming connections, operators of IP address sharing
       infrastructure log all combinations of (external IP address,
       external port, destination IP address) for outgoing connections.
       This is referred to as connection logging. Server operators log
       the IP address and timestamp of incoming connections, which is
       the common current practice.

   Two challenges to the use of connection logging by operators of IP
   address sharing infrastructure are also presented in [RFC6269].
   Briefly:

   o  The volumes of data involved make centralised recording of
      destination IP addresses infeasible.

   o  Many individuals using the same IP address to access a popular
      destination (e.g. a popular website) might mean that it is not
      possible to distinguish between the activity of one subscriber and
      another, even if connection records are kept by the operator of
      the address sharing infrastructure.

   The first issue raised is that the volumes of data involved make
   centralised recording of destination IP addresses infeasible.
   Whether destination IP addresses are recorded or not, the volume of
   logs generated by a large-scale IP address sharing infrastructure
   will be substantial, and some approaches have been proposed to
   address this hurdle and make central connection logging more
   feasible, such as deterministic allocation of ports [RFC6269]
   [RFC7422] or allocation of port ranges [RFC7768] [RFC6346]. While
   arguments of infeasibility are not arguments in principle why such
   logging cannot be done, the volumes of data involved in recording
   every single outgoing connection in a large Internet service
   provider represent legitimate technical, commercial and operational
   arguments for why it cannot work in practice. Some representative
   figures for the scales of data involved can be found in [RFC7422],
   wherein it is estimated that the logging overhead would be of the
   order of 150MB per subscriber, per month. For a service provider
   with one million subscribers, this would produce a volume of logs
   (uncompressed) of the order of 150 terabytes per month. Aside from
   the technical overhead of storing such a volume of data, searching
   and locating relevant records over an extended, legally mandated
   retention period would also present a significant technical
   challenge.
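
   The estimate above can be reproduced with simple arithmetic. The
   following Python sketch is illustrative only. The per-subscriber
   figure and the subscriber count are the ones quoted above from
   [RFC7422]; the function name, the use of decimal units (1 TB =
   1,000,000 MB) and the 12-month retention period are assumptions made
   for this example and do not come from that document.

```python
def log_volume_tb(subscribers, mb_per_subscriber_per_month, months=1):
    """Estimate uncompressed connection-log volume in terabytes.

    Assumes decimal units (1 TB = 1,000,000 MB); this is a choice made
    for the sketch, not something specified in the source figures.
    """
    total_mb = subscribers * mb_per_subscriber_per_month * months
    return total_mb / 1_000_000


# Figures quoted in the text: 150 MB per subscriber per month,
# one million subscribers.
monthly_tb = log_volume_tb(1_000_000, 150)

# Hypothetical 12-month retention period (assumption for illustration).
yearly_tb = log_volume_tb(1_000_000, 150, months=12)

print(f"{monthly_tb:.0f} TB per month")
print(f"{yearly_tb:.0f} TB over a hypothetical 12-month retention period")
```

   Under these assumptions the monthly figure matches the 150 terabytes
   quoted above, and the total grows linearly with the retention
   period.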

   The second point raised in [RFC6269] against connection logging by
   operators of IP address sharing infrastructure suggests that even if
   connection logs store all combinations of (timestamp, source IP,
   source port, destination IP), if this information is queried in the
   absence of source port because source port has not been recorded by
   the destination IP, this would not be sufficient to distinguish the
   activity of one individual from another in cases where the
   destination IP is a popular one. This problem is further exacerbated
   in the case of protocols that make multiple connections per session
   (e.g. HTTP/HTTPS). The implication of this point is that connection
   logging, despite potential significant technical and operational
   overhead, cannot guarantee that the information retained is
   sufficient to identify an individual suspect, even when all required
   records are available.

   Finally, the privacy concerns arising from connection logging in this
   scenario have been repeatedly raised [RFC6888] and
   [I-D.ietf-behave-ipfix-nat-logging].

   In summary, it is certainly clear that operators of address sharing
   infrastructure need to retain records to enable the identification of
   suspects, and such records must consist of, at least, sufficient
   information to identify an individual subscriber when provided with a
   timestamp, source IP, source port and destination IP. However, there
   is no centralised solution available that removes the need for server
   operators to retain source port information.

4. Challenges to Capturing Source Port

   It is relatively easy to articulate the reason why the operator of an
   Internet-facing server would wish to retain source port information
   for incoming connections.
   If the server operator (or one of the users that they serve) falls
   victim to a crime, it is preferable that all information that could
   be needed by the server operator to facilitate a criminal
   investigation is available. On the other hand, there are reasons why
   a server operator might not have the required source port
   information. This section enumerates the factors that could
   negatively influence both the ability and the inclination of server
   operators to capture and record source port information.

4.1. Lack of Awareness

   Server operators are principally focussed on delivering the services
   for which they are operating their infrastructure. One of the main
   problems with the increasing use of IP address sharing technologies
   is the lack of awareness on the part of server operators that there
   are direct implications for them in case they should become the
   victim of a crime.

   At the time of writing, a minimal amount of material is available
   online concerning this issue, even for those actively seeking to find
   out about source port logging. Where specific guidance or
   information has been provided by vendors in relation to the
   configuration of source port logging, no explanation is provided for
   why server operators might consider this desirable. See, for
   example, [MSDN_IIS_LOG].

   There is, therefore, a considerable awareness gap between the
   importance of this issue for the purpose of investigating criminal
   activity online and the awareness of those who need to act in advance
   of any criminality taking place to ensure that the information needed
   to facilitate a future investigation is available.

4.2. Lack of Support for Logging Source Port

   Before a server operator can decide to log source port information,
   the server software must support logging of the source port of
   incoming connections.
   Many, but not all, major software distributions support the logging
   of the source port of incoming connections. Clearly, lack of support
   in server software is a technical obstacle to a server operator
   logging source port at the endpoint. It may still be possible to log
   source port at some location before the server endpoint (e.g. at a
   reverse proxy) but absence of support in server software will mean
   that endpoint logging will not be possible.

4.3. Additional Storage Requirements

   In cases where it is possible to simply add source port to the list
   of fields recorded in log entries, the additional storage required to
   preserve source port data is minimal; in the region of six bytes per
   log entry (maximum of five ASCII digits for the source port plus an
   additional delimiter).

   However, in some cases where software supports logging source port of
   incoming connections, it has been noted that this can only be
   achieved by enabling verbose or debug logging in the software. This
   would substantially (and unnecessarily) increase the size of logs
   produced by the server and would also, in all probability, reduce the
   production performance of the server. These factors would
   undoubtedly negatively influence the decision by a server operator to
   log incoming source port.

4.4. Default Log Formats

   Many major software distributions provide default log formats in
   their configuration files. A review of the default log format of
   some common server software has been carried out and in only one case
   was it found that the source port of incoming connections is logged
   by any of the default log formats.

4.5. Breaking Existing Tooling

   Much commercial and free log analysis software, by default, expects
   logs to be in a particular format. Consider, for example, the
   ubiquity of the Apache Common and Extended Log Formats.
   The software can usually be configured to parse arbitrary log
   formats, but this is additional configuration work for a server
   operator. See, for example, [ANALOG_LOG_CONFIG] and
   [AWSTATS_LOG_CONFIG]. Without migration planning, a change to
   default log formats would most likely cause substantial disruption
   to a considerable amount of downstream processing of server log
   files. In addition to commercially available software, many
   administrators have developed or downloaded scripts that expect logs
   to be in a standard log format.

   Therefore, log processing software, and in particular custom scripts,
   may break if default log formats change unexpectedly. At the very
   least, the tooling may need to be updated to correctly process the
   additional fields newly present in the log file.

4.6. Accuracy of Recorded Time

   As well as recording the IP address and source port of the
   connection, it is important to record the exact time of the
   connection. It has been suggested that there is a need for keeping
   the exact time against some sort of global standard (e.g. NTP)
   [RFC6302]; however, this may not be possible for practical, security
   or legacy reasons. In practice, it is usually not necessary to keep
   time against a global standard, as long as time is recorded
   consistently. The reason for this is that any time offset between
   the server and the time recorded in another organisation's records
   (running address sharing infrastructure) can be calculated and
   compensated for manually. Time offsets of this nature are commonly
   encountered and well understood in the digital forensics world.

4.7. Translation of Source Port by Endpoint Infrastructure

   It is common for an incoming connection to terminate somewhere other
   than the actual server that is ultimately handling the connection.
   Load balancers, proxies or denial of service countermeasures may be
   present to improve the efficiency or availability of the platform,
   any one of which could potentially terminate the incoming connection.
   The operation of these types of endpoint infrastructure can cause
   translation of the incoming connection parameters, including source
   port, before the connection is established to the actual server
   endpoint.

   In such cases the source port logged at the server endpoint is a
   source port that only has meaning within the endpoint infrastructure
   and in most cases will not carry any information about the source
   port in use at the connection origin, in this case the connection
   origin being the large-scale address sharing infrastructure. In the
   worst case scenario (from a crime attribution point of view), the
   endpoint infrastructure may obfuscate the true source connection
   information in a way that is unrecoverable.

5. Comparison Model

   A model has been developed to assist with comparison of the maturity
   of server software deployments to store and retrieve source port
   information for incoming connections. The model is depicted in
   Figure 1.

   +-------------------------------------------------------------+
   | Possible -> Feasible -> Default -> Manageable -> Accessible |
   +-------------------------------------------------------------+

                               Figure 1

   o  "Possible": Means that the server software supports, in any way,
      the ability to record source ports for incoming connections.

   o  "Feasible": Means that there are no significant performance or
      storage implications for enabling the storage of source ports.

   o  "Default": Means that at least one of the default log formats
      provided with the software distribution enables the storage of
      source ports.

   o  "Manageable": Means that tooling is, or has been, built or adapted
      to support the storage of source ports.

   o  "Accessible": Means that it is possible to identify and retrieve
      relevant records in the stored log data.

6. Support for Logging Source Port

   Open-source research has been conducted to assess the status of
   support for logging of source port information in common server
   software.

   The assessment criteria were as follows:

   o  Server software is categorised as "Possible" if there was any way
      identified to cause the logging of source port.

   o  Server software is categorised as "Feasible" if logging source
      port does not require increasing the log level. In other words,
      if a server requires enabling verbose, debug or audit logging in
      order to be able to record source port then logging is "Possible"
      but not "Feasible".

   o  Server software is categorised as "Default" if at least one of the
      available default log formats enables logging of the incoming
      source port, or if source port is logged by default.

   o  The "Manageable" and "Accessible" aspects of the comparison model
      relate to specific deployments and are therefore not considered in
      the assessment of server software support.

   The latest versions of 16 common server software packages have been
   examined and online documentation has been researched to identify if
   and how source port logging can be enabled. The findings are
   described in Appendix A.
   The results are presented in the following table:

   +----------+----------+---------+------------+------------+
   | Possible | Feasible | Default | Manageable | Accessible |
   +----------+----------+---------+------------+------------+
   |    13    |    11    |    1    |    N/A     |    N/A     |
   +----------+----------+---------+------------+------------+

                        Table 1: Support Table

   It was noted that only one of the server software packages examined
   (OpenSSH version 7.5) enables the logging of incoming source port by
   default. This conclusion has been reached despite using the most
   generous possible interpretation of "Default", whereby meeting the
   criteria for "Default" is achieved when logging of source port is
   offered as a possible default, rather than requiring that logging of
   source port is enabled by default. In due course, as awareness of
   this issue increases, it is envisioned that a stricter interpretation
   of "Default" would be more appropriate, requiring that the logging of
   source port be enabled by default.

7. Recommendations

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The recommendations presented below are courses of action that have
   been identified based on the current state of source port logging and
   the challenges described above.

7.1. Raise Awareness of the Importance of Logging Source Port

   Publishers of both free and commercial software SHOULD release
   deployment guidance or best practice that describes why server
   administrators need to record source port information, with
   instructions for how this can be done. This will help to address the
   lack of awareness of the importance of this issue.

   Considering also the awareness of those who are building software
   applications, or otherwise involved with coding of Internet-facing
   applications, secure coding guidance SHOULD be updated to include
   reference to source port information, particularly where such
   guidance already touches on the issue of logging. For example, the
   OWASP Secure Coding Practices specifies a list of important log event
   data [OWASP_SCP]. However, the "important log event data" list does
   not, at the time of writing, include source port.

7.2. Increase Support for Logging Source Port

   Many software packages support logging of source port information,
   but only ten out of the sixteen examined support logging in a way
   that would not significantly negatively impact the operation of the
   server software. Software publishers therefore need to consider
   their level of support of logging source port. In particular,
   software SHOULD support the logging of source port and SHOULD do so
   in a way that does not substantially impact on production
   performance.

7.3. Update Default Log Formats

   In cases where a software package has support for logging of incoming
   source port, the configuration SHOULD incorporate one or more
   optional log formats that include incoming source port as a field
   logged by default. This will not have any impact on deployments of
   the software that are already in place, but for future deployments
   the incorporation of source port into "out of the box" log formats
   will mean that those administrators using unaltered default log
   formats will automatically store the needed information. Software
   vendors SHOULD provide a default log format that includes logging of
   source port, as described in this document.
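
   As an illustration of what a log entry carrying source port could
   look like, the following Python sketch formats one line per
   connection. It is not taken from any server software: the function
   name, the field order and the timestamp layout are hypothetical
   choices made for this example, and the address shown is from the
   192.0.2.0/24 documentation range.

```python
from datetime import datetime, timezone


def format_log_entry(src_ip, src_port, when, request, status):
    """Return one log line that records source port next to source IP.

    The layout (IP, port, UTC timestamp, request, status) is a
    hypothetical format chosen for this sketch.
    """
    if not 0 <= src_port <= 65535:
        raise ValueError("source port out of range")
    # One-second granularity, expressed in UTC.
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f'{src_ip} {src_port} [{stamp}] "{request}" {status}'


entry = format_log_entry(
    "192.0.2.10",
    51234,
    datetime(2018, 4, 11, 12, 0, 5, tzinfo=timezone.utc),
    "GET / HTTP/1.1",
    200,
)
print(entry)
```

   Adding the port as its own field costs at most five digits plus a
   delimiter per entry, but it does change the field positions that
   existing log parsers may rely on.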

   An alternative approach, taking into account the fact that changes to
   log formats might break downstream tooling, would be to configure
   parallel logging of connection information to a separate log stream.

   This would also be a possible solution that could be used by those
   server software types that log via syslog. In this case, software
   publishers SHOULD produce guidance on how to configure syslog to log
   connection information parallel to the main log files. Such a
   solution would help to ease the transition to an alternate log format
   since current log formats would not need to be changed because the
   required source port information is stored separately, but can still
   be correlated with the main log files if needed.

7.4. Adequate Timestamp Accuracy in Logs

   In order to query their records, operators of large-scale address
   sharing infrastructure will usually need connection times specified
   with at least the granularity of a second. Consideration should be
   given by server operators to making sure that the times recorded in
   their log files have sufficient accuracy to allow identification of
   the required records. Server software SHOULD be able to log time
   with at least the granularity of a second.

   There are many reasons why it may not be possible for servers to
   record logs with reference to a global time source. This could
   include scenarios such as security sensitive networks, or internal
   production networks. As long as times are recorded consistently, it
   should be possible to measure the offset from a traceable global time
   source (if required) for the purposes of querying records at another
   source. If the entity controlling the server is aware that there is
   an offset required to synchronise with a global time source, it is
   expected that the offset would be indicated by the entity while the
   logs were being collected.
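
   The offset compensation described above amounts to a single
   subtraction. The following Python sketch shows one way to express
   it, assuming the offset has already been measured as "server clock
   minus reference clock"; the function name and the 37-second offset
   are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_reference_time(logged, server_offset):
    """Convert a server-logged time to the reference time scale.

    server_offset is (server clock - reference clock), so a positive
    value means the server clock runs ahead of the reference.
    """
    return logged - server_offset


# Time as written in the server log.
logged = datetime(2018, 4, 11, 12, 0, 5, tzinfo=timezone.utc)

# Assumed measurement: the server clock is 37 seconds fast.
offset = timedelta(seconds=37)

corrected = to_reference_time(logged, offset)
print(corrected.isoformat())
```

   The corrected value, rather than the raw logged value, is what
   would be supplied when querying the records of an address sharing
   operator whose clocks follow the reference time source.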
602 Adequate timestamp accuracy also needs to be considered by software
603 developers when they are producing software. Although the recording
604 of time is mentioned in the OWASP Secure Coding Practices, the
605 required accuracy/granularity of the recorded time is not discussed
606 [OWASP_SCP]. Development guidance SHOULD clarify that times need to
607 be recorded with at least the granularity of a second.

609 7.5. Source Port Translation in Endpoint Infrastructure

611 In cases where endpoint infrastructure terminates incoming
612 connections (proxies, load balancers, etc.), and the infrastructure
613 translates incoming source port information, there is a risk that the
614 important crime attribution information may be lost. One possibility
615 is to log source port information at the endpoint infrastructure and
616 this may be an appropriate solution in some cases. However, this may
617 lead to an excessive volume of logging, depending on the particular
618 scenario. For example, if the intermediate infrastructure is being
619 used to mitigate DDoS attacks, logging all incoming traffic would
620 potentially lead to logging of all incoming DDoS connections. This
621 would clearly be an undesirable outcome.

623 An alternative solution is to pass information about the original
624 connection (before mapping/translation of connection information
625 takes place) to the actual endpoint. Solutions to achieve this
626 already exist for certain application layer protocols. The Forwarded
627 HTTP Extension [RFC7239], for example, supports (as an optional
628 feature) the transfer of source port information in the "for"
629 parameter of the "Forwarded" header field, and this technique can
630 also support multiple layers of proxying without loss of
631 attribution. Therefore, endpoint infrastructure that translates
632 source ports SHOULD pass the original connection information
633 through to the Internet-facing server for logging purposes.

635 8.
IANA Considerations

637 This memo includes no request to IANA.

639 9. Security Considerations

641 Clearly a balance needs to be struck between the individual right to
642 privacy and law enforcement access to data during criminal
643 investigations. On the one hand, the routine logging of any
644 additional information has the potential to introduce risks related
645 to privacy and human rights. On the other hand, there is a societal,
646 crime prevention requirement to address the information gap created
647 by large-scale address sharing technologies. Across the world there
648 is also a broad spectrum of legislative regimes and human rights
649 challenges, interpretation of which relates directly to this question.

651 IP addresses are routinely logged today and this information can be
652 used for identification of people online in some cases. The cases in
653 which an IP address does not identify an individual directly are
654 not necessarily apparent to the person performing the logging (who
655 cannot tell, for example, if the true source of the traffic is behind
656 a NAT or other form of proxy) and the same is true even if source
657 port is logged. It is not apparent that there is any additional risk
658 to individual privacy between the case when a single piece of
659 endpoint identifying information (source IP address) is logged versus
660 the case when two pieces of endpoint identifying information (source
661 IP address and source port) are logged. Balancing this against the
662 significant advantages from the crime attribution point of view
663 suggests that this may be a worthwhile approach.

665 10. Acknowledgements

667 Several members of the v6ops mailing list provided valuable feedback
668 and discussion on early drafts of this document. In particular, Tom
669 Herbert, Ca By, Ole Troan, Lee Howard, Erik Nygren, Fred Baker,
670 Fernando Gont, Gert Doering, Mark Smith, Jordi Palet Martinez, DY
671 Kim, Mark Andrews and T. Petch.
Special acknowledgement also goes
672 to Mohamed Boucadair who has provided ongoing feedback throughout the
673 document development process.

675 11. References

677 11.1. Informative References

679 [I-D.ietf-behave-ipfix-nat-logging]
680           Sivakumar, S. and R. Penno, "IPFIX Information Elements for
681           logging NAT Events", draft-ietf-behave-ipfix-nat-logging-13
682           (work in progress), January 2017.

684 [I-D.shirasaki-nat444]
685           Yamagata, I., Shirasaki, Y., Nakagawa, A., Yamaguchi, J.,
686           and H. Ashida, "NAT444", draft-shirasaki-nat444-06 (work
687           in progress), July 2012.

689 11.2. Normative References

691 [ANALOG_LOG_CONFIG]
692           Analog, "Analog 6.0: Log formats", 2017,
693           .

695 [AWSTATS_LOG_CONFIG]
696           AWStats, "AWStats Installation, Configuration and
697           Reporting (for version 7.6)", 2017,
698           .

700 [EUROPOL_IOCTA]
701           Europol, "The Internet Organised Crime Threat Assessment",
702           2016, .

706 [MSDN_IIS_LOG]
707           Microsoft, "IIS 8.5 - How to log client port number",
708           2015, .

711 [OWASP_SCP]
712           OWASP, "OWASP Secure Coding Practices Quick Reference
713           Guide", 2010, .

716 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
717           Requirement Levels", BCP 14, RFC 2119,
718           DOI 10.17487/RFC2119, March 1997,
719           .

721 [RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
722           "Network Time Protocol Version 4: Protocol and Algorithms
723           Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010,
724           .

726 [RFC6146] Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful
727           NAT64: Network Address and Protocol Translation from IPv6
728           Clients to IPv4 Servers", RFC 6146, DOI 10.17487/RFC6146,
729           April 2011, .

731 [RFC6269] Ford, M., Ed., Boucadair, M., Durand, A., Levis, P., and
732           P. Roberts, "Issues with IP Address Sharing", RFC 6269,
733           DOI 10.17487/RFC6269, June 2011,
734           .

736 [RFC6302] Durand, A., Gashinsky, I., Lee, D., and S.
Sheppard,
737           "Logging Recommendations for Internet-Facing Servers",
738           BCP 162, RFC 6302, DOI 10.17487/RFC6302, June 2011,
739           .

741 [RFC6333] Durand, A., Droms, R., Woodyatt, J., and Y. Lee,
742           "Dual-Stack Lite Broadband Deployments Following IPv4
743           Exhaustion", RFC 6333, DOI 10.17487/RFC6333, August 2011,
744           .

746 [RFC6346] Bush, R., Ed., "The Address plus Port (A+P) Approach to
747           the IPv4 Address Shortage", RFC 6346,
748           DOI 10.17487/RFC6346, August 2011,
749           .

751 [RFC6888] Perreault, S., Ed., Yamagata, I., Miyakawa, S., Nakagawa,
752           A., and H. Ashida, "Common Requirements for Carrier-Grade
753           NATs (CGNs)", BCP 127, RFC 6888, DOI 10.17487/RFC6888,
754           April 2013, .

756 [RFC7239] Petersson, A. and M. Nilsson, "Forwarded HTTP Extension",
757           RFC 7239, DOI 10.17487/RFC7239, June 2014,
758           .

760 [RFC7422] Donley, C., Grundemann, C., Sarawat, V., Sundaresan, K.,
761           and O. Vautrin, "Deterministic Address Mapping to Reduce
762           Logging in Carrier-Grade NAT Deployments", RFC 7422,
763           DOI 10.17487/RFC7422, December 2014,
764           .

766 [RFC7596] Cui, Y., Sun, Q., Boucadair, M., Tsou, T., Lee, Y., and I.
767           Farrer, "Lightweight 4over6: An Extension to the
768           Dual-Stack Lite Architecture", RFC 7596,
769           DOI 10.17487/RFC7596, July 2015, .

771 [RFC7597] Troan, O., Ed., Dec, W., Li, X., Bao, C., Matsushima, S.,
772           Murakami, T., and T. Taylor, Ed., "Mapping of Address and
773           Port with Encapsulation (MAP-E)", RFC 7597,
774           DOI 10.17487/RFC7597, July 2015,
775           .

777 [RFC7599] Li, X., Bao, C., Dec, W., Ed., Troan, O., Matsushima, S.,
778           and T. Murakami, "Mapping of Address and Port using
779           Translation (MAP-T)", RFC 7599, DOI 10.17487/RFC7599, July
780           2015, .

782 [RFC7620] Boucadair, M., Ed., Chatras, B., Reddy, T., Williams, B.,
783           and B. Sarikaya, "Scenarios with Host Identification
784           Complications", RFC 7620, DOI 10.17487/RFC7620, August
785           2015, .

787 [RFC7768] Tsou, T., Li, W., Taylor, T., and J.
Huang, "Port
788           Management to Reduce Logging in Large-Scale NATs",
789           RFC 7768, DOI 10.17487/RFC7768, January 2016,
790           .

792 Appendix A. Support for Source Port Logging in Various Server Software

794 The table below enumerates the findings of a best-effort, open-source
795 review of the documentation of the various products. Where the table
796 indicates that it is not possible to log source port, either (a) no
797 reference has been identified in online documentation to indicate
798 how source port logging can be enabled, or (b) a reference positively
799 indicating that logging of source port is not possible has been
800 found.

802 +----------+------------+------------+----------+----------+---------+
803 | Category | Server     | Version    | Possible | Feasible | Default |
805 +----------+------------+------------+----------+----------+---------+
806 | HTTP     | Apache     | 2.4.25     | Yes      | Yes      | No      |
807 |          | HTTPD      |            |          |          |         |
808 | HTTP     | IIS        | 10         | Yes      | Yes      | No      |
809 | HTTP     | Tomcat     | 8.5.15     | Yes      | Yes      | No      |
810 | HTTP     | Squid      | 3.5.25     | Yes      | Yes      | No      |
811 | HTTP     | nginx      | 1.12.0     | Yes      | Yes      | No      |
812 | Mail     | sendmail   | 8.15.2     | Yes      | Yes      | No      |
813 | Mail     | Microsoft  | 2016       | Yes      | No       | No      |
814 |          | Exchange   |            |          |          |         |
815 |          | Server     |            |          |          |         |
816 | Mail     | Postfix    | 2.10.0     | Yes      | Yes      | No      |
817 | Mail     | Exim       | 4.89       | Yes      | Yes      | No      |
818 | Mail     | Dovecot    | 2.2.30.1   | Yes      | Yes      | No      |
819 | Mail     | UW IMAP    | imap-2007f | No       | No       | No      |
820 | DBase    | Oracle     | 12.2.0.1   | No       | No       | No      |
821 | DBase    | MySQL      | 5.7.18     | No       | No       | No      |
822 | DBase    | Microsoft  | 2016       | Yes      | No       | No      |
823 |          | SQL Server |            |          |          |         |
824 | DBase    | PostgreSQL | 9.6.3      | Yes      | Yes      | No      |
825 | SSH      | OpenSSHD   | 7.5        | Yes      | Yes      | Yes     |
826 +----------+------------+------------+----------+----------+---------+

828 Table 2: Support for Logging Incoming Source Port

830 Author's Address

832 David O'Reilly
833 Ireland

835 Email: rfc@daveor.com