Internet Engineering Task Force                              D. O'Reilly
Internet-Draft                                           January 3, 2018
Intended status: Informational
Expires: July 7, 2018

   Approaches to Address the Availability of Information in Criminal
  Investigations Involving Large-Scale IP Address Sharing Technologies
                      draft-daveor-cgn-logging-02

Abstract

   The use of large-scale IP address sharing technologies (commonly
   known as "Carrier-Grade NAT" and "A+P") presents a challenge for law
   enforcement agencies due to the fact that incoming source port
   information is not routinely logged by Internet-facing servers.
   The absence of this information means that it is becoming
   increasingly difficult for law enforcement agencies to identify
   suspects in criminal activity online. This document considers the
   reasons why source port information is not routinely logged by
   Internet-facing servers and proposes some immediate-term actions that
   can be taken to help improve the situation. A deployment maturity
   model has been developed and a study of the support for logging
   incoming source port information in common server software is also
   presented.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 7, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.
       Introduction
   2.  Scope
   3.  Centralised Connection Logging
   4.  Challenges to Capturing Source Port
     4.1.  Lack of Awareness
     4.2.  Lack of Support for Logging Source Port
     4.3.  Additional Storage Requirements
     4.4.  Default Log Formats
     4.5.  Breaking Existing Tooling
     4.6.  Accuracy of Recorded Time
     4.7.  Translation of Source Port by Intermediate Infrastructure
   5.  Comparison Model
   6.  Support for Logging Source Port
   7.  Conclusions and Next Steps
     7.1.  Raise Awareness of the Importance of Logging Source Port
     7.2.  Increase Support for Logging Source Port
     7.3.  Update Default Log Formats
     7.4.  Parallel Logging to a Connection Log
     7.5.  Adequate Timestamp Accuracy in Logs
     7.6.  Address Source Port Translation in Intermediate
           Infrastructure
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Informative References
     10.2.  Normative References
   Appendix A.  Support for Source Port Logging in Various Server
                Software
   Author's Address

1.
Introduction

   Large-scale IP address sharing technologies (often collectively
   referred to as "Carrier-Grade NAT" [RFC6888]) are a helpful tool for
   extending the life of IPv4 addresses by allowing multiple endpoints
   to share a small number of IPv4 addresses. A number of such
   technologies have been discussed and deployed, such as Dual-Stack
   Lite [RFC6333], NAT64 [RFC6146] and NAT444 [I-D.shirasaki-nat444]. A
   related category of technologies, known as "Address plus Port" or
   "A+P" [RFC6346], is also used for large-scale IP address sharing,
   achieved in these cases by using some of the port number bits for
   addressing purposes. Multiple examples of this category of
   technologies are also available, including Lightweight 4over6
   [RFC7596], MAP-E [RFC7597] and MAP-T [RFC7599].

   All of these technologies involve extending the space of available
   IPv4 addresses by mapping communication from multiple endpoints to a
   single shared address, or a small number of shared addresses, through
   the use of port numbers. The detail of how this is achieved in each
   technology varies, but the principle remains the same in all cases.

   From the perspective of a server on the Internet, endpoint traffic
   that has passed through IP address sharing infrastructure appears to
   be originating from the IP address of the address sharing appliance.
   Common practice at the present time is for servers to log the
   connection time and source IP address of incoming connections.
   However, the IP address of the address sharing appliance is not
   sufficient to identify the true source of the traffic because
   potentially hundreds or thousands of individual endpoints were using
   that IP address at the same time. If the need arises during a
   criminal investigation to identify the source of a specific
   connection, the source port and exact connection time will also be
   required.
   Without this additional information it is highly unlikely
   that it will be possible for law enforcement authorities to progress
   their investigations.

   Information is required from at least two sources to establish the
   link from the logs of an Internet-facing server to a specific
   subscriber endpoint:

   1.  The administrator of the Internet-facing server must have logged
       enough information to enable the operator of the IP address
       sharing infrastructure to isolate a specific subscriber endpoint.

   2.  The operator of the IP address sharing infrastructure must have
       logged sufficient information (for a sufficient length of time)
       to be able, when provided with adequate data by a law enforcement
       agency, to isolate the relevant subscriber endpoint.

   The operators of large-scale IP address sharing infrastructure,
   typically Internet Service Providers, are usually required by law to
   maintain records of which endpoint was using a particular IP address
   and port at a particular time. The period of time for which these
   records must be retained is defined by national legislation.

   Irrespective of whether (and for how long) these records are
   available, a starting point is needed to indicate to an investigating
   law enforcement agency that a particular endpoint was involved in a
   suspected criminal activity under investigation. Without such a
   starting point, it would be very difficult to progress the
   investigation even as far as engagement with the operator of the
   address sharing infrastructure. The records of Internet-facing
   servers are often a crucial source of this type of evidence.

   It has been recognised for some time that IP address sharing presents
   a challenge to the ability to trace network use and abuse [RFC7620].
   Further, it has also been recognised that this challenge is likely to
   become more severe and widespread with the increased use of
   large-scale address sharing [RFC6269]. More recently, Europol has
   highlighted the issue of large-scale IP address sharing as a threat
   to Internet governance [EUROPOL_IOCTA]. It is reported that the
   problem of crime attribution related to the use of carrier-grade NAT
   technologies is regularly encountered by 90% of respondents to a
   survey on the topic.

   Address sharing, including large-scale address sharing, is required
   as long as the use of IPv4 continues. Full deployment of IPv6 has
   the potential to ultimately eliminate the current attribution issues
   arising from the use of large-scale address sharing technologies,
   although presumably new attribution challenges will arise in that
   scenario. Since it is impossible to anticipate if or when full
   migration to IPv6 will take place, it is prudent to consider the
   implications of the transitionary technologies until the need for
   them has been eliminated.

2.  Scope

   Previous work has already suggested as best practice the logging by
   Internet-facing servers of source IP address, source port and exact
   connection time [RFC6302]. However, this continues to be
   exceptional, rather than routine, logging practice. The purpose of
   this document is to consider in more detail how it might be possible
   to bring about routine logging by Internet-facing servers of the
   information needed to re-establish the ability to trace network abuse
   for criminal investigative purposes. This document specifically does
   not address or consider the logging requirements of operators of
   large-scale address sharing infrastructure. Instead, the focus is on
   the logging considerations of operators of Internet-facing servers.
   The main contributions of this document are:

   1.
       To consider the reasons why source port logging is not routinely
       carried out.

   2.  To identify some possible solutions and workarounds for the
       reasons that source port logging is not routinely carried out.

   3.  To examine the feasibility of source port logging from the
       perspective of software support for this feature.

   Clearly no single solution will address the problem of crime
   attribution on the Internet. Load balancers, proxies and other
   network infrastructure may also, intentionally or as a side-effect,
   obfuscate the true source of Internet traffic and these problems will
   continue to exist with or without the presence of large-scale address
   sharing technologies (like Carrier-Grade NAT and A+P). Nevertheless,
   at the time of writing large-scale address sharing technologies
   present a significant challenge to crime attribution, as highlighted
   by Europol [EUROPOL_IOCTA], and this document attempts to consider
   the challenges specifically presented by that category of
   technologies.

   The discussion begins by considering whether centralised connection
   logging is a viable solution to the problem of subscriber
   identification in criminal investigations. This is followed by an
   examination of the reasons why source port logging is not currently
   routinely carried out. A model has been developed for the comparison
   of the maturity of various server deployments to log source port and
   a study of common server software has been performed to assess the
   status of support for this functionality. Many, but not all,
   enterprise server solutions that were examined made the logging of
   source port either "Possible" or "Feasible", as defined in the
   maturity model. Only one type of server software examined made the
   logging of source port "Default".

3.
Centralised Connection Logging

   When large-scale IP address sharing technologies are used, source IP
   address is no longer a sufficient identifier of an individual
   subscriber. At a minimum, source port and accurate timestamp
   information are also required to distinguish between the potentially
   large number of individual users of a specific IP address at a
   particular time. [RFC6269] points out that there are two solutions
   to the question of how adequate information can be recorded to
   identify the parties to a particular connection. They are:

   1.  Operators of IP address sharing infrastructure log mappings
       between (source IP address, source port) combinations and their
       subscribers. Server operators log the IP address and source port
       of incoming connections. This is referred to as source port
       logging.

   2.  Instead of relying on server operators to log the source port of
       incoming connections, operators of IP address sharing
       infrastructure log all combinations of (external IP address,
       external port, destination IP address) for outgoing connections.
       This is referred to as connection logging. Server operators log
       the IP address and timestamp of incoming connections, which is
       the common current practice.

   Two challenges to the use of connection logging by operators of IP
   address sharing infrastructure are also presented in [RFC6269].
   Briefly:

   o  The volumes of data involved make centralised recording of
      destination IP addresses infeasible.

   o  Many individuals using the same IP address to access a popular
      destination (e.g. a popular website) might mean that it is not
      possible to distinguish between the activity of one subscriber and
      another, even if connection records are kept by the operator of
      the address sharing infrastructure.

   The first issue raised is that the volumes of data involved make
   centralised recording of destination IP addresses infeasible.
   Whether destination IP addresses are recorded or not, the volume of
   logs generated by a large-scale IP address sharing infrastructure
   will be substantial, and some approaches have been proposed to
   address this hurdle and make central connection logging more
   feasible, such as deterministic allocation of ports [RFC6269],
   [RFC7422] or allocation of port ranges [RFC7768], [RFC6346]. While
   arguments of infeasibility are not arguments in principle why such
   logging cannot be done, the volumes of data involved in recording
   every single outgoing connection in a large Internet service provider
   represent legitimate technical, commercial and operational arguments
   for why it cannot work in practice. Some representative figures for
   the scales of data involved can be found in [RFC7422], wherein it is
   estimated that the logging overhead would be of the order of 150 MB
   per subscriber, per month. For a service provider with one million
   subscribers, this would produce a volume of logs (uncompressed) of
   the order of 150 terabytes per month. Aside from the technical
   overhead of storing such a volume of data, searching and locating
   relevant records over an extended, legally mandated retention period
   would also present a significant technical challenge.

   The second point raised in [RFC6269] against connection logging by
   operators of IP address sharing infrastructure is that, even if
   connection logs store all combinations of (timestamp, source IP,
   source port, destination IP), a query made without the source port
   (because the destination server did not record it) would not be
   sufficient to distinguish the activity of one individual from another
   in cases where the destination IP is a popular one. This problem is
   further exacerbated in the case of protocols that make multiple
   connections per session (e.g. HTTP/HTTPS).
   The implication of this point is that connection
   logging, despite potential significant technical and operational
   overhead, cannot guarantee that the information retained is
   sufficient to identify an individual suspect, even when all required
   records are available.

   Finally, the privacy concerns arising from connection logging in this
   scenario have been repeatedly raised, for example in [RFC6888] and
   [I-D.ietf-behave-ipfix-nat-logging].

   In summary, it is certainly clear that operators of address sharing
   infrastructure need to retain records to enable the identification of
   suspects, and such records must consist of, at least, sufficient
   information to identify an individual subscriber when provided with a
   timestamp, source IP, source port and destination IP. However, there
   is no centralised solution available that removes the need for server
   operators to retain source port information.

4.  Challenges to Capturing Source Port

   It is relatively easy to articulate the reason why the operator of an
   Internet-facing server would wish to retain source port information
   for incoming connections. If server operators (or the users that
   they serve) find themselves the victims of a crime, it is preferable
   that all information that could be needed by the server operator to
   facilitate a criminal investigation is available. On the other hand,
   there are reasons why a server operator might not have the required
   source port information. This section enumerates the factors that
   could negatively influence both the ability and the inclination of
   server operators to capture and record source port information.

4.1.  Lack of Awareness

   Server operators are principally focussed on delivering the services
   for which they are operating their infrastructure.
   One of the main
   problems with the increasing use of IP address sharing technologies
   is the lack of awareness on the part of server operators that there
   are direct implications for them in case they should become the
   victim of a crime.

   At the time of writing, a minimal amount of material is available
   online concerning this issue, even for those actively seeking to find
   out about source port logging. Where specific guidance or
   information has been provided by vendors in relation to the
   configuration of source port logging, no explanation is provided for
   why server operators might consider this desirable. See, for
   example, [MSDN_IIS_LOG].

   There is, therefore, a considerable gap between the importance of
   this issue for the purpose of investigating criminal activity online
   and the awareness of those who need to act in advance of any
   criminality taking place to ensure that the information needed to
   facilitate a future investigation is available.

4.2.  Lack of Support for Logging Source Port

   Before a server operator can decide to log source port information,
   the server software must support logging of the source port of
   incoming connections. Many, but not all, major software
   distributions support the logging of the source port of incoming
   connections. Clearly lack of support in server software is a
   technical obstacle for a server operator to logging source port at
   the endpoint. It may still be possible to log source port at some
   location before the server endpoint (e.g. at a reverse proxy) but
   absence of support in server software will mean that endpoint logging
   will not be possible.

4.3.
Additional Storage Requirements

   In cases where it is possible to simply add source port to the list
   of fields recorded in log entries, the additional storage required to
   preserve source port data is minimal: in the region of six bytes per
   log entry (maximum of five ASCII digits for the source port plus an
   additional delimiter).

   However, in some cases where software supports logging source port of
   incoming connections, it has been noted that this can only be
   achieved by enabling verbose or debug logging in the software. This
   would substantially (and unnecessarily) increase the size of logs
   produced by the server and would also, in all probability, reduce the
   production performance of the server. These factors would
   undoubtedly negatively influence the decision by a server operator to
   log incoming source port.

4.4.  Default Log Formats

   Many major software distributions provide default log formats in
   their configuration files. A review of the default log format of
   some common server software has been carried out and in only one case
   was it found that the source port of incoming connections is logged
   by any of the default log formats.

4.5.  Breaking Existing Tooling

   Much commercial and free log analysis software, by default, expects
   logs to be in a particular format. Consider, for example, the
   ubiquity of the Apache Common and Extended Log Formats. The software
   can usually be configured to parse arbitrary log formats, but this is
   additional configuration work for a server operator. See, for
   example, [ANALOG_LOG_CONFIG] and [AWSTATS_LOG_CONFIG]. Without
   migration planning, a change to default log formats would most likely
   cause substantial disruption to a considerable amount of downstream
   processing of server log files.
   In addition to commercially
   available software, many administrators have developed or downloaded
   scripts that expect logs to be in a standard log format.

   Therefore, log processing software, and in particular custom scripts,
   may break if default log formats change unexpectedly. At the very
   least, the tooling may need to be updated to correctly process the
   additional fields newly present in the log file.

4.6.  Accuracy of Recorded Time

   As well as recording the IP address and source port of the
   connection, it is important to record the exact time of the
   connection. It has been suggested that there is a need for keeping
   the exact time against some sort of global standard (e.g. NTP)
   [RFC6302]; however, this may not be possible for practical, security
   or legacy reasons. In practice, it is usually not necessary to keep
   time against a global standard, as long as time is recorded
   consistently. The reason for this is that any time offset between
   the server and the time recorded in another organisation's records
   (running address sharing infrastructure) can be calculated and
   compensated for manually. Time offsets of this nature are commonly
   encountered and well understood in the digital forensics world.

4.7.  Translation of Source Port by Intermediate Infrastructure

   It is common for an incoming connection to terminate somewhere other
   than the actual server that is intended to ultimately handle the
   connection. For example, it is possible that a server operator has
   deployed intermediate infrastructure to improve the efficiency or
   availability of their platform. Load balancers, proxies or denial of
   service countermeasures may be present, any one of which could
   potentially terminate the incoming connection.
   The operation of
   these types of intermediate infrastructure can cause translation of
   the incoming connection parameters (including source port) before the
   connection is established to the actual server endpoint.

   In such cases the source port presented at the server endpoint is a
   source port that only has meaning in the intermediate infrastructure
   and in most cases will not carry any information about the source
   port in use at the connection origin. In the worst case scenario
   (from the point of view of crime attribution), the intermediate
   infrastructure may obfuscate the true source connection information
   in a way that is unrecoverable.

5.  Comparison Model

   A model has been developed to assist with comparison of the maturity
   of server software deployments to store and retrieve source port
   information for incoming connections. The model is depicted in
   Figure 1.

   +-------------------------------------------------------------+
   | Possible -> Feasible -> Default -> Manageable -> Accessible |
   +-------------------------------------------------------------+

                               Figure 1

   o  "Possible": Means that the server software supports, in any way,
      the ability to record source ports for incoming connections.

   o  "Feasible": Means that there are no significant performance or
      storage implications for enabling the storage of source ports.

   o  "Default": Means that at least one of the default log formats
      provided with the software distribution enables the storage of
      source ports.

   o  "Manageable": Means that tooling is, or has been, built or adapted
      to support the storage of source ports.

   o  "Accessible": Means that it is possible to identify and retrieve
      relevant records in the stored log data.

6.
Support for Logging Source Port

   Open-source research has been conducted to assess the status of
   support for logging of source port information in common server
   software.

   The assessment criteria were as follows:

   o  Server software is categorised as "Possible" if there was any way
      identified to cause the logging of source port.

   o  Server software is categorised as "Feasible" if the logging of
      source port does not require increasing the log level. In other
      words, if a server requires enabling verbose, debug or audit
      logging in order to be able to record source port then logging is
      "Possible" but not "Feasible".

   o  Server software is categorised as "Default" if at least one of the
      available default log formats enables logging of the incoming
      source port, or if source port is logged by default.

   o  The "Manageable" and "Accessible" aspects of the comparison model
      relate to specific deployments and are therefore not considered in
      the assessment of server software support.

   The latest versions of 16 common server software packages have been
   examined, and their online documentation has been researched to
   identify if and how source port logging can be enabled. The findings
   are described in Appendix A. The results are presented in the
   following table:

   +----------+----------+---------+------------+------------+
   | Possible | Feasible | Default | Manageable | Accessible |
   +----------+----------+---------+------------+------------+
   | 13       | 11       | 1       | N/A        | N/A        |
   +----------+----------+---------+------------+------------+

                        Table 1: Support Table

   It was noted that only one of the server software packages examined
   (OpenSSH version 7.5) enables the logging of incoming source port by
   default.
   This conclusion has been reached despite using the most
   generous possible interpretation of "Default", whereby meeting the
   criteria for "Default" is achieved when logging of source port is
   offered as a possible default, rather than requiring that logging of
   source port is enabled by default. In due course, as awareness of
   this issue increases, it is envisioned that a stricter interpretation
   of "Default" would be more appropriate, requiring that the logging of
   source port be enabled by default.

7.  Conclusions and Next Steps

   There is clearly substantial work to be done to bring about the
   regular recording of source port information at Internet-facing
   servers and there are undoubtedly criminals free right now because
   the information required to identify them from their online activity
   is not available.

   The next steps presented below are some possible courses of action
   that have been identified based on the current state of source port
   logging and the challenges described above.

7.1.  Raise Awareness of the Importance of Logging Source Port

   Publishers of both free and commercial software should consider
   releasing deployment guidance or best practice that describes why
   server administrators need to be recording source port information,
   with instructions for how this can be done. This will help to
   address the lack of awareness of the importance of this issue.

   Considering also the awareness of those who are building software
   applications, or otherwise involved with coding of Internet-facing
   applications, secure coding guidance should be updated to include
   reference to source port information, particularly where such
   guidance already touches on the issue of logging. For example the
   OWASP Secure Coding Practices specifies a list of important log event
   data [OWASP_SCP].
   However, the "important log event data" list does
   not, at the time of writing, include source port.

7.2.  Increase Support for Logging Source Port

   Many software packages support logging of source port information,
   but only ten out of the sixteen examined support logging in a way
   that would not significantly negatively impact the operation of the
   server software. Software publishers therefore need to consider
   their level of support for logging source port. In particular,
   software should support the logging of source port without needing to
   enable a verbose logging level.

7.3.  Update Default Log Formats

   In cases where a particular software package has support for logging
   of incoming source port, one possibility would be to incorporate one
   or more log formats that include incoming source port as a field
   logged by default. Obviously this will not have any impact on
   deployments of the software that are already in place but for future
   deployments, the incorporation of source port into the log format
   will mean that those administrators that use the unaltered default
   log format will automatically store the required information.

7.4.  Parallel Logging to a Connection Log

   Where possible, configuring parallel logging of connection
   information to a separate log stream would be one possible solution
   to address the fact that changes to log format might break downstream
   tooling. This would also be a possible solution that could be used
   by those server software types that log via syslog. In this case,
   software publishers could produce guidance on how to configure syslog
   to log connection information in parallel to the main log files.
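
   The idea of a separate connection log can be illustrated with a
   minimal sketch. It is not taken from any of the server software
   surveyed in this document. It assumes a Python application using the
   standard "logging" module; the function names, the logger name
   "connection_log", the field order and the in-memory stream standing
   in for a dedicated log file are all hypothetical choices made for
   illustration only. The address used is a documentation address.

```python
import io
import logging
from datetime import datetime, timezone


def format_connection_record(when, src_ip, src_port, dst_port):
    """Build one connection-log line: UTC time, source IP, source port, local port."""
    # Second-level granularity, recorded consistently in UTC.
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} {src_ip} {src_port} {dst_port}"


def make_connection_logger(stream, name="connection_log"):
    """Return a logger that writes only to `stream`, separate from the main log."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these records out of the main (root) log
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


# Example: record one accepted connection in the parallel log.
connection_stream = io.StringIO()  # hypothetical stand-in for a dedicated log file
conn_log = make_connection_logger(connection_stream)
accepted_at = datetime(2018, 1, 3, 12, 0, 5, tzinfo=timezone.utc)
conn_log.info(format_connection_record(accepted_at, "192.0.2.10", 51234, 443))
```

   Because the connection logger does not propagate to the root logger,
   the existing main log format is left untouched, while each
   connection record carries the timestamp, source IP address and
   source port that allow it to be matched against the main log.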
   Such a solution would help to ease the transition to an alternate
   log format: current log formats would not need to be changed because
   the required source port information is stored separately, but it
   can still be correlated with the main log files if needed.

7.5.  Adequate Timestamp Accuracy in Logs

   Operators of large-scale address sharing infrastructure will most
   likely need connection times specified with at least the granularity
   of a second.  Most, but not all, server software will log times with
   this granularity by default, but there is no guarantee that this is
   the case.

   Server operators should make sure that the times recorded in their
   log files have sufficient accuracy to allow identification of the
   required records.  As mentioned earlier, the times do not
   necessarily need to be recorded with reference to a centralised time
   source (e.g., NTP) as long as times are recorded consistently.

   This factor also needs to be considered by software developers.
   Although the recording of time is mentioned in the OWASP Secure
   Coding Practices, the required accuracy/granularity of the recorded
   time is not discussed [OWASP_SCP].

7.6.  Address Source Port Translation in Intermediate Infrastructure

   In the cases described above where intermediate infrastructure
   (proxies, load balancers, etc.) terminates incoming connections and
   translates incoming source port information, there is a risk that
   important crime attribution information may be lost.  One
   possibility is to log source port information at the intermediate
   infrastructure, and this may be an appropriate solution in some
   cases.  The problem is that this may lead to an excessive volume of
   logging, depending on the particular scenario.
   For example, if the intermediate infrastructure is being used to
   mitigate DDoS attacks, logging all incoming traffic would
   potentially lead to logging of all incoming DDoS connections.  This
   would clearly be an undesirable outcome.

   An alternative solution is to pass information about the original
   connection (before mapping/translation of connection information
   takes place) to the actual endpoint.  Solutions to achieve this
   already exist for certain application-layer protocols.  The
   Forwarded HTTP Extension [RFC7239], for example, supports (as an
   optional feature) the transfer of source port information in the
   "for" parameter of the "Forwarded" header, and this technique can
   also support multiple layers of proxying without loss of
   attribution.

8.  IANA Considerations

   This memo includes no request to IANA.

9.  Security Considerations

   Clearly a balance needs to be struck between the individual right to
   privacy and law enforcement access to data during criminal
   investigations.  On the one hand, the routine logging of any
   additional information has the potential to introduce risks related
   to privacy and human rights.  On the other hand, it is fair to say
   that there are criminals free today because the data required to
   identify them is not available due to the use of large-scale address
   sharing technologies.  Across the world there is also a broad
   spectrum of legislative regimes and human rights challenges, the
   interpretation of which relates directly to this question.

   IP addresses are routinely logged today and this information can be
   used for identification of people online in some cases.
   The cases in which an IP address does not identify an individual
   directly are not necessarily apparent to the person performing the
   logging (who cannot tell, for example, if the true source of the
   traffic is behind a NAT or other form of proxy) and the same is true
   even if source port is logged.  It is not apparent that there is any
   additional risk to individual privacy between the case when a single
   piece of endpoint identifying information (source IP address) is
   logged versus the case when two pieces of endpoint identifying
   information (source IP address and source port) are logged.
   Balancing this against the significant advantages from the crime
   attribution point of view suggests that this may be a worthwhile
   approach.

10.  References

10.1.  Informative References

   [I-D.ietf-behave-ipfix-nat-logging]
              Sivakumar, S. and R. Penno, "IPFIX Information Elements
              for logging NAT Events",
              draft-ietf-behave-ipfix-nat-logging-13 (work in
              progress), January 2017.

   [I-D.shirasaki-nat444]
              Yamagata, I., Shirasaki, Y., Nakagawa, A., Yamaguchi, J.,
              and H. Ashida, "NAT444", draft-shirasaki-nat444-06 (work
              in progress), July 2012.

10.2.  Normative References

   [ANALOG_LOG_CONFIG]
              Analog, "Analog 6.0: Log formats", 2017.

   [AWSTATS_LOG_CONFIG]
              AWStats, "AWStats Installation, Configuration and
              Reporting (for version 7.6)", 2017.

   [EUROPOL_IOCTA]
              Europol, "The Internet Organised Crime Threat Assessment",
              2016.

   [MSDN_IIS_LOG]
              Microsoft, "IIS 8.5 - How to log client port number",
              2015.

   [OWASP_SCP]
              OWASP, "OWASP Secure Coding Practices Quick Reference
              Guide", 2010.

   [RFC6146]  Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful
              NAT64: Network Address and Protocol Translation from IPv6
              Clients to IPv4 Servers", RFC 6146, DOI 10.17487/RFC6146,
              April 2011.

   [RFC6269]  Ford, M., Ed., Boucadair, M., Durand, A., Levis, P., and
              P. Roberts, "Issues with IP Address Sharing", RFC 6269,
              DOI 10.17487/RFC6269, June 2011.

   [RFC6302]  Durand, A., Gashinsky, I., Lee, D., and S. Sheppard,
              "Logging Recommendations for Internet-Facing Servers",
              BCP 162, RFC 6302, DOI 10.17487/RFC6302, June 2011.

   [RFC6333]  Durand, A., Droms, R., Woodyatt, J., and Y. Lee,
              "Dual-Stack Lite Broadband Deployments Following IPv4
              Exhaustion", RFC 6333, DOI 10.17487/RFC6333, August 2011.

   [RFC6346]  Bush, R., Ed., "The Address plus Port (A+P) Approach to
              the IPv4 Address Shortage", RFC 6346,
              DOI 10.17487/RFC6346, August 2011.

   [RFC6888]  Perreault, S., Ed., Yamagata, I., Miyakawa, S., Nakagawa,
              A., and H. Ashida, "Common Requirements for Carrier-Grade
              NATs (CGNs)", BCP 127, RFC 6888, DOI 10.17487/RFC6888,
              April 2013.

   [RFC7239]  Petersson, A. and M. Nilsson, "Forwarded HTTP Extension",
              RFC 7239, DOI 10.17487/RFC7239, June 2014.

   [RFC7422]  Donley, C., Grundemann, C., Sarawat, V., Sundaresan, K.,
              and O. Vautrin, "Deterministic Address Mapping to Reduce
              Logging in Carrier-Grade NAT Deployments", RFC 7422,
              DOI 10.17487/RFC7422, December 2014.

   [RFC7596]  Cui, Y., Sun, Q., Boucadair, M., Tsou, T., Lee, Y., and I.
              Farrer, "Lightweight 4over6: An Extension to the
              Dual-Stack Lite Architecture", RFC 7596,
              DOI 10.17487/RFC7596, July 2015.

   [RFC7597]  Troan, O., Ed., Dec, W., Li, X., Bao, C., Matsushima, S.,
              Murakami, T., and T. Taylor, Ed., "Mapping of Address and
              Port with Encapsulation (MAP-E)", RFC 7597,
              DOI 10.17487/RFC7597, July 2015.

   [RFC7599]  Li, X., Bao, C., Dec, W., Ed., Troan, O., Matsushima, S.,
              and T. Murakami, "Mapping of Address and Port using
              Translation (MAP-T)", RFC 7599, DOI 10.17487/RFC7599,
              July 2015.

   [RFC7620]  Boucadair, M., Ed., Chatras, B., Reddy, T., Williams, B.,
              and B. Sarikaya, "Scenarios with Host Identification
              Complications", RFC 7620, DOI 10.17487/RFC7620, August
              2015.

   [RFC7768]  Tsou, T., Li, W., Taylor, T., and J. Huang, "Port
              Management to Reduce Logging in Large-Scale NATs",
              RFC 7768, DOI 10.17487/RFC7768, January 2016.

Appendix A.  Support for Source Port Logging in Various Server Software

   The table below enumerates the findings of a best-effort,
   open-source review of the documentation of the various products.
   Where it is indicated that logging source port is not possible,
   either (a) no reference was identified in online documentation
   indicating how source port logging can be enabled, or (b) a
   reference was found positively indicating that logging of source
   port is not possible.

   +----------+------------+------------+----------+----------+---------+
   | Category | Server     | Version    | Possible | Feasible | Default |
   +----------+------------+------------+----------+----------+---------+
   | HTTP     | Apache     | 2.4.25     | Yes      | Yes      | No      |
   |          | HTTPD      |            |          |          |         |
   | HTTP     | IIS        | 10         | Yes      | Yes      | No      |
   | HTTP     | Tomcat     | 8.5.15     | Yes      | Yes      | No      |
   | HTTP     | Squid      | 3.5.25     | Yes      | Yes      | No      |
   | HTTP     | nginx      | 1.12.0     | Yes      | Yes      | No      |
   | Mail     | sendmail   | 8.15.2     | Yes      | Yes      | No      |
   | Mail     | Microsoft  | 2016       | Yes      | No       | No      |
   |          | Exchange   |            |          |          |         |
   |          | Server     |            |          |          |         |
   | Mail     | Postfix    | 2.10.0     | Yes      | Yes      | No      |
   | Mail     | Exim       | 4.89       | Yes      | Yes      | No      |
   | Mail     | Dovecot    | 2.2.30.1   | Yes      | Yes      | No      |
   | Mail     | UW IMAP    | imap-2007f | No       | No       | No      |
   | DBase    | Oracle     | 12.2.0.1   | No       | No       | No      |
   | DBase    | MySQL      | 5.7.18     | No       | No       | No      |
   | DBase    | Microsoft  | 2016       | Yes      | No       | No      |
   |          | SQL Server |            |          |          |         |
   | DBase    | PostgreSQL | 9.6.3      | Yes      | Yes      | No      |
   | SSH      | OpenSSHD   | 7.5        | Yes      | Yes      | Yes     |
   +----------+------------+------------+----------+----------+---------+

          Table 2: Support for Logging Incoming Source Port

Author's Address

   David O'Reilly
   Ireland

   Email: rfc@daveor.com