| < draft-irtf-pearg-safe-internet-measurement-00.txt | draft-irtf-pearg-safe-internet-measurement-01.txt > | |||
|---|---|---|---|---|
| Network Working Group I. Learmonth | Network Working Group I. Learmonth | |||
| Internet-Draft Tor Project | Internet-Draft Tor Project | |||
| Intended status: Informational July 7, 2019 | Intended status: Informational July 8, 2019 | |||
| Expires: January 8, 2020 | Expires: January 9, 2020 | |||
| Guidelines for Performing Safe Measurement on the Internet | Guidelines for Performing Safe Measurement on the Internet | |||
| draft-irtf-pearg-safe-internet-measurement-00 | draft-irtf-pearg-safe-internet-measurement-01 | |||
| Abstract | Abstract | |||
| Researchers from industry and academia often use Internet | Researchers from industry and academia often use Internet | |||
| measurements as part of their work. While these measurements can | measurements as part of their work. While these measurements can | |||
| give insight into the functioning and usage of the Internet, they can | give insight into the functioning and usage of the Internet, they can | |||
| come at the cost of user privacy. This document describes guidelines | come at the cost of user privacy. This document describes guidelines | |||
| for ensuring that such measurements can be carried out safely. | for ensuring that such measurements can be carried out safely. | |||
| Note | Note | |||
| skipping to change at page 1, line 43 ¶ | skipping to change at page 1, line 43 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on January 8, 2020. | This Internet-Draft will expire on January 9, 2020. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2019 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 5, line 20 ¶ | skipping to change at page 5, line 20 ¶ | |||
| new behaviour to any user should be considered appropriate if some | new behaviour to any user should be considered appropriate if some | |||
| users are to remain with the old behaviour. | users are to remain with the old behaviour. | |||
| In the event that something does go wrong with the update, it should | In the event that something does go wrong with the update, it should | |||
| be easy for a user to discover that they have been part of an | be easy for a user to discover that they have been part of an | |||
| experiment and roll back the change, allowing for explicit refusal of | experiment and roll back the change, allowing for explicit refusal of | |||
| consent to override the presumed implied consent. | consent to override the presumed implied consent. | |||
| 3. Safety Considerations | 3. Safety Considerations | |||
| 3.1. Use a testbed | 3.1. Isolate risk with a dedicated testbed | |||
| Wherever possible, use a testbed. An isolated network means that | Wherever possible, use a testbed. An isolated network means that | |||
| there are no other users sharing the infrastructure you are using for | there are no other users sharing the infrastructure you are using for | |||
| your experiments. | your experiments. | |||
| When measuring performance, competing traffic can have negative | When measuring performance, competing traffic can have negative | |||
| effects on the performance of your test traffic and so the testbed | effects on the performance of your test traffic and so the testbed | |||
| approach can also produce more accurate and repeatable results than | approach can also produce more accurate and repeatable results than | |||
| experiments using the public Internet. | experiments using the public Internet. | |||
| WAN link conditions can be emulated through artificial delays and/or | WAN link conditions can be emulated through artificial delays and/or | |||
| packet loss using a tool like [netem]. Competing traffic can also be | packet loss using a tool like [netem]. Competing traffic can also be | |||
| emulated using traffic generators. | emulated using traffic generators. | |||
| 3.2. Only record your own traffic | 3.2. Be respectful of others' infrastructure | |||
| When performing active measurements be sure to only capture traffic | ||||
| that you have generated. Traffic may be identified by IP ranges or | ||||
| by some token that is unlikely to be used by other users. | ||||
| Again, this can help to improve the accuracy and repeatability of | ||||
| your experiment. [RFC2544], for performance benchmarking, requires | ||||
| that any frames received that were not part of the test traffic are | ||||
| discarded and not counted in the results. | ||||
| 3.3. Be respectful of others' infrastructure | ||||
| If your experiment is designed to trigger a response from | If your experiment is designed to trigger a response from | |||
| infrastructure that is not your own, consider what the negative | infrastructure that is not your own, consider what the negative | |||
| consequences of that may be. At the very least your experiment will | consequences of that may be. At the very least your experiment will | |||
| consume bandwidth that may have to be paid for. | consume bandwidth that may have to be paid for. | |||
| In more extreme circumstances, you could cause traffic to be | In more extreme circumstances, you could cause traffic to be | |||
| generated that causes legal trouble for the owner of that | generated that causes legal trouble for the owner of that | |||
| infrastructure. The Internet is a global network crossing many legal | infrastructure. The Internet is a global network crossing many legal | |||
| jurisdictions and so what may be legal for you is not necessarily | jurisdictions and so what may be legal for you is not necessarily | |||
| legal for everyone. | legal for everyone. | |||
| If you are sending a lot of traffic quickly, or otherwise generally | If you are sending a lot of traffic quickly, or otherwise generally | |||
| deviate from typical client behaviour, a network may identify this as | deviate from typical client behaviour, a network may identify this as | |||
| an attack, which means that you will not be collecting results that | an attack, which means that you will not be collecting results that | |||
| are representative of what a typical client would see. | are representative of what a typical client would see. | |||
| 3.3.1. Maintain a "Do Not Scan" list | 3.2.1. Maintain a "Do Not Scan" list | |||
| When performing active measurements on a shared network, maintain a | When performing active measurements on a shared network, maintain a | |||
| list of hosts that you will never scan regardless of whether they | list of hosts that you will never scan regardless of whether they | |||
| appear in your target lists. When developing tools for performing | appear in your target lists. When developing tools for performing | |||
| active measurement, or traffic generation for use in a larger | active measurement, or traffic generation for use in a larger | |||
| measurement system, ensure that the tool will support the use of a | measurement system, ensure that the tool will support the use of a | |||
| "Do Not Scan" list. | "Do Not Scan" list. | |||
| If complaints are made that request you do not generate traffic | If complaints are made that request you do not generate traffic | |||
| towards a host or network, you must add that host or network to your | towards a host or network, you must add that host or network to your | |||
| skipping to change at page 6, line 43 ¶ | skipping to change at page 6, line 32 ¶ | |||
| you plan to share the reasoning when publishing your measurement | you plan to share the reasoning when publishing your measurement | |||
| results, e.g. in an academic paper, you must seek consent for this | results, e.g. in an academic paper, you must seek consent for this | |||
| from the requester. | from the requester. | |||
| Be aware that in publishing your measurement results, it may be | Be aware that in publishing your measurement results, it may be | |||
| possible to infer your "Do Not Scan" list from those results. For | possible to infer your "Do Not Scan" list from those results. For | |||
| example, if you measured a well-known list of popular websites then | example, if you measured a well-known list of popular websites then | |||
| it would be possible to correlate the results with that list to | it would be possible to correlate the results with that list to | |||
| determine which are missing. | determine which are missing. | |||
| 3.4. Only collect data that is safe to make public | 3.3. Data Minimization | |||
| When collecting, using, disclosing, and storing data from a | ||||
| measurement, use only the minimal data necessary to perform a task. | ||||
| Reducing the amount of data reduces the amount of data that can be | ||||
| misused or leaked. | ||||
| When deciding on the data to collect, assume that any data collected | When deciding on the data to collect, assume that any data collected | |||
| might become public. There are many ways that this could happen, | might be disclosed. There are many ways that this could happen, | |||
| through operational security mistakes or compulsion by a judicial | through operational security mistakes or compulsion by a judicial | |||
| system. | system. | |||
| 3.5. Minimization | When directly instrumenting a protocol to provide metrics to a | |||
| passive observer, see section 6.1 of RFC6973 [RFC6973] for data | ||||
| minimization considerations specific to this use case. | ||||
| For all data collected, consider whether or not it is really needed. | 3.3.1. Discarding Data | |||
| 3.6. Aggregation | XXX: Discard data that is not required to perform the task. | |||
| When performing active measurements be sure to only capture traffic | ||||
| that you have generated. Traffic may be identified by IP ranges or | ||||
| by some token that is unlikely to be used by other users. | ||||
| Again, this can help to improve the accuracy and repeatability of | ||||
| your experiment. [RFC2544], for performance benchmarking, requires | ||||
| that any frames received that were not part of the test traffic are | ||||
| discarded and not counted in the results. | ||||
| 3.3.2. Masking Data | ||||
| XXX: Mask data that is not required to perform the task. | ||||
| Masking is particularly useful for the content of traffic: record | ||||
| whether a particular class of content was present, or the length of | ||||
| the content, without recording the content itself. Content can also | ||||
| be replaced with tokens or encrypted. | ||||
| 3.3.3. Reduce Accuracy | ||||
| XXX: Binning, categorizing, geoip, noise. | ||||
| 3.3.4. Data Aggregation | ||||
| When collecting data, consider if the granularity can be limited by | When collecting data, consider if the granularity can be limited by | |||
| using bins or adding noise. XXX: Differential privacy. | using bins or adding noise. XXX: Differential privacy. | |||
| 3.7. Source Aggregation | XXX: Do this at the source, definitely do it before you write to | |||
| disk. | ||||
| Do this at the source, definitely do it before you write to disk. | ||||
| [Tor.2017-04-001] presents a case-study on the in-memory statistics | [Tor.2017-04-001] presents a case-study on the in-memory statistics | |||
| in the software used by the Tor network, as an example. | in the software used by the Tor network, as an example. | |||
| 4. Risk Analysis | 4. Risk Analysis | |||
| The benefits should outweigh the risks. Consider auxiliary data | The benefits should outweigh the risks. Consider auxiliary data | |||
| (e.g. third-party data sets) when assessing the risks. | (e.g. third-party data sets) when assessing the risks. | |||
| 5. Security Considerations | 5. Security Considerations | |||
| skipping to change at page 8, line 31 ¶ | skipping to change at page 8, line 49 ¶ | |||
| [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for | [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for | |||
| Network Interconnect Devices", RFC 2544, | Network Interconnect Devices", RFC 2544, | |||
| DOI 10.17487/RFC2544, March 1999, | DOI 10.17487/RFC2544, March 1999, | |||
| <https://www.rfc-editor.org/info/rfc2544>. | <https://www.rfc-editor.org/info/rfc2544>. | |||
| [RFC6349] Constantine, B., Forget, G., Geib, R., and R. Schrage, | [RFC6349] Constantine, B., Forget, G., Geib, R., and R. Schrage, | |||
| "Framework for TCP Throughput Testing", RFC 6349, | "Framework for TCP Throughput Testing", RFC 6349, | |||
| DOI 10.17487/RFC6349, August 2011, | DOI 10.17487/RFC6349, August 2011, | |||
| <https://www.rfc-editor.org/info/rfc6349>. | <https://www.rfc-editor.org/info/rfc6349>. | |||
| [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., | ||||
| Morris, J., Hansen, M., and R. Smith, "Privacy | ||||
| Considerations for Internet Protocols", RFC 6973, July | ||||
| 2013, <https://www.rfc-editor.org/info/rfc6973>. | ||||
| [Tor.2017-04-001] | [Tor.2017-04-001] | |||
| Herm, K., "Privacy analysis of Tor's in-memory | Herm, K., "Privacy analysis of Tor's in-memory | |||
| statistics", Tor Tech Report 2017-04-001, April 2017, | statistics", Tor Tech Report 2017-04-001, April 2017, | |||
| <https://research.torproject.org/techreports/ | <https://research.torproject.org/techreports/ | |||
| privacy-in-memory-2017-04-28.pdf>. | privacy-in-memory-2017-04-28.pdf>. | |||
| [TorSafetyBoard] | [TorSafetyBoard] | |||
| Tor Project, "Tor Research Safety Board", | Tor Project, "Tor Research Safety Board", | |||
| <https://research.torproject.org/safetyboard/>. | <https://research.torproject.org/safetyboard/>. | |||
| End of changes. 13 change blocks. | ||||
| 26 lines changed or deleted | 49 lines changed or added | |||
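
The "Do Not Scan" guidance in section 3.2.1 asks that measurement tools support an exclusion list that always takes precedence over the target list. The following is a minimal sketch of one way a tool might do that, not a method taken from the draft. The function names `load_do_not_scan` and `filter_targets`, the one-entry-per-line list format, and the use of `#` for comments are all assumptions made for illustration.

```python
import ipaddress


def load_do_not_scan(lines):
    """Parse host and network entries, skipping blank lines and '#' comments.

    The one-entry-per-line format is an assumption made for this sketch.
    """
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # strict=False lets a bare host address be stored as a single-address network
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def filter_targets(targets, do_not_scan):
    """Return only the targets that fall outside every excluded network."""
    allowed = []
    for target in targets:
        address = ipaddress.ip_address(target)
        excluded = any(
            address.version == network.version and address in network
            for network in do_not_scan
        )
        if not excluded:
            allowed.append(target)
    return allowed
```

Applying the filter immediately before traffic is generated, rather than when the target list is first built, means an entry added in response to a complaint takes effect on the next run even if the target list itself is unchanged.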
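
Section 3.3.4 suggests limiting granularity by using bins or adding noise, and doing so before anything is written to disk. The sketch below shows one possible shape for that step applied to a single counter. It is an illustration under stated assumptions, not the draft's or Tor's procedure: the function names, the choice of Laplace noise, and the `bin_size` and `scale` parameters are hypothetical, and choosing `scale` to meet a formal differential privacy guarantee is outside the scope of the sketch.

```python
import math
import random


def bin_count(value, bin_size):
    """Round a non-negative count up to the next multiple of bin_size."""
    return math.ceil(value / bin_size) * bin_size


def laplace_noise(scale, rng=random):
    """Draw Laplace(0, scale) noise as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def obfuscate_count(value, bin_size, scale, rng=random):
    """Add noise to an in-memory count, clamp at zero, then bin it.

    Only the returned value should be written to disk or reported.
    """
    noisy = value + laplace_noise(scale, rng)
    return bin_count(max(noisy, 0), bin_size)
```

A collector might keep the exact counter in memory for the measurement interval, call `obfuscate_count` once at the end of the interval, store only the result, and then reset the counter.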