| < draft-ietf-ipfix-anon-05.txt | draft-ietf-ipfix-anon-06.txt > | |||
|---|---|---|---|---|
| IPFIX Working Group E. Boschi | IPFIX Working Group E. Boschi | |||
| Internet-Draft B. Trammell | Internet-Draft B. Trammell | |||
| Intended status: Experimental ETH Zurich | Intended status: Experimental ETH Zurich | |||
| Expires: April 23, 2011 October 20, 2010 | Expires: July 23, 2011 January 19, 2011 | |||
| IP Flow Anonymisation Support | IP Flow Anonymization Support | |||
| draft-ietf-ipfix-anon-05.txt | draft-ietf-ipfix-anon-06.txt | |||
| Abstract | Abstract | |||
| This document describes anonymisation techniques for IP flow data and | This document describes anonymization techniques for IP flow data and | |||
| the export of anonymised data using the IPFIX protocol. It | the export of anonymized data using the IPFIX protocol. It | |||
| categorizes common anonymisation schemes and defines the parameters | categorizes common anonymization schemes and defines the parameters | |||
| needed to describe them. It provides guidelines for the | needed to describe them. It provides guidelines for the | |||
| implementation of anonymised data export and storage over IPFIX, and | implementation of anonymized data export and storage over IPFIX, and | |||
| describes an information model and Options-based method for | describes an information model and Options-based method for | |||
| anonymisation metadata export within the IPFIX protocol or storage in | anonymization metadata export within the IPFIX protocol or storage in | |||
| IPFIX Files. | IPFIX Files. | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on April 23, 2011. | This Internet-Draft will expire on July 23, 2011. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2010 IETF Trust and the persons identified as the | Copyright (c) 2011 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.1. IPFIX Protocol Overview . . . . . . . . . . . . . . . . . 4 | 1.1. IPFIX Protocol Overview . . . . . . . . . . . . . . . . . 4 | |||
| 1.2. IPFIX Documents Overview . . . . . . . . . . . . . . . . . 5 | 1.2. IPFIX Documents Overview . . . . . . . . . . . . . . . . . 5 | |||
| 1.3. Anonymisation within the IPFIX Architecture . . . . . . . 5 | 1.3. Anonymization within the IPFIX Architecture . . . . . . . 5 | |||
| | 1.4. Supporting Experimentation with Anonymization . . . . . . 6 | |||
| 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 3. Categorisation of Anonymisation Techniques . . . . . . . . . . 6 | 3. Categorization of Anonymization Techniques . . . . . . . . . . 7 | |||
| 4. Anonymisation of IP Flow Data . . . . . . . . . . . . . . . . 8 | 4. Anonymization of IP Flow Data . . . . . . . . . . . . . . . . 8 | |||
| 4.1. IP Address Anonymisation . . . . . . . . . . . . . . . . . 9 | 4.1. IP Address Anonymization . . . . . . . . . . . . . . . . . 10 | |||
| 4.1.1. Truncation . . . . . . . . . . . . . . . . . . . . . . 9 | 4.1.1. Truncation . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 4.1.2. Reverse Truncation . . . . . . . . . . . . . . . . . . 10 | 4.1.2. Reverse Truncation . . . . . . . . . . . . . . . . . . 11 | |||
| 4.1.3. Permutation . . . . . . . . . . . . . . . . . . . . . 10 | 4.1.3. Permutation . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 4.1.4. Prefix-preserving Pseudonymisation . . . . . . . . . . 11 | 4.1.4. Prefix-preserving Pseudonymization . . . . . . . . . . 12 | |||
| 4.2. MAC Address Anonymisation . . . . . . . . . . . . . . . . 11 | 4.2. MAC Address Anonymization . . . . . . . . . . . . . . . . 12 | |||
| 4.2.1. Truncation . . . . . . . . . . . . . . . . . . . . . . 12 | 4.2.1. Truncation . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 4.2.2. Reverse Truncation . . . . . . . . . . . . . . . . . . 12 | 4.2.2. Reverse Truncation . . . . . . . . . . . . . . . . . . 13 | |||
| 4.2.3. Permutation . . . . . . . . . . . . . . . . . . . . . 13 | 4.2.3. Permutation . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 4.2.4. Structured Pseudonymisation . . . . . . . . . . . . . 13 | 4.2.4. Structured Pseudonymization . . . . . . . . . . . . . 14 | |||
| 4.3. Timestamp Anonymisation . . . . . . . . . . . . . . . . . 13 | 4.3. Timestamp Anonymization . . . . . . . . . . . . . . . . . 15 | |||
| 4.3.1. Precision Degradation . . . . . . . . . . . . . . . . 14 | 4.3.1. Precision Degradation . . . . . . . . . . . . . . . . 15 | |||
| 4.3.2. Enumeration . . . . . . . . . . . . . . . . . . . . . 14 | 4.3.2. Enumeration . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 4.3.3. Random Shifts . . . . . . . . . . . . . . . . . . . . 15 | 4.3.3. Random Shifts . . . . . . . . . . . . . . . . . . . . 16 | |||
| 4.4. Counter Anonymisation . . . . . . . . . . . . . . . . . . 15 | 4.4. Counter Anonymization . . . . . . . . . . . . . . . . . . 16 | |||
| 4.4.1. Precision Degradation . . . . . . . . . . . . . . . . 15 | 4.4.1. Precision Degradation . . . . . . . . . . . . . . . . 17 | |||
| 4.4.2. Binning . . . . . . . . . . . . . . . . . . . . . . . 15 | 4.4.2. Binning . . . . . . . . . . . . . . . . . . . . . . . 17 | |||
| 4.4.3. Random Noise Addition . . . . . . . . . . . . . . . . 16 | 4.4.3. Random Noise Addition . . . . . . . . . . . . . . . . 17 | |||
| 4.5. Anonymisation of Other Flow Fields . . . . . . . . . . . . 16 | 4.5. Anonymization of Other Flow Fields . . . . . . . . . . . . 17 | |||
| 4.5.1. Binning . . . . . . . . . . . . . . . . . . . . . . . 16 | 4.5.1. Binning . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 4.5.2. Permutation . . . . . . . . . . . . . . . . . . . . . 17 | 4.5.2. Permutation . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 5. Parameters for the Description of Anonymisation Techniques . . 17 | 5. Parameters for the Description of Anonymization Techniques . . 18 | |||
| 5.1. Stability . . . . . . . . . . . . . . . . . . . . . . . . 17 | 5.1. Stability . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
| 5.2. Truncation Length . . . . . . . . . . . . . . . . . . . . 18 | 5.2. Truncation Length . . . . . . . . . . . . . . . . . . . . 19 | |||
| 5.3. Bin Map . . . . . . . . . . . . . . . . . . . . . . . . . 18 | 5.3. Bin Map . . . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 5.4. Permutation . . . . . . . . . . . . . . . . . . . . . . . 18 | 5.4. Permutation . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 5.5. Shift Amount . . . . . . . . . . . . . . . . . . . . . . . 19 | 5.5. Shift Amount . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 6. Anonymisation Export Support in IPFIX . . . . . . . . . . . . 19 | 6. Anonymization Export Support in IPFIX . . . . . . . . . . . . 20 | |||
| 6.1. Anonymisation Records and the Anonymisation Options | 6.1. Anonymization Records and the Anonymization Options | |||
| Template . . . . . . . . . . . . . . . . . . . . . . . . . 19 | Template . . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 6.2. Recommended Information Elements for Anonymisation | 6.2. Recommended Information Elements for Anonymization | |||
| Metadata . . . . . . . . . . . . . . . . . . . . . . . . . 21 | Metadata . . . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
| 6.2.1. informationElementIndex . . . . . . . . . . . . . . . 21 | 6.2.1. informationElementIndex . . . . . . . . . . . . . . . 23 | |||
| 6.2.2. anonymisationTechnique . . . . . . . . . . . . . . . . 22 | 6.2.2. anonymizationTechnique . . . . . . . . . . . . . . . . 23 | |||
| 6.2.3. anonymisationFlags . . . . . . . . . . . . . . . . . . 23 | 6.2.3. anonymizationFlags . . . . . . . . . . . . . . . . . . 25 | |||
| 7. Applying Anonymisation Techniques to IPFIX Export and | 7. Applying Anonymization Techniques to IPFIX Export and | |||
| Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 | Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 | |||
| 7.1. Arrangement of Processes in IPFIX Anonymisation . . . . . 26 | 7.1. Arrangement of Processes in IPFIX Anonymization . . . . . 28 | |||
| 7.2. IPFIX-Specific Anonymisation Guidelines . . . . . . . . . 28 | 7.2. IPFIX-Specific Anonymization Guidelines . . . . . . . . . 30 | |||
| 7.2.1. Appropriate Use of Information Elements for | 7.2.1. Appropriate Use of Information Elements for | |||
| Anonymised Data . . . . . . . . . . . . . . . . . . . 28 | Anonymized Data . . . . . . . . . . . . . . . . . . . 30 | |||
| 7.2.2. Export of Perimeter-Based Anonymisation Policies . . . 29 | 7.2.2. Export of Perimeter-Based Anonymization Policies . . . 31 | |||
| 7.2.3. Anonymisation of Header Data . . . . . . . . . . . . . 30 | 7.2.3. Anonymization of Header Data . . . . . . . . . . . . . 32 | |||
| 7.2.4. Anonymisation of Options Data . . . . . . . . . . . . 30 | 7.2.4. Anonymization of Options Data . . . . . . . . . . . . 32 | |||
| 7.2.5. Special-Use Address Space Considerations . . . . . . . 32 | 7.2.5. Special-Use Address Space Considerations . . . . . . . 34 | |||
| 7.2.6. Protecting Out-of-Band Configuration and | 7.2.6. Protecting Out-of-Band Configuration and | |||
| Management Data . . . . . . . . . . . . . . . . . . . 32 | Management Data . . . . . . . . . . . . . . . . . . . 34 | |||
| 8. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 | 8. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 | |||
| 9. Security Considerations . . . . . . . . . . . . . . . . . . . 37 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 39 | |||
| 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 38 | 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 41 | |||
| 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 39 | 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 41 | |||
| 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 39 | 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 41 | |||
| 12.1. Normative References . . . . . . . . . . . . . . . . . . . 39 | 12.1. Normative References . . . . . . . . . . . . . . . . . . . 41 | |||
| 12.2. Informative References . . . . . . . . . . . . . . . . . . 40 | 12.2. Informative References . . . . . . . . . . . . . . . . . . 42 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 40 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 43 | |||
| 1. Introduction | 1. Introduction | |||
| The standardisation of an IP flow information export protocol | The standardization of an IP flow information export protocol | |||
| [RFC5101] and associated representations removes a technical barrier | [RFC5101] and associated representations removes a technical barrier | |||
| to the sharing of IP flow data across organizational boundaries and | to the sharing of IP flow data across organizational boundaries and | |||
| with network operations, security, and research communities for a | with network operations, security, and research communities for a | |||
| wide variety of purposes. However, with wider dissemination come | wide variety of purposes. However, with wider dissemination come | |||
| greater risks to the privacy of the users of networks under | greater risks to the privacy of the users of networks under | |||
| measurement, and to the security of those networks. While it is not | measurement, and to the security of those networks. While it is not | |||
| a complete solution to the issues posed by distribution of IP flow | a complete solution to the issues posed by distribution of IP flow | |||
| information, anonymisation (i.e., the deletion or transformation of | information, anonymization (i.e., the deletion or transformation of | |||
| information that is considered sensitive and could be used to reveal | information that is considered sensitive and could be used to reveal | |||
| the identity of subjects involved in a communication) is an important | the identity of subjects involved in a communication) is an important | |||
| tool for the protection of privacy within network measurement | tool for the protection of privacy within network measurement | |||
| infrastructures. | infrastructures. | |||
| This document presents a mechanism for representing anonymised data | This document presents a mechanism for representing anonymized data | |||
| within IPFIX and guidelines for using it. It begins with a | within IPFIX and guidelines for using it. It is not intended as a | |||
| categorization of anonymisation techniques. It then describes | general statement on the applicability of specific flow data | |||
| applicability of each technique to commonly anonymisable fields of IP | anonymization techniques to specific situations, or as a | |||
| flow data, organized by information element data type and semantics | recommendation of any particular application of anonymization to flow | |||
| as in [RFC5102]; enumerates the parameters required by each of the | data export. Exporters or publishers of anonymized data must take | |||
| applicable anonymisation techniques; and provides guidelines for the | care that the applied anonymization technique is appropriate for the | |||
| use of each of these techniques in accordance with best practices in | data source, the purpose, and the risk of deanonymization of a given | |||
| data protection. Finally, it specifies a mechanism for exporting | application. | |||
| anonymised data and binding anonymisation metadata to Templates and | ||||
| Options Templates using IPFIX Options. | It begins with a categorization of anonymization techniques. It then | |||
| | describes applicability of each technique to commonly anonymizable | |||
| | fields of IP flow data, organized by information element data type | |||
| | and semantics as in [RFC5102]; enumerates the parameters required by | |||
| | each of the applicable anonymization techniques; and provides | |||
| | guidelines for the use of each of these techniques in accordance with | |||
| | current best practices in data protection. Finally, it specifies a | |||
| | mechanism for exporting anonymized data and binding anonymization | |||
| | metadata to Templates and Options Templates using IPFIX Options. | |||
| 1.1. IPFIX Protocol Overview | 1.1. IPFIX Protocol Overview | |||
| In the IPFIX protocol, { type, length, value } tuples are expressed | In the IPFIX protocol, { type, length, value } tuples are expressed | |||
| in Templates containing { type, length } pairs, specifying which { | in Templates containing { type, length } pairs, specifying which { | |||
| value } fields are present in data records conforming to the | value } fields are present in data records conforming to the | |||
| Template, giving great flexibility as to what data is transmitted. | Template, giving great flexibility as to what data is transmitted. | |||
| Since Templates are sent very infrequently compared with Data | Since Templates are sent very infrequently compared with Data | |||
| Records, this results in significant bandwidth savings. Various | Records, this results in significant bandwidth savings. Various | |||
| different data formats may be transmitted simply by sending new | different data formats may be transmitted simply by sending new | |||
| skipping to change at page 5, line 36 | skipping to change at page 5, line 43 | |||
| applications of the IPFIX protocol and their use of information | applications of the IPFIX protocol and their use of information | |||
| exported via IPFIX, and relates the IPFIX architecture to other | exported via IPFIX, and relates the IPFIX architecture to other | |||
| measurement architectures and frameworks. | measurement architectures and frameworks. | |||
| Additionally, "Specification of the IPFIX File Format" [RFC5655] | Additionally, "Specification of the IPFIX File Format" [RFC5655] | |||
| describes a file format based upon the IPFIX Protocol for the storage | describes a file format based upon the IPFIX Protocol for the storage | |||
| of flow data. | of flow data. | |||
| This document references the Protocol and Architecture documents for | This document references the Protocol and Architecture documents for | |||
| terminology, and extends the IPFIX Information Model to provide new | terminology, and extends the IPFIX Information Model to provide new | |||
| Information Elements for anonymisation metadata. The anonymisation | Information Elements for anonymization metadata. The anonymization | |||
| techniques described herein are equally applicable to the IPFIX | techniques described herein are equally applicable to the IPFIX | |||
| Protocol and data stored in IPFIX Files. | Protocol and data stored in IPFIX Files. | |||
| 1.3. Anonymisation within the IPFIX Architecture | 1.3. Anonymization within the IPFIX Architecture | |||
| According to [RFC5470], IPFIX Message anonymisation is optionally | According to [RFC5470], IPFIX Message anonymization is optionally | |||
| performed as the final operation before handing the Message to the | performed as the final operation before handing the Message to the | |||
| transport protocol for export. While no provision is made in the | transport protocol for export. While no provision is made in the | |||
| architecture for anonymisation metadata as in Section 6, this | architecture for anonymization metadata as in Section 6, this | |||
| arrangement does allow for the rewriting necessary for comprehensive | arrangement does allow for the rewriting necessary for comprehensive | |||
| anonymisation of IPFIX export as in Section 7. The development of | anonymization of IPFIX export as in Section 7. The development of | |||
| the IPFIX Mediation [I-D.ietf-ipfix-mediators-framework] framework | the IPFIX Mediation [I-D.ietf-ipfix-mediators-framework] framework | |||
| and the IPFIX File Format [RFC5655] expand upon this initial | and the IPFIX File Format [RFC5655] expand upon this initial | |||
| architectural allowance for anonymisation by adding to the list of | architectural allowance for anonymization by adding to the list of | |||
| places that anonymisation may be applied. The former specifies IPFIX | places that anonymization may be applied. The former specifies IPFIX | |||
| Mediators, which rewrite existing IPFIX Messages, and the latter | Mediators, which rewrite existing IPFIX Messages, and the latter | |||
| specifies a method for storage of IPFIX data in files. | specifies a method for storage of IPFIX data in files. | |||
| More detail on the applicable architectural arrangements for | More detail on the applicable architectural arrangements for | |||
| anonymisation can be found in Section 7.1. | anonymization can be found in Section 7.1. | |||
| | 1.4. Supporting Experimentation with Anonymization | |||
| | The intended status of this document is Experimental, reflecting the | |||
| | experimental nature of anonymization export support. Research on | |||
| | network trace anonymization techniques and attacks against them is | |||
| | ongoing. Indeed, there is increasing evidence that anonymization | |||
| | applied to network trace or flow data on its own is insufficient for | |||
| | many data protection applications as in [Bur10]. Therefore, this | |||
| | document explicitly does not recommend any particular technique or | |||
| | implementation thereof. | |||
| | The intention of this document is to provide a common basis for | |||
| | interoperable exchange of anonymized data, furthering research in | |||
| | this area, both on anonymization techniques themselves and on | |||
| | the application of anonymized data to network measurement. To that | |||
| | end, the classification in Section 3 and anonymization export support | |||
| | in Section 6 can be used to describe and export information even | |||
| | about data anonymized using techniques that are unacceptably weak for | |||
| | general application to production data sets on their own. | |||
| | While the specification herein is designed to be implementation- and | |||
| | technique-independent, open research in this area may necessitate | |||
| | future updates to the specification. Assuming the future successful | |||
| | application of this specification to anonymized data publication and | |||
| | exchange, it may be brought back to the IPFIX working group for | |||
| | further development and publication on the standards track. | |||
| 2. Terminology | 2. Terminology | |||
| Terms used in this document that are defined in the Terminology | Terms used in this document that are defined in the Terminology | |||
| section of the IPFIX Protocol [RFC5101] document are to be | section of the IPFIX Protocol [RFC5101] document are to be | |||
| interpreted as defined there. In addition, this document defines the | interpreted as defined there. In addition, this document defines the | |||
| following terms: | following terms: | |||
| Anonymisation Record: A record, defined by the Anonymisation | Anonymization Record: A record, defined by the Anonymization | |||
| Options Template in Section 6.1, that defines the | Options Template in Section 6.1, that defines the | |||
| properties of the anonymisation applied to a single Information | properties of the anonymization applied to a single Information | |||
| Element within a single Template or Options Template. | Element within a single Template or Options Template. | |||
| Anonymised Data Record: A Data Record within a Data Set containing | Anonymized Data Record: A Data Record within a Data Set containing | |||
| at least one Information Element with anonymised values. The | at least one Information Element with anonymized values. The | |||
| Information Element(s) within the Template or Options Template | Information Element(s) within the Template or Options Template | |||
| describing this Data Record SHOULD have a corresponding | describing this Data Record SHOULD have a corresponding | |||
| Anonymisation Record. | Anonymization Record. | |||
| Intermediate Anonymisation Process: An intermediate process which | Intermediate Anonymization Process: An intermediate process which | |||
| takes Data Records and transforms them into Anonymised Data | takes Data Records and transforms them into Anonymized Data | |||
| Records. | Records. | |||
| Note that there is an explicit difference in this document between a | Note that there is an explicit difference in this document between a | |||
| "Data Set" (which is defined as in [RFC5101]) and a "data set". When | "Data Set" (which is defined as in [RFC5101]) and a "data set". When | |||
| in lower case, this term refers to any collection of data (usually, | in lower case, this term refers to any collection of data (usually, | |||
| within the context of this document, flow or packet data) which may | within the context of this document, flow or packet data) which may | |||
| contain identifying information and is therefore subject to | contain identifying information and is therefore subject to | |||
| anonymisation. | anonymization. | |||
| Note also that when the term Template is used in this document, | Note also that when the term Template is used in this document, | |||
| unless otherwise noted, it applies both to Templates and Options | unless otherwise noted, it applies both to Templates and Options | |||
| Templates as defined in [RFC5101]. Specifically, Anonymisation | Templates as defined in [RFC5101]. Specifically, Anonymization | |||
| Records may apply to both Templates and Options Templates. | Records may apply to both Templates and Options Templates. | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
| 3. Categorisation of Anonymisation Techniques | 3. Categorization of Anonymization Techniques | |||
| Anonymisation modifies a data set in order to protect the identity of | Anonymization, as described by this document, is the modification of | |||
| the people or entities described by the data set from disclosure. | a data set in order to protect the identity of the people or entities | |||
| With respect to network traffic data, anonymisation generally | described by the data set from disclosure. With respect to network | |||
| attempts to preserve some set of properties of the network traffic | traffic data, anonymization generally attempts to preserve some set | |||
| useful for a given application or applications, while ensuring the | of properties of the network traffic useful for a given application | |||
| data cannot be traced back to the specific networks, hosts, or users | or applications, while ensuring the data cannot be traced back to the | |||
| generating the traffic. | specific networks, hosts, or users generating the traffic. | |||
| Anonymisation may be broadly classified according to two properties: | Anonymization may be broadly classified according to two properties: | |||
| recoverability and countability. All anonymisation techniques map | recoverability and countability. All anonymization techniques map | |||
| the real space of identifiers or values into a separate, anonymised | the real space of identifiers or values into a separate, anonymized | |||
| space, according to some function. A technique is said to be | space, according to some function. A technique is said to be | |||
| recoverable when the function used is invertible or can otherwise be | recoverable when the function used is invertible or can otherwise be | |||
| reversed and a real identifier can be recovered from a given | reversed and a real identifier can be recovered from a given | |||
| replacement identifier. | replacement identifier. Techniques wherein the function used can | |||
| | only be reversed using additional information, such as an encryption | |||
| | key, or knowledge of injected traffic within the data set, are not | |||
| | considered recoverable; "recoverability" as used within this | |||
| | categorization does not refer to recoverability under attack. | |||
| Countability compares the dimension of the anonymised space (N) to | Countability compares the dimension of the anonymized space (N) to | |||
| the dimension of the real space (M), and denotes how the count of | the dimension of the real space (M), and denotes how the count of | |||
| unique values is preserved by the anonymisation function. If the | unique values is preserved by the anonymization function. If the | |||
| anonymised space is smaller than the real space, then the function is | anonymized space is smaller than the real space, then the function is | |||
| said to generalise the input, mapping more than one input point to | said to generalize the input, mapping more than one input point to | |||
| each anonymous value (e.g., as with aggregation). By definition, | each anonymous value (e.g., as with aggregation). By definition, | |||
| generalisation is not recoverable. | generalization is not recoverable. | |||
| If the dimensions of the anonymised and real spaces are the same, | If the dimensions of the anonymized and real spaces are the same, | |||
| such that the count of unique values is preserved, then the function | such that the count of unique values is preserved, then the function | |||
| is said to be a direct substitution function. If the dimension of | is said to be a direct substitution function. If the dimension of | |||
| the anonymised space is larger, such that each real value maps to a | the anonymized space is larger, such that each real value maps to a | |||
| set of anonymised values, then the function is said to be a set | set of anonymized values, then the function is said to be a set | |||
| substitution function. Note that with set substitution functions, | substitution function. Note that with set substitution functions, | |||
| the sets of anonymised values are not necessarily disjoint. Either | the sets of anonymized values are not necessarily disjoint. Either | |||
| direct or set substitution functions are said to be one-way if there | direct or set substitution functions are said to be one-way if there | |||
| exists no non-brute force method for recovering the real data point | exists no non-brute force method for recovering the real data point | |||
| from an anonymised one in isolation (i.e., if the only way to recover | from an anonymized one in isolation (i.e., if the only way to recover | |||
| the data point is to attack the anonymised data set as a whole, e.g. | the data point is to attack the anonymized data set as a whole, e.g. | |||
| through fingerprinting or data injection). | through fingerprinting or data injection). | |||
| This classification is summarised in the table below. | This classification is summarized in the table below. | |||
| +------------------------+-----------------+------------------------+ | +------------------------+-----------------+------------------------+ | |||
| | Recoverability / | Recoverable | Non-recoverable | | | Recoverability / | Recoverable | Non-recoverable | | |||
| | Countability | | | | | Countability | | | | |||
| +------------------------+-----------------+------------------------+ | +------------------------+-----------------+------------------------+ | |||
| | N < M | N.A. | Generalisation | | | N < M | N.A. | Generalization | | |||
| | N = M | Direct | One-way Direct | | | N = M | Direct | One-way Direct | | |||
| | | Substitution | Substitution | | | | Substitution | Substitution | | |||
| | N > M | Set | One-way Set | | | N > M | Set | One-way Set | | |||
| | | Substitution | Substitution | | | | Substitution | Substitution | | |||
| +------------------------+-----------------+------------------------+ | +------------------------+-----------------+------------------------+ | |||
| 4. Anonymisation of IP Flow Data | 4. Anonymization of IP Flow Data | |||
| Due to the restricted semantics of IP flow data, there is a | In anonymizing IP flow data as treated by this document, the goal is | |||
| relatively limited set of specific anonymisation techniques available | generally two-way address untraceability: to remove the ability to | |||
| on flow data, though each falls into the broad categories above. | assert that endpoint X contacted endpoint Y at time T. Address | |||
| Each type of field that may commonly appear in a flow record may have | untraceability is important as IP addresses are the most suitable | |||
| its own applicable specific techniques. | field in IP flow records to identify real-world entities. Each IP | |||
| address is associated with an interface on a network host, and can | ||||
| potentially be identified with a single user. Additionally, IP | ||||
| addresses are structured identifiers; that is, partial IP address | ||||
| prefixes may be used to identify networks just as full IP addresses | ||||
| identify hosts. This leads IP flow data anonymization to be | ||||
| concerned first and foremost with IP address anonymization. | ||||
| While anonymisation is generally applied at the resolution of single | Any form of aggregation which combines flows from multiple endpoints | |||
| fields within a flow record, attacks against anonymisation use entire | into a single record (e.g., aggregation by subnetwork, aggregation | |||
| removing addressing completely) may also provide address | ||||
| untraceability; however, anonymization by aggregation is out of scope | ||||
| for this document. Additionally of potential interest in this | ||||
| problem space but out of scope are anonymization techniques which are | ||||
| applied over multiple fields or multiple records in a way which | ||||
| introduces dependencies among anonymized fields or records. This | ||||
| document is concerned solely with anonymization techniques applied at | ||||
| the resolution of single fields within a flow record. | ||||
| Even so, attacks against these anonymization techniques use entire | ||||
| flows and relationships between hosts and flows within a given data | flows and relationships between hosts and flows within a given data | |||
| set. Therefore, fields which may not necessarily be identifying by | set. Therefore, fields which may not necessarily be identifying by | |||
| themselves may be anonymised in order to increase the anonymity of | themselves may be anonymized in order to increase the anonymity of | |||
| the data set as a whole. | the data set as a whole. | |||
| Of all the fields in an IP flow record, IP addresses are the most | Due to the restricted semantics of IP flow data, there is a | |||
| likely to be used to directly identify entities in the real world. | relatively limited set of specific anonymization techniques available | |||
| Each IP address is associated with an interface on a network host, | on flow data, though each falls into the broad categories discussed | |||
| and can potentially be identified with a single user. Additionally, | in the previous section. Each type of field that may commonly appear | |||
| IP addresses are structured identifiers; that is, partial IP address | in a flow record may have its own applicable specific techniques. | |||
| prefixes may be used to identify networks just as full IP addresses | ||||
| identify hosts. This makes anonymisation of IP addresses | ||||
| particularly important. | ||||
| MAC addresses uniquely identify devices on the network; while they | As with IP addresses, MAC addresses uniquely identify devices on the | |||
| are not often available in traffic data collected at Layer 3, and | network; while they are not often available in traffic data collected | |||
| cannot be used to locate devices within the network, some traces may | at Layer 3, and cannot be used to locate devices within the network, | |||
| contain sub-IP data including MAC address data. Hardware addresses | some traces may contain sub-IP data including MAC address data. | |||
| may be mappable to device serial numbers, and to the entities or | Hardware addresses may be mappable to device serial numbers, and to | |||
| individuals who purchased the devices, when combined with external | the entities or individuals who purchased the devices, when combined | |||
| databases. MAC addresses are also often used in constructing IPv6 | with external databases. MAC addresses are also often used in | |||
| addresses (see section 2.5.1 of [RFC4291]), and as such may be used | constructing IPv6 addresses (see section 2.5.1 of [RFC4291]), and as | |||
| to reconstruct the low-order bits of anonymised IPv6 addresses in | such may be used to reconstruct the low-order bits of anonymized IPv6 | |||
| certain circumstances. Therefore, MAC address anonymisation is also | addresses in certain circumstances. Therefore, MAC address | |||
| important. | anonymization is also important. | |||
| Port numbers identify abstract entities (applications) as opposed to | Port numbers identify abstract entities (applications) as opposed to | |||
| real-world entities, but they can be used to classify hosts and user | real-world entities, but they can be used to classify hosts and user | |||
| behavior. Passive port fingerprinting, both of well-known and | behavior. Passive port fingerprinting, both of well-known and | |||
| ephemeral ports, can be used to determine the operating system | ephemeral ports, can be used to determine the operating system | |||
| running on a host. Relative data volumes by port can also be used to | running on a host. Relative data volumes by port can also be used to | |||
| determine the host's function (workstation, web server, etc.); this | determine the host's function (workstation, web server, etc.); this | |||
| information can be used to identify hosts and users. | information can be used to identify hosts and users. | |||
| While not identifiers in and of themselves, timestamps and counters | While not identifiers in and of themselves, timestamps and counters | |||
| can reveal the behavior of the hosts and users on a network. Any | can reveal the behavior of the hosts and users on a network. Any | |||
| given network activity is recognizable by a pattern of relative time | given network activity is recognizable by a pattern of relative time | |||
| differences and data volumes in the associated sequence of flows, | differences and data volumes in the associated sequence of flows, | |||
| even without host address information. They can therefore be used to | even without host address information. They can therefore be used to | |||
| identify hosts and users. Timestamps and counters are also | identify hosts and users. Timestamps and counters are also | |||
| vulnerable to traffic injection attacks, where traffic with a known | vulnerable to traffic injection attacks, where traffic with a known | |||
| pattern is injected into a network under measurement, and this | pattern is injected into a network under measurement, and this | |||
| pattern is later identified in the anonymised data set. | pattern is later identified in the anonymized data set. | |||
| The simplest and most extreme form of anonymisation, which can be | The simplest and most extreme form of anonymization, which can be | |||
| applied to any field of a flow record, is black-marker anonymisation, | applied to any field of a flow record, is black-marker anonymization, | |||
| or complete deletion of a given field. Note that black-marker | or complete deletion of a given field. Note that black-marker | |||
| anonymisation is equivalent to simply not exporting the field(s) in | anonymization is equivalent to simply not exporting the field(s) in | |||
| question. | question. | |||
| While black-marker anonymisation completely protects the data in the | While black-marker anonymization completely protects the data in the | |||
| deleted fields from the risk of disclosure, it also reduces the | deleted fields from the risk of disclosure, it also reduces the | |||
| utility of the anonymised data set as a whole. Techniques that | utility of the anonymized data set as a whole. Techniques that | |||
| retain some information while reducing (though not eliminating) the | retain some information while reducing (though not eliminating) the | |||
| disclosure risk will be extensively discussed in the following | disclosure risk will be extensively discussed in the following | |||
| sections; note that the techniques specifically applicable to IP | sections; note that the techniques specifically applicable to IP | |||
| addresses, timestamps, ports, and counters will be discussed in | addresses, timestamps, ports, and counters will be discussed in | |||
| separate sections. | separate sections. | |||
| 4.1. IP Address Anonymisation | 4.1. IP Address Anonymization | |||
| Since IP addresses are the most common identifiers within flow data | Since IP addresses are the most common identifiers within flow data | |||
| that can be used to directly identify a person, organization, or | that can be used to directly identify a person, organization, or | |||
| host, most of the work on flow and trace data anonymisation has gone | host, most of the work on flow and trace data anonymization has gone | |||
| into IP address anonymisation techniques. Indeed, the aim of most | into IP address anonymization techniques. Indeed, the aim of most | |||
| attacks against anonymisation is to recover the map from anonymised | attacks against anonymization is to recover the map from anonymized | |||
| IP addresses to original IP addresses thereby identifying the | IP addresses to original IP addresses thereby identifying the | |||
| anonymised hosts. There is therefore a wide range of IP address | anonymized hosts. There is therefore a wide range of IP address | |||
| anonymisation schemes that fit into the following categories. | anonymization schemes that fit into the following categories. | |||
| +------------------------------------+---------------------+ | +------------------------------------+---------------------+ | |||
| | Scheme | Action | | | Scheme | Action | | |||
| +------------------------------------+---------------------+ | +------------------------------------+---------------------+ | |||
| | Truncation | Generalisation | | | Truncation | Generalization | | |||
| | Reverse Truncation | Generalisation | | | Reverse Truncation | Generalization | | |||
| | Permutation | Direct Substitution | | | Permutation | Direct Substitution | | |||
| | Prefix-preserving Pseudonymisation | Direct Substitution | | | Prefix-preserving Pseudonymization | Direct Substitution | | |||
| +------------------------------------+---------------------+ | +------------------------------------+---------------------+ | |||
| 4.1.1. Truncation | 4.1.1. Truncation | |||
| Truncation removes "n" of the least significant bits from an IP | Truncation removes "n" of the least significant bits from an IP | |||
| address, replacing them with zeroes. In effect, it replaces a host | address, replacing them with zeroes. In effect, it replaces a host | |||
| address with a network address for some fixed netblock; for IPv4 | address with a network address for some fixed netblock; for IPv4 | |||
| addresses, 8-bit truncation corresponds to replacement with a /24 | addresses, 8-bit truncation corresponds to replacement with a /24 | |||
| network address. Truncation is a non-reversible generalisation | network address. Truncation is a non-reversible generalization | |||
| scheme. Note that while truncation is effective for making hosts | scheme. Note that while truncation is effective for making hosts | |||
| non-identifiable, it preserves information which can be used to | non-identifiable, it preserves information which can be used to | |||
| identify an organization, a geographic region, a country, or a | identify an organization, a geographic region, a country, or a | |||
| continent. | continent. | |||
| Truncation to an address length of 0 is equivalent to black-marker | Truncation to an address length of 0 is equivalent to black-marker | |||
| anonymisation. Complete removal of IP address information is only | anonymization. Complete removal of IP address information is only | |||
| recommended for analysis tasks which have no need to separate flow | recommended for analysis tasks which have no need to separate flow | |||
| data by host or network; e.g. as a first stage to per-application | data by host or network; e.g. as a first stage to per-application | |||
| (port) or time-series total volume analyses. | (port) or time-series total volume analyses. | |||
| 4.1.2. Reverse Truncation | 4.1.2. Reverse Truncation | |||
| Reverse truncation removes "n" of the most significant bits from an | Reverse truncation removes "n" of the most significant bits from an | |||
| IP address, replacing them with zeroes. Reverse truncation is a non- | IP address, replacing them with zeroes. Reverse truncation is a non- | |||
| reversible generalisation scheme. Reverse truncation is effective | reversible generalization scheme. Reverse truncation is effective | |||
| for making networks unidentifiable, partially or completely removing | for making networks unidentifiable, partially or completely removing | |||
| information which can be used to identify an organization, a | information which can be used to identify an organization, a | |||
| geographic region, a country, or a continent (or RIR region of | geographic region, a country, or a continent (or RIR region of | |||
| responsibility). However, it may cause ambiguity when applied to | responsibility). However, it may cause ambiguity when applied to | |||
| data collected from more than one network, since it treats all the | data collected from more than one network, since it treats all the | |||
| hosts with the same address on different networks as if they are the | hosts with the same address on different networks as if they are the | |||
| same host. It is not particularly useful when publishing data where | same host. It is not particularly useful when publishing data where | |||
| the network of origin is known or can be easily guessed by virtue of | the network of origin is known or can be easily guessed by virtue of | |||
| the identity of the publisher. | the identity of the publisher. | |||
| Like truncation, reverse truncation to an address length of 0 is | Like truncation, reverse truncation to an address length of 0 is | |||
| equivalent to black-marker anonymisation. | equivalent to black-marker anonymization. | |||
| 4.1.3. Permutation | 4.1.3. Permutation | |||
| Permutation is a direct substitution technique, replacing each IP | Permutation is a direct substitution technique, replacing each IP | |||
| address with an address selected from the set of possible IP | address with an address selected from the set of possible IP | |||
| addresses, such that each anonymised address represents a unique | addresses, such that each anonymized address represents a unique | |||
| original address. The selection function is often random, though it | original address. The selection function is often random, though it | |||
| is not necessarily so. Permutation does not preserve any structural | is not necessarily so. Permutation does not preserve any structural | |||
| information about a network, but it does preserve the unique count of | information about a network, but it does preserve the unique count of | |||
| IP addresses. Any application that requires more structure than | IP addresses. Any application that requires more structure than | |||
| host-uniqueness will not be able to use permuted IP addresses. | host-uniqueness will not be able to use permuted IP addresses. | |||
| While permutation ideally guarantees that each anonymised address | There are many variations of permutation functions, each of which has | |||
| represents a unique original address, such requires significant state | tradeoffs in performance, security, and guarantees of non-collision; | |||
| in the Intermediate Anonymisation Process. Therefore, permutation | evaluating these tradeoffs is implementation independent. However, | |||
| may be implemented by hashing for performance reasons, with hash | in general permutation functions applied to anonymization SHOULD be | |||
| functions that may have relatively small collision probabilities. | difficult to reverse without knowing the parameters (e.g., a secret | |||
| Such techniques are still essentially direct substitution techniques, | key for HMAC). Given the relatively small space of IPv4 addresses in | |||
| despite the nonzero error probability. | particular, hash functions applied without additional parameters | |||
| could be reversed through brute force if the hash function is known, | ||||
| and SHOULD NOT be used as permutation functions. Permutation | ||||
| functions may guarantee noncollision (i.e., that each anonymized | ||||
| address represents a unique original address), but need not; however, | ||||
| the probability of collision SHOULD be low. We treat even | ||||
| permutations with low but nonzero collision probability as direct | ||||
| substitution nevertheless. Beyond these guidelines, recommendations | ||||
| for specific permutation functions are out of scope for this | ||||
| document. | ||||
| 4.1.4. Prefix-preserving Pseudonymisation | 4.1.4. Prefix-preserving Pseudonymization | |||
| Prefix-preserving pseudonymisation is a direct substitution | Prefix-preserving pseudonymization is a direct substitution | |||
| technique, like permutation but further restricted such that the | technique, like permutation but further restricted such that the | |||
| structure of subnets is preserved at each level while anonymising IP | structure of subnets is preserved at each level while anonymizing IP | |||
| addresses. If two real IP addresses match on a prefix of "n" bits, | addresses. If two real IP addresses match on a prefix of "n" bits, | |||
| the two anonymised IP addresses will match on a prefix of "n" bits as | the two anonymized IP addresses will match on a prefix of "n" bits as | |||
| well. This is useful when relationships among networks must be | well. This is useful when relationships among networks must be | |||
| preserved for a given analysis task, but introduces structure into | preserved for a given analysis task, but introduces structure into | |||
| the anonymised data which can be exploited in attacks against the | the anonymized data which can be exploited in attacks against the | |||
| anonymisation technique. | anonymization technique. | |||
| Scanning in Internet background traffic can cause particular problems | Scanning in Internet background traffic can cause particular problems | |||
| with this technique: if a scanner uses a predictable and known | with this technique: if a scanner uses a predictable and known | |||
| sequence of addresses, this information can be used to reverse the | sequence of addresses, this information can be used to reverse the | |||
| substitution. The low order portion of the address can be left | substitution. The low order portion of the address can be left | |||
| unanonymized as a partial defense against this attack. | unanonymized as a partial defense against this attack. | |||
| 4.2. MAC Address Anonymisation | 4.2. MAC Address Anonymization | |||
| Flow data containing sub-IP information can also contain identifying | Flow data containing sub-IP information can also contain identifying | |||
| information in the form of the hardware (MAC) address. While MAC | information in the form of the hardware (MAC) address. While MAC | |||
| address information cannot be used to locate a node within a network, | address information cannot be used to locate a node within a network, | |||
| it can be used to directly and uniquely identify a specific device. | it can be used to directly and uniquely identify a specific device. | |||
| Vendors or organizations within the supply chain may then have the | Vendors or organizations within the supply chain may then have the | |||
| information necessary to identify the entity or individual that | information necessary to identify the entity or individual that | |||
| purchased the device. | purchased the device. | |||
| MAC address information is not as structured as IP address | MAC address information is not as structured as IP address | |||
| skipping to change at page 12, line 8 | skipping to change at page 13, line 18 | |||
| Note that MAC address information also appears within IPv6 addresses, | Note that MAC address information also appears within IPv6 addresses, | |||
| as the EUI-64 address, or EUI-48 address encoded as an EUI-64 | as the EUI-64 address, or EUI-48 address encoded as an EUI-64 | |||
| address, is used as the least significant 64 bits of the IPv6 address | address, is used as the least significant 64 bits of the IPv6 address | |||
| in the case of link local addressing or stateless autoconfiguration; | in the case of link local addressing or stateless autoconfiguration; | |||
| the considerations and techniques in this section may then apply to | the considerations and techniques in this section may then apply to | |||
| such IPv6 addresses as well. | such IPv6 addresses as well. | |||
| +-----------------------------+---------------------+ | +-----------------------------+---------------------+ | |||
| | Scheme | Action | | | Scheme | Action | | |||
| +-----------------------------+---------------------+ | +-----------------------------+---------------------+ | |||
| | Truncation | Generalisation | | | Truncation | Generalization | | |||
| | Reverse Truncation | Generalisation | | | Reverse Truncation | Generalization | | |||
| | Permutation | Direct Substitution | | | Permutation | Direct Substitution | | |||
| | Structured Pseudonymisation | Direct Substitution | | | Structured Pseudonymization | Direct Substitution | | |||
| +-----------------------------+---------------------+ | +-----------------------------+---------------------+ | |||
| 4.2.1. Truncation | 4.2.1. Truncation | |||
| Truncation removes "n" of the least significant bits from a MAC | Truncation removes "n" of the least significant bits from a MAC | |||
| address, replacing them with zeroes. In effect, it retains bits of | address, replacing them with zeroes. In effect, it retains bits of | |||
| OUI, which identifies the manufacturer, while removing the least | OUI, which identifies the manufacturer, while removing the least | |||
| significant bits identifying the particular device. Truncation of 24 | significant bits identifying the particular device. Truncation of 24 | |||
| bits of an EUI-48 or 40 bits of an EUI-64 address zeroes out the | bits of an EUI-48 or 40 bits of an EUI-64 address zeroes out the | |||
| device identifier while retaining the OUI. | device identifier while retaining the OUI. | |||
| Truncation is effective for making device manufacturers partially or | Truncation is effective for making device manufacturers partially or | |||
| completely identifiable within a dataset while deleting unique host | completely identifiable within a dataset while deleting unique host | |||
| identifiers; this can be used to retain and aggregate MAC layer | identifiers; this can be used to retain and aggregate MAC layer | |||
| behavior by vendor. | behavior by vendor. | |||
| Truncation to an address length of 0 is equivalent to black-marker | Truncation to an address length of 0 is equivalent to black-marker | |||
| anonymisation. | anonymization. | |||
| 4.2.2. Reverse Truncation | 4.2.2. Reverse Truncation | |||
| Reverse truncation removes "n" of the most significant bits from a | Reverse truncation removes "n" of the most significant bits from a | |||
| MAC address, replacing them with zeroes. Reverse truncation is a | MAC address, replacing them with zeroes. Reverse truncation is a | |||
| non-reversible generalisation scheme. This has the effect of | non-reversible generalization scheme. This has the effect of | |||
| removing bits of the OUI, which identify manufacturers, before | removing bits of the OUI, which identify manufacturers, before | |||
| removing the least significant bits. Reverse truncation of 24 bits | removing the least significant bits. Reverse truncation of 24 bits | |||
| zeroes out the OUI. | zeroes out the OUI. | |||
| Reverse truncation is effective for making device manufacturers | Reverse truncation is effective for making device manufacturers | |||
| partially or completely unidentifiable within a dataset. However, it | partially or completely unidentifiable within a dataset. However, it | |||
| may cause ambiguity by introducing the possibility of truncated MAC | may cause ambiguity by introducing the possibility of truncated MAC | |||
| address collision. Also note that the utility of removing | address collision. Also note that the utility of removing | |||
| manufacturer information is not particularly well-covered by the | manufacturer information is not particularly well-covered by the | |||
| literature. | literature. | |||
| Reverse truncation to an address length of 0 is equivalent to black- | Reverse truncation to an address length of 0 is equivalent to black- | |||
| marker anonymisation. | marker anonymization. | |||
| 4.2.3. Permutation | 4.2.3. Permutation | |||
| Permutation is a direct substitution technique, replacing each MAC | Permutation is a direct substitution technique, replacing each MAC | |||
| address with an address selected from the set of possible MAC | address with an address selected from the set of possible MAC | |||
| addresses, such that each anonymised address represents a unique | addresses, such that each anonymized address represents a unique | |||
| original address. The selection function is often random, though it | original address. The selection function is often random, though it | |||
| is not necessarily so. Permutation does not preserve any structural | is not necessarily so. Permutation does not preserve any structural | |||
| information about a network, but it does preserve the unique count of | information about a network, but it does preserve the unique count of | |||
| devices on the network. Any application that requires more structure | devices on the network. Any application that requires more structure | |||
| than host-uniqueness will not be able to use permuted MAC addresses. | than host-uniqueness will not be able to use permuted MAC addresses. | |||
| While permutation ideally guarantees that each anonymised address | There are many variations of permutation functions, each of which has | |||
| represents a unique original address, such requires significant state | tradeoffs in performance, security, and guarantees of non-collision; | |||
| in the Intermediate Anonymisation Process. Therefore, permutation | evaluating these tradeoffs is implementation independent. However, | |||
| may be implemented by hashing for performance reasons, with hash | in general permutation functions applied to anonymization SHOULD be | |||
| functions that may have relatively small collision probabilities. | difficult to reverse without knowing the parameters (e.g., a secret | |||
| Such techniques are still essentially direct substitution techniques, | key for HMAC). While the EUI-48 space is larger than the IPv4 | |||
| despite the nonzero error probability. | address space, hash functions applied without additional parameters | |||
| could be reversed through brute force if the hash function is known, | ||||
| and SHOULD NOT be used as permutation functions. Permutation | ||||
| functions may guarantee noncollision (i.e., that each anonymized | ||||
| address represents a unique original address), but need not; however, | ||||
| the probability of collision SHOULD be low. We treat even | ||||
| permutations with low but nonzero collision probability as direct | ||||
| substitution nevertheless. Beyond these guidelines, recommendations | ||||
| for specific permutation functions are out of scope for this | ||||
| document. | ||||
| 4.2.4. Structured Pseudonymisation | 4.2.4. Structured Pseudonymization | |||
| Structured pseudonymisation for MAC addresses is a direct | Structured pseudonymization for MAC addresses is a direct | |||
| substitution technique, like permutation, but restricted such that | substitution technique, like permutation, but restricted such that | |||
| the OUI (the most significant three bytes) is permuted separately | the OUI (the most significant three bytes) is permuted separately | |||
| from the node identifier, the remainder. This is useful when the | from the node identifier, the remainder. This is useful when the | |||
| uniqueness of OUIs must be preserved for a given analysis task, but | uniqueness of OUIs must be preserved for a given analysis task, but | |||
| introduces structure into the anonymised data which can be exploited | introduces structure into the anonymized data which can be exploited | |||
| in attacks against the anonymisation technique. | in attacks against the anonymization technique. | |||
| 4.3. Timestamp Anonymisation | 4.3. Timestamp Anonymization | |||
| The particular time at which a flow began or ended is not | The particular time at which a flow began or ended is not | |||
| particularly identifiable information, but it can be used as part of | particularly identifiable information, but it can be used as part of | |||
| attacks against other anonymisation techniques or for user profiling. | attacks against other anonymization techniques or for user profiling, | |||
| Precise timestamps can be used in injected-traffic fingerprinting | e.g. as in [Mur07]. Timestamps can be used in traffic injection | |||
| attacks, which use known information about a set of traffic generated | attacks, which use known information about a set of traffic generated | |||
| or otherwise known by an attacker to recover mappings of other | or otherwise known by an attacker to recover mappings of other | |||
| anonymised fields, as well as to identify certain activity by | anonymized fields, as well as to identify certain activity by | |||
| response delay and size fingerprinting, which compares response sizes | response delay and size fingerprinting, which compares response sizes | |||
| and inter-flow times in anonymised data to known values. Therefore, | and inter-flow times in anonymized data to known values. Note that | |||
| timestamp information may be anonymised in order to ensure the | these attacks have been shown to be relatively robust against | |||
| protection of the entire data set. | timestamp anonymization techniques (see [Bur10]), so the techniques | |||
| presented in this section are relatively weak and should be used with | ||||
| care. | ||||
| +-----------------------+----------------------------+ | +-----------------------+----------------------------+ | |||
| | Scheme | Action | | | Scheme | Action | | |||
| +-----------------------+----------------------------+ | +-----------------------+----------------------------+ | |||
| | Precision Degradation | Generalisation | | | Precision Degradation | Generalization | | |||
| | Enumeration | Direct or Set Substitution | | | Enumeration | Direct or Set Substitution | | |||
| | Random Shifts | Direct Substitution | | | Random Shifts | Direct Substitution | | |||
| +-----------------------+----------------------------+ | +-----------------------+----------------------------+ | |||
| 4.3.1. Precision Degradation | 4.3.1. Precision Degradation | |||
| Precision Degradation is a generalisation technique that removes the | Precision Degradation is a generalization technique that removes the | |||
| most precise components of a timestamp, accounting all events | most precise components of a timestamp, accounting all events | |||
| occurring in each given interval (e.g. one millisecond for | occurring in each given interval (e.g. one millisecond for | |||
| millisecond level degradation) as simultaneous. This has the effect | millisecond level degradation) as simultaneous. This has the effect | |||
| of potentially collapsing many timestamps into one. With this | of potentially collapsing many timestamps into one. With this | |||
| technique time precision is reduced, and sequencing may be lost, but | technique time precision is reduced, and sequencing may be lost, but | |||
| the information about when the event occurred is preserved. The | the information about when the event occurred is preserved. The | |||
| anonymised data may not be generally useful for applications which | anonymized data may not be generally useful for applications which | |||
| require strict sequencing of flows. | require strict sequencing of flows. | |||
| Note that flow meters with low time precision (e.g. second precision, | Note that flow meters with low time precision (e.g. second precision, | |||
| or millisecond precision on high-capacity networks) perform the | or millisecond precision on high-capacity networks) perform the | |||
| equivalent of precision degradation anonymisation by their design. | equivalent of precision degradation anonymization by their design. | |||
| Note also that degradation to a very low precision (e.g. on the order | Note also that degradation to a very low precision (e.g. on the order | |||
| of minutes, hours, or days) is commonly used in analyses operating on | of minutes, hours, or days) is commonly used in analyses operating on | |||
| time-series aggregated data, and may also be described as binning; | time-series aggregated data, and may also be described as binning; | |||
| though the time scales are longer and applicability more restricted, | though the time scales are longer and applicability more restricted, | |||
| this is in principle the same operation. | this is in principle the same operation. | |||
| Precision degradation to infinitely low precision is equivalent to | Precision degradation to infinitely low precision is equivalent to | |||
| black-marker anonymisation. Removal of timestamp information is only | black-marker anonymization. Removal of timestamp information is only | |||
| recommended for analysis tasks which have no need to separate flows | recommended for analysis tasks which have no need to separate flows | |||
| in time, for example for counting total volumes or unique occurrences | in time, for example for counting total volumes or unique occurrences | |||
| of other flow keys in an entire dataset. | of other flow keys in an entire dataset. | |||
| 4.3.2. Enumeration | 4.3.2. Enumeration | |||
| Enumeration is a substitution function that retains the chronological | Enumeration is a substitution function that retains the chronological | |||
| order in which events occurred while eliminating time information. | order in which events occurred while eliminating time information. | |||
| Timestamps are substituted by equidistant timestamps (or numbers) | Timestamps are substituted by equidistant timestamps (or numbers) | |||
| starting from a randomly chosen start value. The resulting data is | starting from a randomly chosen start value. The resulting data is | |||
| useful for applications requiring strict sequencing, but not for | useful for applications requiring strict sequencing, but not for | |||
| those requiring good timing information (e.g. delay- or jitter- | those requiring good timing information (e.g. delay- or jitter- | |||
| measurement for quality-of-service (QoS) applications or service- | measurement for quality-of-service (QoS) applications or service- | |||
| level agreement (SLA) validation). | level agreement (SLA) validation). | |||
| Note that enumeration is functionally equivalent to precision | ||||
| degradation in any environment into which traffic can be regularly | ||||
| injected to serve as a clock at the precision of the frequency of the | ||||
| injected flows. | ||||
| 4.3.3. Random Shifts | 4.3.3. Random Shifts | |||
| Random time shifts add a random offset to every timestamp within a | Random time shifts add a random offset to every timestamp within a | |||
| dataset. This reversible substitution technique therefore retains | dataset. This reversible substitution technique therefore retains | |||
| duration and inter-event interval information as well as | duration and inter-event interval information as well as | |||
| chronological order of flows. It is primarily intended to defeat | chronological order of flows. Random time shifts are quite weak, and | |||
| traffic injection fingerprinting attacks. | relatively easy to reverse in the presence of external knowledge | |||
| about traffic on the measured network. | ||||
| 4.4. Counter Anonymisation | 4.4. Counter Anonymization | |||
| Counters (such as packet and octet volumes per flow) are subject to | Counters (such as packet and octet volumes per flow) are subject to | |||
| fingerprinting and injection attacks against anonymisation, or for | fingerprinting and injection attacks against anonymization, or for | |||
| user profiling as timestamps are. Counter anonymisation can help | user profiling as timestamps are. Data sets with anonymized counters | |||
| defeat these attacks, but are only usable for analysis tasks for | are useful only for analysis tasks for which relative or imprecise | |||
| which relative or imprecise magnitudes of activity are useful. | magnitudes of activity are useful. Counter information can also be | |||
| Counter information can also be completely removed, but this is only | completely removed, but this is only recommended for analysis tasks | |||
| recommended for analysis tasks which have no need to evaluate the | which have no need to evaluate the removed counter, for example for | |||
| removed counter, for example for counting only unique occurrences of | counting only unique occurrences of other flow keys. | |||
| other flow keys. | ||||
| +-----------------------+----------------------------+ | +-----------------------+----------------------------+ | |||
| | Scheme | Action | | | Scheme | Action | | |||
| +-----------------------+----------------------------+ | +-----------------------+----------------------------+ | |||
| | Precision Degradation | Generalisation | | | Precision Degradation | Generalization | | |||
| | Binning | Generalisation | | | Binning | Generalization | | |||
| | Random noise addition | Direct or Set Substitution | | | Random noise addition | Direct or Set Substitution | | |||
| +-----------------------+----------------------------+ | +-----------------------+----------------------------+ | |||
| 4.4.1. Precision Degradation | 4.4.1. Precision Degradation | |||
| As with precision degradation in timestamps, precision degradation of | As with precision degradation in timestamps, precision degradation of | |||
| counters removes lower-order bits of the counters, treating all the | counters removes lower-order bits of the counters, treating all the | |||
| counters in a given range as having the same value. Depending on the | counters in a given range as having the same value. Depending on the | |||
| precision reduction, this loses information about the relationships | precision reduction, this loses information about the relationships | |||
| between sizes of similarly-sized flows, but keeps relative magnitude | between sizes of similarly-sized flows, but keeps relative magnitude | |||
| information. Precision degradation to an infinitely low precision is | information. Precision degradation to an infinitely low precision is | |||
| equivalent to black-marker anonymisation. | equivalent to black-marker anonymization. | |||
| 4.4.2. Binning | 4.4.2. Binning | |||
| Binning can be seen as a special case of precision degradation; the | Binning can be seen as a special case of precision degradation; the | |||
| operation is identical, except that in precision degradation the | operation is identical, except that in precision degradation the | |||
| counter ranges are uniform, and in binning they need not be. For | counter ranges are uniform, and in binning they need not be. For | |||
| example, a common counter binning scheme for packet counters could be | example, consider separating unopened TCP connections from | |||
| to bin values 1-2 together, and 3-infinity together, thereby | potentially opened TCP connections. Here, packet counters per flow | |||
| separating potentially completely-opened TCP connections from | would be binned into two bins, one for 1-2 packet flows, and one for | |||
| unopened ones. Binning schemes are generally chosen to keep | flows with 3 or more packets. Binning schemes are generally chosen | |||
| precisely the amount of information required in a counter for a given | to keep precisely the amount of information required in a counter for | |||
| analysis task. Note that, also unlike precision degradation, the bin | a given analysis task. Note that, also unlike precision degradation, | |||
| label need not be within the bin's range. Binning counters to a | the bin label need not be within the bin's range. Binning counters | |||
| single bin is equivalent to black-marker anonymisation. | to a single bin is equivalent to black-marker anonymization. | |||
| 4.4.3. Random Noise Addition | 4.4.3. Random Noise Addition | |||
| Random noise addition adds a random amount to a counter in each flow; | Random noise addition adds a random amount to a counter in each flow; | |||
| this is used to keep relative magnitude information and minimize the | this is used to keep relative magnitude information and minimize the | |||
| disruption to size relationship information while avoiding | disruption to size relationship information while avoiding | |||
| fingerprinting attacks against anonymisation. Note that there is no | fingerprinting attacks against anonymization. Note that there is no | |||
| guarantee that random noise addition will maintain ranking order by a | guarantee that random noise addition will maintain ranking order by a | |||
| counter among members of a set. Random noise addition is | counter among members of a set. Random noise addition is | |||
| particularly useful when the derived analysis data will not be | particularly useful when the derived analysis data will not be | |||
| presented in such a way as to require the lower-order bits of the | presented in such a way as to require the lower-order bits of the | |||
| counters. | counters. | |||
| 4.5. Anonymisation of Other Flow Fields | 4.5. Anonymization of Other Flow Fields | |||
| Other fields, particularly port numbers and protocol numbers, can be | Other fields, particularly port numbers and protocol numbers, can be | |||
| used to partially identify the applications that generated the | used to partially identify the applications that generated the | |||
| traffic in a given flow trace. This information can be used in | traffic in a given flow trace. This information can be used in | |||
| fingerprinting attacks, and may be of interest on its own (e.g., to | fingerprinting attacks, and may be of interest on its own (e.g., to | |||
| reveal that a certain application with suspected vulnerabilities is | reveal that a certain application with suspected vulnerabilities is | |||
| running on a given network). These fields are generally anonymised | running on a given network). These fields are generally anonymized | |||
| using one of two techniques. | using one of two techniques. | |||
| +-------------+---------------------+ | +-------------+---------------------+ | |||
| | Scheme | Action | | | Scheme | Action | | |||
| +-------------+---------------------+ | +-------------+---------------------+ | |||
| | Binning | Generalisation | | | Binning | Generalization | | |||
| | Permutation | Direct Substitution | | | Permutation | Direct Substitution | | |||
| +-------------+---------------------+ | +-------------+---------------------+ | |||
| 4.5.1. Binning | 4.5.1. Binning | |||
| Binning is a generalisation technique mapping a set of potentially | Binning is a generalization technique mapping a set of potentially | |||
| non-uniform ranges into a set of arbitrarily labeled bins. Common | non-uniform ranges into a set of arbitrarily labeled bins. Common | |||
| bin arrangements depend on the field type and the analysis | bin arrangements depend on the field type and the analysis | |||
| application. For example, an IP protocol bin arrangement may | application. For example, an IP protocol bin arrangement may | |||
| preserve 1, 6, and 17 for ICMP, TCP, and UDP traffic, and bin all | preserve 1, 6, and 17 for ICMP, TCP, and UDP traffic, and bin all | |||
| other protocols into a single bin, to mitigate the use of uncommon | other protocols into a single bin, to mitigate the use of uncommon | |||
| protocols in fingerprinting attacks. Another example arrangement may | protocols in fingerprinting attacks. Another example arrangement may | |||
| bin source and destination ports into low (0-1023) and high (1024- | bin source and destination ports into low (0-1023) and high (1024- | |||
| 65535) bins in order to tell service from ephemeral ports without | 65535) bins in order to tell service from ephemeral ports without | |||
| identifying individual applications. | identifying individual applications. | |||
| Binning other flow key fields to a single bin is equivalent to black- | Binning other flow key fields to a single bin is equivalent to black- | |||
| marker anonymisation. Removal of other flow key information is only | marker anonymization. Removal of other flow key information is only | |||
| recommended for analysis tasks which have no need to differentiate | recommended for analysis tasks which have no need to differentiate | |||
| flows on the removed keys, for example for total traffic counts or | flows on the removed keys, for example for total traffic counts or | |||
| unique counts of other flow keys. | unique counts of other flow keys. | |||
| 4.5.2. Permutation | 4.5.2. Permutation | |||
| Permutation is a direct substitution technique, replacing each value | Permutation is a direct substitution technique, replacing each value | |||
| with a value selected from the range of possible values, such that each | with a value selected from the range of possible values, such that each | |||
| anonymised value represents a unique original value. This is used to | anonymized value represents a unique original value. This is used to | |||
| preserve the count of unique values without preserving information | preserve the count of unique values without preserving information | |||
| about, or the ordering of, the values themselves. | about, or the ordering of, the values themselves. | |||
| While permutation ideally guarantees that each anonymised value | While permutation ideally guarantees that each anonymized value | |||
| represents a unique original value, such may require significant | represents a unique original value, such may require significant | |||
| state in the Intermediate Anonymisation Process. Therefore, | state in the Intermediate Anonymization Process. Therefore, | |||
| permutation may be implemented by hashing for performance reasons, | permutation may be implemented by hashing for performance reasons, | |||
| with hash functions that may have relatively small collision | with hash functions that may have relatively small collision | |||
| probabilities. Such techniques are still essentially direct | probabilities. Such techniques are still essentially direct | |||
| substitution techniques, despite the nonzero error probability. | substitution techniques, despite the nonzero error probability. | |||
| 5. Parameters for the Description of Anonymisation Techniques | 5. Parameters for the Description of Anonymization Techniques | |||
| This section details the abstract parameters used to describe the | This section details the abstract parameters used to describe the | |||
| anonymisation techniques examined in the previous section, on a per- | anonymization techniques examined in the previous section, on a per- | |||
| parameter basis. These parameters and their export safety inform the | parameter basis. These parameters and their export safety inform the | |||
| design of the IPFIX anonymisation metadata export specified in the | design of the IPFIX anonymization metadata export specified in the | |||
| following section. | following section. | |||
| 5.1. Stability | 5.1. Stability | |||
| A stable anonymisation will always map a given value in the real | A stable anonymization will always map a given value in the real | |||
| space to a given value in the anonymised space, while an unstable | space to a given value in the anonymized space, while an unstable | |||
| anonymisation will change this mapping over time; a completely | anonymization will change this mapping over time; a completely | |||
| unstable anonymisation is essentially indistinguishable from black- | unstable anonymization is essentially indistinguishable from black- | |||
| marker anonymisation. Any given anonymisation technique may be | marker anonymization. Any given anonymization technique may be | |||
| applied with a varying range of stability. Stability is important | applied with a varying range of stability. Stability is important | |||
| for assessing the comparability of anonymised information in | for assessing the comparability of anonymized information in | |||
| different data sets, or in the same data set over different time | different data sets, or in the same data set over different time | |||
| periods. In practice, an anonymisation may also be stable for every | periods. In practice, an anonymization may also be stable for every | |||
| data set published by a particular producer to a particular | data set published by a particular producer to a particular | |||
| consumer, stable for a stated time period within a dataset or across | consumer, stable for a stated time period within a dataset or across | |||
| datasets, or stable only for a single data set. | datasets, or stable only for a single data set. | |||
| If no information about stability is available, users of anonymised | If no information about stability is available, users of anonymized | |||
| data MAY assume that the techniques used are stable across the entire | data MAY assume that the techniques used are stable across the entire | |||
| dataset, but unstable across datasets. Note that stability presents | dataset, but unstable across datasets. Note that stability presents | |||
| a risk-utility tradeoff, as completely stable anonymisation can be | a risk-utility tradeoff, as completely stable anonymization can be | |||
| used for longer-term trend analysis tasks but also presents more risk | used for longer-term trend analysis tasks but also presents more risk | |||
| of attack given the stable mapping. Information about the stability | of attack given the stable mapping. Information about the stability | |||
| of a mapping SHOULD be exported along with the anonymised data. | of a mapping SHOULD be exported along with the anonymized data. | |||
| 5.2. Truncation Length | 5.2. Truncation Length | |||
| Truncation and precision degradation are described by the truncation | Truncation and precision degradation are described by the truncation | |||
| length, or the amount of data still remaining in the anonymised field | length, or the amount of data still remaining in the anonymized field | |||
| after anonymisation. | after anonymization. | |||
| Truncation length can generally be inferred from a given data set, | Truncation length can generally be inferred from a given data set, | |||
| and need not be specially exported or protected. For bit-level | and need not be specially exported or protected. For bit-level | |||
| truncation, the truncated bits are generally inferable by the least | truncation, the truncated bits are generally inferable by the least | |||
| significant bit set for an instance of an Information Element | significant bit set for an instance of an Information Element | |||
| described by a given Template (or the most significant bit set, in | described by a given Template (or the most significant bit set, in | |||
| the case of reverse truncation). For precision degradation, the | the case of reverse truncation). For precision degradation, the | |||
| truncation is inferable from the maximum precision given. Note that | truncation is inferable from the maximum precision given. Note that | |||
| while this inference method is generally applicable, it is data- | while this inference method is generally applicable, it is data- | |||
| dependent: there is no guarantee that it will recover the exact | dependent: there is no guarantee that it will recover the exact | |||
| skipping to change at page 18, line 37 ¶ | skipping to change at page 20, line 13 ¶ | |||
| length alongside the address. | length alongside the address. | |||
| 5.3. Bin Map | 5.3. Bin Map | |||
| Binning is described by the specification of a bin mapping function. | Binning is described by the specification of a bin mapping function. | |||
| This function can be generally expressed in terms of an associative | This function can be generally expressed in terms of an associative | |||
| array that maps each point in the original space to a bin, although | array that maps each point in the original space to a bin, although | |||
| from an implementation standpoint most bin functions are much simpler | from an implementation standpoint most bin functions are much simpler | |||
| and more efficient. | and more efficient. | |||
| Since knowledge of the bin mapping function can be used to partially | Since the bin map for a bin mapping function is in essence the bin | |||
| deanonymise binned data, depending on the degree of generalisation, | mapping key, and can be used to partially deanonymize binned data, | |||
| information about the bin mapping function SHOULD NOT be exported. | depending on the degree of generalization, information about the bin | |||
| mapping function SHOULD NOT be exported. | ||||
| 5.4. Permutation | 5.4. Permutation | |||
| Like binning, permutation is described by the specification of a | Like binning, permutation is described by the specification of a | |||
| permutation function. In the general case, this can be expressed in | permutation function. In the general case, this can be expressed in | |||
| terms of an associative array that maps each point in the original | terms of an associative array that maps each point in the original | |||
| space to a point in the anonymised space. Unlike binning, each point | space to a point in the anonymized space. Unlike binning, each point | |||
| in the anonymised space corresponds to a single, unique point in the | in the anonymized space corresponds to a single, unique point in the | |||
| original space. | original space. | |||
| Since knowledge of the permutation function may, depending on the | Since the parameters of the permutation function are in essence key- | |||
| function, be used to completely deanonymise permuted data, | like (indeed, for cryptographic permutation functions, they are the | |||
| information about the permutation function or its parameters SHOULD | keys themselves), information about the permutation function or its | |||
| NOT be exported. | parameters SHOULD NOT be exported. | |||
| 5.5. Shift Amount | 5.5. Shift Amount | |||
| Shifting requires an amount to shift each value by. Since the shift | Shifting requires an amount to shift each value by. Since the shift | |||
| amount can be used to deanonymise data protected by shifting, | amount is the only key to a shift function, and can be used to | |||
| information about the shift amount SHOULD NOT be exported. | trivially deanonymize data protected by shifting, information about | |||
| the shift amount SHOULD NOT be exported. | ||||
| 6. Anonymisation Export Support in IPFIX | 6. Anonymization Export Support in IPFIX | |||
| Anonymised data exported via IPFIX SHOULD be annotated with | Anonymized data exported via IPFIX SHOULD be annotated with | |||
| anonymisation metadata, which details which fields described by which | anonymization metadata, which details which fields described by which | |||
| Templates are anonymised, and provides appropriate information on the | Templates are anonymized, and provides appropriate information on the | |||
| anonymisation techniques used. This metadata SHOULD be exported in | anonymization techniques used. This metadata SHOULD be exported in | |||
| Data Records described by the recommended Options Templates described | Data Records described by the recommended Options Templates described | |||
| in this section; these Options Templates use the additional | in this section; these Options Templates use the additional | |||
| Information Elements described in the following subsection. | Information Elements described in the following subsection. | |||
| Note that fields anonymised using the black-marker (removal) | Note that fields anonymized using the black-marker (removal) | |||
| technique do not require any special metadata support: black-marker | technique do not require any special metadata support: black-marker | |||
| anonymised fields SHOULD NOT be exported at all, by omitting the | anonymized fields SHOULD NOT be exported at all, by omitting the | |||
| corresponding Information Elements from the Template describing the Data | corresponding Information Elements from the Template describing the Data | |||
| Set. In the case where application requirements dictate that a black- | Set. In the case where application requirements dictate that a black- | |||
| marker anonymised field must remain in a Template, then an Exporting | marker anonymized field must remain in a Template, then an Exporting | |||
| Process MAY export black-marker anonymised fields with their native | Process MAY export black-marker anonymized fields with their native | |||
| length as all-zeros, but only in cases where enough contextual | length as all-zeros, but only in cases where enough contextual | |||
| information exists within the record to differentiate a black-marker | information exists within the record to differentiate a black-marker | |||
| anonymised field exported in this way from a real zero value. | anonymized field exported in this way from a real zero value. | |||
| 6.1. Anonymisation Records and the Anonymisation Options Template | 6.1. Anonymization Records and the Anonymization Options Template | |||
| The Anonymisation Options Template describes Anonymisation Records, | The Anonymization Options Template describes Anonymization Records, | |||
| which allow anonymisation metadata to be exported inline over IPFIX | which allow anonymization metadata to be exported inline over IPFIX | |||
| or stored in an IPFIX File, by binding information about | or stored in an IPFIX File, by binding information about | |||
| anonymisation techniques to Information Elements within defined | anonymization techniques to Information Elements within defined | |||
| Templates or Options Templates. IPFIX Exporting Processes SHOULD | Templates or Options Templates. IPFIX Exporting Processes SHOULD | |||
| export anonymisation records for any Template describing exported | export anonymization records for any Template describing exported | |||
| anonymised Data Records; IPFIX Collecting Processes and processes | anonymized Data Records; IPFIX Collecting Processes and processes | |||
| downstream from them MAY use anonymisation records to treat | downstream from them MAY use anonymization records to treat | |||
| anonymised data differently depending on the applied technique. | anonymized data differently depending on the applied technique. | |||
| Anonymisation Records contain ancillary information bound to a | Anonymization Records contain ancillary information bound to a | |||
| Template, so many of the considerations for Templates apply to | Template, so many of the considerations for Templates apply to | |||
| Anonymisation Records as well. First, reliability is important: an | Anonymization Records as well. First, reliability is important: an | |||
| Exporting Process SHOULD export Anonymisation Records after the | Exporting Process SHOULD export Anonymization Records after the | |||
| Templates they describe have been exported, and SHOULD export | Templates they describe have been exported, and SHOULD export | |||
| anonymisation records reliably. | anonymization records reliably if supported by the underlying | |||
| transport (i.e., without partial reliability when using SCTP). | ||||
| Anonymisation Records MUST be handled by Collecting Processes as | Anonymization Records MUST be handled by Collecting Processes as | |||
| scoped to the Template to which they apply within the Transport | scoped to the Template to which they apply within the Transport | |||
| Session in which they are sent. When a Template is withdrawn via a | Session in which they are sent. When a Template is withdrawn via a | |||
| Template Withdrawal Message or expires during a UDP transport | Template Withdrawal Message or expires during a UDP transport | |||
| session, the accompanying Anonymisation Records are withdrawn or | session, the accompanying Anonymization Records are withdrawn or | |||
| expire as well, and do not apply to subsequent Templates with the | expire as well, and do not apply to subsequent Templates with the | |||
| same Template ID within the Session unless re-exported. | same Template ID within the Session unless re-exported. | |||
| The Stability Class within the anonymisationFlags IE can be used to | The Stability Class within the anonymizationFlags IE can be used to | |||
| declare that a given anonymisation technique's mapping will remain | declare that a given anonymization technique's mapping will remain | |||
| stable across multiple sessions, but this does not mean that | stable across multiple sessions, but this does not mean that | |||
| anonymisation technique information given in the Anonymisation | anonymization technique information given in the Anonymization | |||
| Records themselves persist across Sessions. Each new Transport | Records themselves persist across Sessions. Each new Transport | |||
| Session MUST contain new Anonymisation Records for each Template | Session MUST contain new Anonymization Records for each Template | |||
| describing anonymised Data Sets. | describing anonymized Data Sets. | |||
| SCTP per-stream export [I-D.ietf-ipfix-export-per-sctp-stream] may be | SCTP per-stream export [I-D.ietf-ipfix-export-per-sctp-stream] may be | |||
| used to ease management of Anonymisation Records if appropriate for | used to ease management of Anonymization Records if appropriate for | |||
| the application. | the application. | |||
| The fields of the Anonymisation Options template are as follows: | The fields of the Anonymization Options template are as follows: | |||
| +-------------------------+-----------------------------------------+ | +-------------------------+-----------------------------------------+ | |||
| | IE | Description | | | IE | Description | | |||
| +-------------------------+-----------------------------------------+ | +-------------------------+-----------------------------------------+ | |||
| | templateId [scope] | The Template ID of the Template or | | | templateId [scope] | The Template ID of the Template or | | |||
| | | Options Template containing the | | | | Options Template containing the | | |||
| | | Information Element described by this | | | | Information Element described by this | | |||
| | | anonymisation record. This Information | | | | anonymization record. This Information | | |||
| | | Element MUST be defined as a Scope | | | | Element MUST be defined as a Scope | | |||
| | | Field. | | | | Field. | | |||
| | informationElementId | The Information Element identifier of | | | informationElementId | The Information Element identifier of | | |||
| | [scope] | the Information Element described by | | | [scope] | the Information Element described by | | |||
| | | this anonymisation record. This | | | | this anonymization record. This | | |||
| | | Information Element MUST be defined as | | | | Information Element MUST be defined as | | |||
| | | a Scope Field. Exporting Processes | | | | a Scope Field. Exporting Processes | | |||
| | | MUST clear the Enterprise bit of the | | | | MUST clear the Enterprise bit of the | |||
| | | informationElementId and Collecting | | | | informationElementId and Collecting | | |||
| | | Processes SHOULD ignore it; information | | | | Processes SHOULD ignore it; information | | |||
| | | about enterprise-specific Information | | | | about enterprise-specific Information | | |||
| | | Elements is exported via the | | | | Elements is exported via the | | |||
| | | privateEnterpriseNumber Information | | | | privateEnterpriseNumber Information | | |||
| | | Element. | | | | Element. | | |||
| | privateEnterpriseNumber | The Private Enterprise Number of the | | | privateEnterpriseNumber | The Private Enterprise Number of the | | |||
| | [scope] [optional] | enterprise-specific Information Element | | | [scope] [optional] | enterprise-specific Information Element | | |||
| | | described by this anonymisation record. | | | | described by this anonymization record. | | |||
| | | This Information Element MUST be | | | | This Information Element MUST be | | |||
| | | defined as a Scope Field if present. A | | | | defined as a Scope Field if present. A | | |||
| | | privateEnterpriseNumber of 0 signifies | | | | privateEnterpriseNumber of 0 signifies | | |||
| | | that the Information Element is | | | | that the Information Element is | | |||
| | | IANA-registered. | | | | IANA-registered. | | |||
| | informationElementIndex | The Information Element index of the | | | informationElementIndex | The Information Element index of the | | |||
| | [scope] [optional] | instance of the Information Element | | | [scope] [optional] | instance of the Information Element | | |||
| | | described by this anonymisation record | | | | described by this anonymization record | | |||
| | | identified by the informationElementId | | | | identified by the informationElementId | | |||
| | | within the Template. Optional; need | | | | within the Template. Optional; need | | |||
| | | only be present when describing | | | | only be present when describing | | |||
| | | Templates that have multiple instances | | | | Templates that have multiple instances | | |||
| | | of the same Information Element. This | | | | of the same Information Element. This | | |||
| | | Information Element MUST be defined as | | | | Information Element MUST be defined as | | |||
| | | a Scope Field if present. This | | | | a Scope Field if present. This | | |||
| | | Information Element is defined in | | | | Information Element is defined in | | |||
| | | Section 6.2, below. | | | | Section 6.2, below. | | |||
| | anonymisationFlags | Flags describing the mapping stability | | | anonymizationFlags | Flags describing the mapping stability | | |||
| | | and specialized modifications to the | | | | and specialized modifications to the | | |||
| | | Anonymisation Technique in use. SHOULD | | | | Anonymization Technique in use. SHOULD | | |||
| | | be present. This Information Element | | | | be present. This Information Element | | |||
| | | is defined in Section 6.2.3, below. | | | | is defined in Section 6.2.3, below. | | |||
| | anonymisationTechnique | The technique used to anonymise the | | | anonymizationTechnique | The technique used to anonymize the | | |||
| | | data. MUST be present. This | | | | data. MUST be present. This | | |||
| | | Information Element is defined in | | | | Information Element is defined in | | |||
| | | Section 6.2.2, below. | | | | Section 6.2.2, below. | | |||
| +-------------------------+-----------------------------------------+ | +-------------------------+-----------------------------------------+ | |||
| 6.2. Recommended Information Elements for Anonymisation Metadata | 6.2. Recommended Information Elements for Anonymization Metadata | |||
| 6.2.1. informationElementIndex | 6.2.1. informationElementIndex | |||
| Description: A zero-based index of an Information Element | Description: A zero-based index of an Information Element | |||
| referenced by informationElementId within a Template referenced by | referenced by informationElementId within a Template referenced by | |||
| templateId; used to disambiguate scope for templates containing | templateId; used to disambiguate scope for templates containing | |||
| multiple identical Information Elements. | multiple identical Information Elements. | |||
| Abstract Data Type: unsigned16 | Abstract Data Type: unsigned16 | |||
| Data Type Semantics: identifier | ||||
| ElementId: TBD3 | ElementId: TBD3 | |||
| Status: Proposed | Status: Current | |||
| 6.2.2. anonymisationTechnique | 6.2.2. anonymizationTechnique | |||
| Description: A description of the anonymisation technique applied | Description: A description of the anonymization technique applied | |||
| to a referenced Information Element within a referenced Template. | to a referenced Information Element within a referenced Template. | |||
| Each technique may be applicable only to certain Information | Each technique may be applicable only to certain Information | |||
| Elements and recommended only for certain Information Elements; | Elements and recommended only for certain Information Elements; | |||
| these restrictions are noted in the table below. | these restrictions are noted in the table below. | |||
| +-------+---------------------------+-----------------+-------------+ | +-------+---------------------------+-----------------+-------------+ | |||
| | Value | Description | Applicable to | Recommended | | | Value | Description | Applicable to | Recommended | | |||
| | | | | for | | | | | | for | | |||
| +-------+---------------------------+-----------------+-------------+ | +-------+---------------------------+-----------------+-------------+ | |||
| | 0 | Undefined: the Exporting | all | all | | | 0 | Undefined: the Exporting | all | all | | |||
| | | Process makes no | | | | | | Process makes no | | | | |||
| | | representation as to | | | | | | representation as to | | | | |||
| | | whether the defined field | | | | | | whether the defined field | | | | |||
| | | is anonymised or not. | | | | | | is anonymized or not. | | | | |||
| | | While the Collecting | | | | | | While the Collecting | | | | |||
| | | Process MAY assume that | | | | | | Process MAY assume that | | | | |||
| | | the field is not | | | | | | the field is not | | | | |||
| | | anonymised, it is not | | | | | | anonymized, it is not | | | | |||
| | | guaranteed not to be. | | | | | | guaranteed not to be. | | | | |||
| | | This is the default | | | | | | This is the default | | | | |||
| | | anonymisation technique. | | | | | | anonymization technique. | | | | |||
| | 1 | None: the values exported | all | all | | | 1 | None: the values exported | all | all | | |||
| | | are real. | | | | | | are real. | | | | |||
| | 2 | Precision | all | all | | | 2 | Precision | all | all | | |||
| | | Degradation/Truncation: | | | | | | Degradation/Truncation: | | | | |||
| | | the values exported are | | | | | | the values exported are | | | | |||
| | | anonymised using simple | | | | | | anonymized using simple | | | | |||
| | | precision degradation or | | | | | | precision degradation or | | | | |||
| | | truncation. The new | | | | | | truncation. The new | | | | |||
| | | precision or number of | | | | | | precision or number of | | | | |||
| | | truncated bits is | | | | | | truncated bits is | | | | |||
| | | implicit in the exported | | | | | | implicit in the exported | | | | |||
| | | data, and can be deduced | | | | | | data, and can be deduced | | | | |||
| | | by the Collecting | | | | | | by the Collecting | | | | |||
| | | Process. | | | | | | Process. | | | | |||
| | 3 | Binning: the values | all | all | | | 3 | Binning: the values | all | all | | |||
| | | exported are anonymised | | | | | | exported are anonymized | | | | |||
| | | into bins. | | | | | | into bins. | | | | |||
| | 4 | Enumeration: the values | all | timestamps | | | 4 | Enumeration: the values | all | timestamps | | |||
| | | exported are anonymised | | | | | | exported are anonymized | | | | |||
| | | by enumeration. | | | | | | by enumeration. | | | | |||
| | 5 | Permutation: the values | all | identifiers | | | 5 | Permutation: the values | all | identifiers | | |||
| | | exported are anonymised | | | | | | exported are anonymized | | | | |||
| | | by permutation. | | | | | | by permutation. | | | | |||
| | 6 | Structured Permutation: | addresses | | | | 6 | Structured Permutation: | addresses | | | |||
| | | the values exported are | | | | | | the values exported are | | | | |||
| | | anonymised by | | | | | | anonymized by | | | | |||
| | | permutation, preserving | | | | | | permutation, preserving | | | | |||
| | | bit-level structure as | | | | | | bit-level structure as | | | | |||
| | | appropriate; this | | | | | | appropriate; this | | | | |||
| | | represents | | | | | | represents | | | | |||
| | | prefix-preserving IP | | | | | | prefix-preserving IP | | | | |||
| | | address anonymisation or | | | | | | address anonymization or | | | | |||
| | | structured MAC address | | | | | | structured MAC address | | | | |||
| | | anonymisation. | | | | | | anonymization. | | | | |||
| | 7 | Reverse Truncation: the | addresses | | | | 7 | Reverse Truncation: the | addresses | | | |||
| | | values exported are | | | | | | values exported are | | | | |||
| | | anonymised using reverse | | | | | | anonymized using reverse | | | | |||
| | | truncation. The number | | | | | | truncation. The number | | | | |||
| | | of truncated bits is | | | | | | of truncated bits is | | | | |||
| | | implicit in the exported | | | | | | implicit in the exported | | | | |||
| | | data, and can be deduced | | | | | | data, and can be deduced | | | | |||
| | | by the Collecting | | | | | | by the Collecting | | | | |||
| | | Process. | | | | | | Process. | | | | |||
| | 8 | Noise: the values | non-identifiers | counters | | | 8 | Noise: the values | non-identifiers | counters | | |||
| | | exported are anonymised | | | | | | exported are anonymized | | | | |||
| | | by adding random noise to | | | | | | by adding random noise to | | | | |||
| | | each value. | | | | | | each value. | | | | |||
| | 9 | Offset: the values | all | timestamps | | | 9 | Offset: the values | all | timestamps | | |||
| | | exported are anonymised | | | | | | exported are anonymized | | | | |||
| | | by adding a single offset | | | | | | by adding a single offset | | | | |||
| | | to all values. | | | | | | to all values. | | | | |||
| +-------+---------------------------+-----------------+-------------+ | +-------+---------------------------+-----------------+-------------+ | |||
| Abstract Data Type: unsigned16 | Abstract Data Type: unsigned16 | |||
| Data Type Semantics: identifier | ||||
| ElementId: TBD2 | ElementId: TBD2 | |||
| Status: Proposed | Status: Current | |||
| 6.2.3. anonymisationFlags | 6.2.3. anonymizationFlags | |||
| Description: A flag word describing specialized modifications to | Description: A flag word describing specialized modifications to | |||
| the anonymisation policy in effect for the anonymisation technique | the anonymization policy in effect for the anonymization technique | |||
| applied to a referenced Information Element within a referenced | applied to a referenced Information Element within a referenced | |||
| Template. When flags are clear (0), the normal policy (as | Template. When flags are clear (0), the normal policy (as | |||
| described by anonymisationTechnique) applies without modification. | described by anonymizationTechnique) applies without modification. | |||
| MSB 14 13 12 11 10 9 8 7 6 5 4 3 2 1 LSB | MSB 14 13 12 11 10 9 8 7 6 5 4 3 2 1 LSB | |||
| +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ | +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ | |||
| | Reserved |LOR|PmA| SC | | | Reserved |LOR|PmA| SC | | |||
| +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ | +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ | |||
| anonymisationFlags IE | ||||
| anonymizationFlags IE | ||||
| +--------+----------+-----------------------------------------------+ | +--------+----------+-----------------------------------------------+ | |||
| | bit(s) | name | description | | | bit(s) | name | description | | |||
| | (LSB = | | | | | (LSB = | | | | |||
| | 0) | | | | | 0) | | | | |||
| +--------+----------+-----------------------------------------------+ | +--------+----------+-----------------------------------------------+ | |||
| | 0-1 | SC | Stability Class: see the Stability Class | | | 0-1 | SC | Stability Class: see the Stability Class | | |||
| | | table below, and Section 5.1. | | | | table below, and Section 5.1. | |||
| | 2 | PmA | Perimeter Anonymisation: when set (1), | | | 2 | PmA | Perimeter Anonymization: when set (1), | | |||
| | | | source- Information Elements as described in | | | | | source- Information Elements as described in | | |||
| | | | [RFC5103] are interpreted as external | | | | | [RFC5103] are interpreted as external | | |||
| | | | addresses, and destination- Information | | | | | addresses, and destination- Information | | |||
| | | | Elements as described in [RFC5103] are | | | | | Elements as described in [RFC5103] are | | |||
| | | | interpreted as internal addresses, for the | | | | | interpreted as internal addresses, for the | | |||
| | | | purposes of associating | | | | | purposes of associating | | |||
| | | | anonymisationTechnique to Information | | | | | anonymizationTechnique to Information | | |||
| | | | Elements only; see Section 7.2.2 for details. | | | | | Elements only; see Section 7.2.2 for details. | | |||
| | | | This bit MUST NOT be set when associated with | | | | | This bit MUST NOT be set when associated with | | |||
| | | a non-endpoint (i.e., neither source- nor | | | | a non-endpoint (i.e., neither source- nor | |||
| | | destination-) Information Element. SHOULD be | | | | destination-) Information Element. SHOULD be | |||
| | | | consistent within a record (i.e., if a | | | | | consistent within a record (i.e., if a | | |||
| | | | source- Information Element has this flag | | | | | source- Information Element has this flag | | |||
| | | | set, the corresponding destination- element | | | | | set, the corresponding destination- element | | |||
| | | | SHOULD have this flag set, and vice-versa.) | | | | | SHOULD have this flag set, and vice-versa.) | | |||
| | 3 | LOR | Low-Order Unchanged: when set (1), the | | | 3 | LOR | Low-Order Unchanged: when set (1), the | | |||
| | | | low-order bits of the anonymised Information | | | | | low-order bits of the anonymized Information | | |||
| | | | Element contain real data. This modification | | | | | Element contain real data. This modification | | |||
| | | | is intended for the anonymisation of | | | | | is intended for the anonymization of | | |||
| | | | network-level addresses while leaving | | | | | network-level addresses while leaving | | |||
| | | | host-level addresses intact in order to | | | | | host-level addresses intact in order to | | |||
| | | preserve host-level structure, which could | | | | preserve host-level structure, which could | |||
| | | | otherwise be used to reverse anonymisation. | | | | | otherwise be used to reverse anonymization. | | |||
| | | | MUST NOT be set when associated with a | | | | | MUST NOT be set when associated with a | | |||
| | | | truncation-based anonymisationTechnique. | | | | | truncation-based anonymizationTechnique. | | |||
| | 4-15 | Reserved | Reserved for future use: SHOULD be cleared | | | 4-15 | Reserved | Reserved for future use: SHOULD be cleared | | |||
| | | | (0) by the Exporting Process and MUST be | | | | | (0) by the Exporting Process and MUST be | | |||
| | | | ignored by the Collecting Process. | | | | | ignored by the Collecting Process. | | |||
| +--------+----------+-----------------------------------------------+ | +--------+----------+-----------------------------------------------+ | |||
| The Stability Class portion of this flags word describes the | The Stability Class portion of this flags word describes the | |||
| stability class of the anonymisation technique applied to a | stability class of the anonymization technique applied to a | |||
| referenced Information Element within a referenced Template. | referenced Information Element within a referenced Template. | |||
| Stability classes refer to the stability of the parameters of the | Stability classes refer to the stability of the parameters of the | |||
| anonymisation technique, and therefore the comparability of the | anonymization technique, and therefore the comparability of the | |||
| mapping between the real and anonymised values over time. This | mapping between the real and anonymized values over time. This | |||
| determines which anonymised datasets may be compared with each | determines which anonymized datasets may be compared with each | |||
| other. Values are as follows: | other. Values are as follows: | |||
| +-----+-----+-------------------------------------------------------+ | +-----+-----+-------------------------------------------------------+ | |||
| | Bit | Bit | Description | | | Bit | Bit | Description | | |||
| | 1 | 0 | | | | 1 | 0 | | | |||
| +-----+-----+-------------------------------------------------------+ | +-----+-----+-------------------------------------------------------+ | |||
| | 0 | 0 | Undefined: the Exporting Process makes no | | | 0 | 0 | Undefined: the Exporting Process makes no | | |||
| | | | representation as to how stable the mapping is, or | | | | | representation as to how stable the mapping is, or | | |||
| | | | over what time period values of this field will | | | | | over what time period values of this field will | | |||
| | | | remain comparable; while the Collecting Process MAY | | | | | remain comparable; while the Collecting Process MAY | | |||
| | | | assume Session level stability, Session level | | | | | assume Session level stability, Session level | | |||
| | | | stability is not guaranteed. Processes SHOULD assume | | | | | stability is not guaranteed. Processes SHOULD assume | | |||
| | | | this is the case in the absence of stability class | | | | | this is the case in the absence of stability class | | |||
| | | | information; this is the default stability class. | | | | | information; this is the default stability class. | | |||
| | 0 | 1 | Session: the Exporting Process will ensure that the | | | 0 | 1 | Session: the Exporting Process will ensure that the | | |||
| | | | parameters of the anonymisation technique are stable | | | | | parameters of the anonymization technique are stable | | |||
| | | | during the Transport Session. All the values of the | | | | | during the Transport Session. All the values of the | | |||
| | | | described Information Element for each Record | | | | | described Information Element for each Record | | |||
| | | | described by the referenced Template within the | | | | | described by the referenced Template within the | | |||
| | | | Transport Session are comparable. The Exporting | | | | | Transport Session are comparable. The Exporting | | |||
| | | | Process SHOULD endeavour to ensure at least this | | | | | Process SHOULD endeavour to ensure at least this | | |||
| | | | stability class. | | | | | stability class. | | |||
| | 1 | 0 | Exporter-Collector Pair: the Exporting Process will | | | 1 | 0 | Exporter-Collector Pair: the Exporting Process will | | |||
| | | | ensure that the parameters of the anonymisation | | | | | ensure that the parameters of the anonymization | | |||
| | | | technique are stable across Transport Sessions over | | | | | technique are stable across Transport Sessions over | | |||
| | | | time with the given Collecting Process, but may use | | | | | time with the given Collecting Process, but may use | | |||
| | | | different parameters for different Collecting | | | | | different parameters for different Collecting | | |||
| | | | Processes. Data exported to different Collecting | | | | | Processes. Data exported to different Collecting | | |||
| | | | Processes are not comparable. | | | | | Processes are not comparable. | | |||
| | 1 | 1 | Stable: the Exporting Process will ensure that the | | | 1 | 1 | Stable: the Exporting Process will ensure that the | | |||
| | | | parameters of the anonymisation technique are stable | | | | | parameters of the anonymization technique are stable | | |||
| | | | across Transport Sessions over time, regardless of | | | | | across Transport Sessions over time, regardless of | | |||
| | | | the Collecting Process to which it is sent. | | | | | the Collecting Process to which it is sent. | | |||
| +-----+-----+-------------------------------------------------------+ | +-----+-----+-------------------------------------------------------+ | |||
| Abstract Data Type: unsigned16 | Abstract Data Type: unsigned16 | |||
| Data Type Semantics: flags | ||||
| ElementId: TBD1 | ElementId: TBD1 | |||
| Status: Proposed | Status: Current | |||
| 7. Applying Anonymisation Techniques to IPFIX Export and Storage | 7. Applying Anonymization Techniques to IPFIX Export and Storage | |||
| When exporting or storing anonymised flow data using IPFIX, certain | When exporting or storing anonymized flow data using IPFIX, certain | |||
| interactions between the IPFIX Protocol and the anonymisation | interactions between the IPFIX Protocol and the anonymization | |||
| techniques in use must be considered; these are treated in the | techniques in use must be considered; these are treated in the | |||
| subsections below. | subsections below. | |||
| 7.1. Arrangement of Processes in IPFIX Anonymisation | 7.1. Arrangement of Processes in IPFIX Anonymization | |||
| Anonymisation may be applied to IPFIX data at three stages within the | Anonymization may be applied to IPFIX data at three stages within the | |||
| collection infrastructure: on initial export, at a mediator, or after | collection infrastructure: on initial export, at a mediator, or after | |||
| collection, as shown in Figure 1. Each of these locations has | collection, as shown in Figure 1. Each of these locations has | |||
| specific considerations and applicability. | specific considerations and applicability. | |||
| +==========================================+ | +==========================================+ | |||
| | Exporting Process | | | Exporting Process | | |||
| +==========================================+ | +==========================================+ | |||
| | | | | | | |||
| | (Anonymised at Original Exporter) | | | (Anonymized at Original Exporter) | | |||
| V | | V | | |||
| +=============================+ | | +=============================+ | | |||
| | Mediator | | | | Mediator | | | |||
| +=============================+ | | +=============================+ | | |||
| | | | | | | |||
| | (Anonymising Mediator) | | | (Anonymising Mediator) | | |||
| V V | V V | |||
| +==========================================+ | +==========================================+ | |||
| | Collecting Process | | | Collecting Process | | |||
| +==========================================+ | +==========================================+ | |||
| | | | | |||
| | (Anonymising CP/File Writer) | | (Anonymising CP/File Writer) | |||
| V | V | |||
| +--------------------+ | +--------------------+ | |||
| | IPFIX File Storage | | | IPFIX File Storage | | |||
| +--------------------+ | +--------------------+ | |||
| Figure 1: Potential Anonymisation Locations | Figure 1: Potential Anonymization Locations | |||
| Anonymisation is generally performed before the wider dissemination | Anonymization is generally performed before the wider dissemination | |||
| or repurposing of a flow data set, e.g., adapting operational | or repurposing of a flow data set, e.g., adapting operational | |||
| measurement data for research. Therefore, direct anonymisation of | measurement data for research. Therefore, direct anonymization of | |||
| flow data on initial export is only applicable in certain restricted | flow data on initial export is only applicable in certain restricted | |||
| circumstances: when the Exporting Process (EP) is "publishing" data | circumstances: when the Exporting Process (EP) is "publishing" data | |||
| to a Collecting Process (CP) directly, and the Exporting Process and | to a Collecting Process (CP) directly, and the Exporting Process and | |||
| Collecting Process are operated by different entities. Note that | Collecting Process are operated by different entities. Note that | |||
| certain guidelines in Section 7.2.3 with respect to timestamp | certain guidelines in Section 7.2.3 with respect to timestamp | |||
| anonymisation may not apply in this case, as the Collecting Process | anonymization may not apply in this case, as the Collecting Process | |||
| may be able to deduce certain timing information from the time at | may be able to deduce certain timing information from the time at | |||
| which each Message is received. | which each Message is received. | |||
| A much more flexible arrangement is to anonymise data within a | A much more flexible arrangement is to anonymize data within a | |||
| Mediator [I-D.ietf-ipfix-mediators-framework]. Here, original data | Mediator [I-D.ietf-ipfix-mediators-framework]. Here, original data | |||
| is sent to a Mediator, which performs the anonymisation function and | is sent to a Mediator, which performs the anonymization function and | |||
| re-exports the anonymised data. Such a Mediator could be located at | re-exports the anonymized data. Such a Mediator could be located at | |||
| the administrative domain boundary of the initial Exporting Process | the administrative domain boundary of the initial Exporting Process | |||
| operator, exporting anonymised data to other consumers outside the | operator, exporting anonymized data to other consumers outside the | |||
| organisation. In this case, the original Exporter SHOULD use TLS as | organization. In this case, the original Exporter SHOULD use TLS | |||
| specified in [RFC5101] to secure the channel to the Mediator, and the | [RFC5246] as specified in [RFC5101] to secure the channel to the | |||
| Mediator should follow the guidelines in Section 7.2, to mitigate the | Mediator, and the Mediator should follow the guidelines in | |||
| risk of original data disclosure. | Section 7.2, to mitigate the risk of original data disclosure. | |||
| When data is to be published as an anonymised data set in an IPFIX | When data is to be published as an anonymized data set in an IPFIX | |||
| File [RFC5655], the anonymisation may be done at the final Collecting | File [RFC5655], the anonymization may be done at the final Collecting | |||
| Process before storage and dissemination, as well. In this case, the | Process before storage and dissemination, as well. In this case, the | |||
| Collector should follow the guidelines in Section 7.2, especially as | Collector should follow the guidelines in Section 7.2, especially as | |||
| regards File-specific Options in Section 7.2.4. | regards File-specific Options in Section 7.2.4. | |||
| In each of these data flows, the anonymisation of records is | In each of these data flows, the anonymization of records is | |||
| undertaken by an Intermediate Anonymisation Process (IAP); the data | undertaken by an Intermediate Anonymization Process (IAP); the data | |||
| flows into and out of this IAP are shown in Figure 2 below. | flows into and out of this IAP are shown in Figure 2 below. | |||
| packets --+ +- IPFIX Messages -+ | packets --+ +- IPFIX Messages -+ | |||
| | | | | | | | | |||
| V V V | V V V | |||
| +==================+ +====================+ +=============+ | +==================+ +====================+ +=============+ | |||
| | Metering Process | | Collecting Process | | File Reader | | | Metering Process | | Collecting Process | | File Reader | | |||
| +==================+ +====================+ +=============+ | +==================+ +====================+ +=============+ | |||
| | Non-anonymised | Records | | | Non-anonymized | Records | | |||
| V V V | V V V | |||
| +=========================================================+ | +=========================================================+ | |||
| | Intermediate Anonymisation Process (IAP) | | | Intermediate Anonymization Process (IAP) | | |||
| +=========================================================+ | +=========================================================+ | |||
| | Anonymised ^ Anonymised | | | Anonymized ^ Anonymized | | |||
| | Records | Records | | | Records | Records | | |||
| V | V | V | V | |||
| +===================+ Anonymisation +=============+ | +===================+ Anonymization +=============+ | |||
| | Exporting Process |<--- Parameters ------>| File Writer | | | Exporting Process |<--- Parameters ------>| File Writer | | |||
| +===================+ +=============+ | +===================+ +=============+ | |||
| | | | | | | |||
| +------------> IPFIX Messages <----------+ | +------------> IPFIX Messages <----------+ | |||
| Figure 2: Data flows through the anonymisation process | Figure 2: Data flows through the anonymization process | |||
| Anonymisation parameters must also be available to the Exporting | Anonymization parameters must also be available to the Exporting | |||
| Process and/or File Writer in order to ensure header data is also | Process and/or File Writer in order to ensure header data is also | |||
| appropriately anonymised as in Section 7.2.3. | appropriately anonymized as in Section 7.2.3. | |||
| Following each of the data flows through the IAP, we describe five | Following each of the data flows through the IAP, we describe five | |||
| basic types of anonymisation arrangements within this framework in | basic types of anonymization arrangements within this framework in | |||
| Figure 3. In addition to the three arrangements described in detail | Figure 3. In addition to the three arrangements described in detail | |||
| above, anonymisation can also be done at a collocated Metering | above, anonymization can also be done at a collocated Metering | |||
| Process (MP) and File Writer (FW) (see section 7.3.2 of [RFC5655]), | Process (MP) and File Writer (FW) (see section 7.3.2 of [RFC5655]), | |||
| or at a file manipulator, which combines a File Writer with a File | or at a file manipulator, which combines a File Writer with a File | |||
| Reader (FR) (see section 7.3.7 of [RFC5655]). | Reader (FR) (see section 7.3.7 of [RFC5655]). | |||
| +----+ +-----+ +----+ | +----+ +-----+ +----+ | |||
| pkts -> | MP |->| IAP |->| EP |-> anonymisation on Original Exporter | pkts -> | MP |->| IAP |->| EP |-> anonymization on Original Exporter | |||
| +----+ +-----+ +----+ | +----+ +-----+ +----+ | |||
| +----+ +-----+ +----+ | +----+ +-----+ +----+ | |||
| pkts -> | MP |->| IAP |->| FW |-> Anonymising collocated MP/File Writer | pkts -> | MP |->| IAP |->| FW |-> Anonymising collocated MP/File Writer | |||
| +----+ +-----+ +----+ | +----+ +-----+ +----+ | |||
| +----+ +-----+ +----+ | +----+ +-----+ +----+ | |||
| IPFIX -> | CP |->| IAP |->| EP |-> Anonymising Mediator (Masq. Proxy) | IPFIX -> | CP |->| IAP |->| EP |-> Anonymising Mediator (Masq. Proxy) | |||
| +----+ +-----+ +----+ | +----+ +-----+ +----+ | |||
| +----+ +-----+ +----+ | +----+ +-----+ +----+ | |||
| IPFIX -> | CP |->| IAP |->| FW |-> Anonymising collocated CP/File Writer | IPFIX -> | CP |->| IAP |->| FW |-> Anonymising collocated CP/File Writer | |||
| +----+ +-----+ +----+ | +----+ +-----+ +----+ | |||
| +----+ +-----+ +----+ | +----+ +-----+ +----+ | |||
| IPFIX -> | FR |->| IAP |->| FW |-> Anonymising file manipulator | IPFIX -> | FR |->| IAP |->| FW |-> Anonymising file manipulator | |||
| File +----+ +-----+ +----+ | File +----+ +-----+ +----+ | |||
| Figure 3: Possible anonymisation arrangements in the IPFIX | Figure 3: Possible anonymization arrangements in the IPFIX | |||
| architecture | architecture | |||
| Note that anonymisation may occur at more than one location within a | Note that anonymization may occur at more than one location within a | |||
| given collection infrastructure, to provide varying levels of | given collection infrastructure, to provide varying levels of | |||
| anonymisation, disclosure risk, or data utility for specific | anonymization, disclosure risk, or data utility for specific | |||
| purposes. | purposes. | |||
| 7.2. IPFIX-Specific Anonymisation Guidelines | 7.2. IPFIX-Specific Anonymization Guidelines | |||
| In implementing and deploying the anonymisation techniques described | In implementing and deploying the anonymization techniques described | |||
| in this document, implementors should note that IPFIX already | in this document, implementors should note that IPFIX already | |||
| provides features that support anonymised data export, and use these | provides features that support anonymized data export, and use these | |||
| where appropriate. Care must also be taken that data structures | where appropriate. Care must also be taken that data structures | |||
| supporting the operation of the protocol itself do not leak data that | supporting the operation of the protocol itself do not leak data that | |||
| could be used to reverse the anonymisation applied to the flow data. | could be used to reverse the anonymization applied to the flow data. | |||
| Such data structures may appear in the header, or within the data | Such data structures may appear in the header, or within the data | |||
| stream itself, especially as options data. Each of these and their | stream itself, especially as options data. Each of these and their | |||
| impact on specific anonymisation techniques is noted in a separate | impact on specific anonymization techniques is noted in a separate | |||
| subsection below. | subsection below. | |||
| 7.2.1. Appropriate Use of Information Elements for Anonymised Data | 7.2.1. Appropriate Use of Information Elements for Anonymized Data | |||
| Note, as in Section 6 above, that black-marker anonymised fields | Note, as in Section 6 above, that black-marker anonymized fields | |||
| SHOULD NOT be exported at all; the absence of the field in a given | SHOULD NOT be exported at all; the absence of the field in a given | |||
| Data Set is implicitly declared by not including the corresponding | Data Set is implicitly declared by not including the corresponding | |||
| Information Element in the Template describing that Data Set. | Information Element in the Template describing that Data Set. | |||
| When using precision degradation of timestamps, Exporting Processes | When using precision degradation of timestamps, Exporting Processes | |||
| SHOULD export timing information using Information Elements of an | SHOULD export timing information using Information Elements of an | |||
| appropriate precision, as explained in Section 4.5 of [RFC5153]. For | appropriate precision, as explained in Section 4.5 of [RFC5153]. For | |||
| example, timestamps measured in millisecond-level precision and | example, timestamps measured in millisecond-level precision and | |||
| degraded to second-level precision should use flowStartSeconds and | degraded to second-level precision should use flowStartSeconds and | |||
| flowEndSeconds, not flowStartMilliseconds and flowEndMilliseconds. | flowEndSeconds, not flowStartMilliseconds and flowEndMilliseconds. | |||
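
The timestamp guidance above can be illustrated with a minimal Python sketch. It assumes a hypothetical in-memory record represented as a dict keyed by Information Element name; the function name and record layout are illustrative and not part of IPFIX or of this draft.

```python
# Hypothetical mapping from millisecond-precision Information Elements to
# their second-precision counterparts.
MS_TO_S_NAMES = {
    "flowStartMilliseconds": "flowStartSeconds",
    "flowEndMilliseconds": "flowEndSeconds",
}


def degrade_to_seconds(record):
    """Return a copy of record with millisecond timestamps truncated to
    seconds and re-labelled with the matching second-precision element."""
    out = {}
    for name, value in record.items():
        if name in MS_TO_S_NAMES:
            # Integer division drops the sub-second part of the timestamp.
            out[MS_TO_S_NAMES[name]] = value // 1000
        else:
            out[name] = value
    return out
```

Re-labelling the field, rather than exporting a truncated value under the millisecond element, lets the Template itself declare the precision that is actually present.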
| When exporting anonymised data and anonymisation metadata, Exporting | When exporting anonymized data and anonymization metadata, Exporting | |||
| Processes SHOULD ensure that the combination of Information Element | Processes SHOULD ensure that the combination of Information Element | |||
| and declared anonymisation technique are compatible. Specifically, | and declared anonymization technique are compatible. Specifically, | |||
| the applicable and recommended Information Element types and | the applicable and recommended Information Element types and | |||
| semantics for each technique are noted in the description of the | semantics for each technique are noted in the description of the | |||
| anonymisationTechnique Information Element in Section 6.2.2. In this | anonymizationTechnique Information Element in Section 6.2.2. In this | |||
| description, a timestamp is an Information Element with the data type | description, a timestamp is an Information Element with the data type | |||
| dateTimeSeconds, dateTimeMilliseconds, dateTimeMicroseconds, or | dateTimeSeconds, dateTimeMilliseconds, dateTimeMicroseconds, or | |||
| dateTimeNanoseconds; an address is an Information Element with the | dateTimeNanoseconds; an address is an Information Element with the | |||
| data type ipv4Address, ipv6Address, or macAddress; and an identifier | data type ipv4Address, ipv6Address, or macAddress; and an identifier | |||
| is an Information Element with identifier data type semantics. | is an Information Element with identifier data type semantics. | |||
| Exporting Processes MUST NOT export Anonymisation Options records | Exporting Processes MUST NOT export Anonymization Options records | |||
| binding techniques to Information Elements to which they are not | binding techniques to Information Elements to which they are not | |||
| applicable, and SHOULD NOT export Anonymisation Options records | applicable, and SHOULD NOT export Anonymization Options records | |||
| binding techniques to Information Elements for which they are not | binding techniques to Information Elements for which they are not | |||
| recommended. | recommended. | |||
| 7.2.2. Export of Perimeter-Based Anonymisation Policies | 7.2.2. Export of Perimeter-Based Anonymization Policies | |||
| Data collected from a single network may require different | Data collected from a single network may require different | |||
| anonymisation policies for addresses internal and external to the | anonymization policies for addresses internal and external to the | |||
| network. For example, internal addresses could be subject to simple | network. For example, internal addresses could be subject to simple | |||
| permutation, while external addresses could be aggregated into | permutation, while external addresses could be aggregated into | |||
| networks by truncation. When exporting anonymised perimeter | networks by truncation. When exporting anonymized perimeter | |||
| bidirectional flow (biflow) data as in section 5.2 of [RFC5103], this | bidirectional flow (biflow) data as in section 5.2 of [RFC5103], this | |||
| arrangement may be easily represented by specifying one technique for | arrangement may be easily represented by specifying one technique for | |||
| source endpoint information (which represents the external endpoint | source endpoint information (which represents the external endpoint | |||
| in a perimeter biflow) and one technique for destination endpoint | in a perimeter biflow) and one technique for destination endpoint | |||
| information (which represents the internal address in a perimeter | information (which represents the internal address in a perimeter | |||
| biflow). | biflow). | |||
| However, it can also be useful to represent perimeter-based | However, it can also be useful to represent perimeter-based | |||
| anonymisation policies with unidirectional flow (uniflow), or non- | anonymization policies with unidirectional flow (uniflow), or non- | |||
| perimeter biflow data. In this case, the Perimeter Anonymisation bit | perimeter biflow data. In this case, the Perimeter Anonymization bit | |||
| (bit 2) in the anonymisationFlags Information Element describing the | (bit 2) in the anonymizationFlags Information Element describing the | |||
| anonymised address Information Elements can be set to change the | anonymized address Information Elements can be set to change the | |||
| meaning of "source" and "destination" of Information Elements to mean | meaning of "source" and "destination" of Information Elements to mean | |||
| "external" and "internal" as with perimeter biflows, but only with | "external" and "internal" as with perimeter biflows, but only with | |||
| respect to anonymisation policies. | respect to anonymization policies. | |||
| 7.2.3. Anonymisation of Header Data | 7.2.3. Anonymization of Header Data | |||
| Each IPFIX Message contains a Message Header; within this Message | Each IPFIX Message contains a Message Header; within this Message | |||
| Header are contained two fields which may be used to break certain | Header are contained two fields which may be used to break certain | |||
| anonymisation techniques: the Export Time, and the Observation Domain | anonymization techniques: the Export Time, and the Observation Domain | |||
| ID. | ID. | |||
| Export of IPFIX Messages containing anonymised timestamp data where | Export of IPFIX Messages containing anonymized timestamp data where | |||
| the original Export Time Message header has some relationship to the | the original Export Time Message header has some relationship to the | |||
| anonymised timestamps SHOULD anonymise the Export Time header field | anonymized timestamps SHOULD anonymize the Export Time header field | |||
| so that the Export Time is consistent with the anonymised timestamp | so that the Export Time is consistent with the anonymized timestamp | |||
| data. Otherwise, relationships between export and flow time could be | data. Otherwise, relationships between export and flow time could be | |||
| used to partially or totally reverse timestamp anonymisation. When | used to partially or totally reverse timestamp anonymization. When | |||
| anonymising timestamps and the Export Time header field, implementations SHOULD avoid | anonymising timestamps and the Export Time header field, implementations SHOULD avoid | |||
| times too far in the past or future; while [RFC5101] does not make | times too far in the past or future; while [RFC5101] does not make | |||
| any allowance for Export Time error detection, it is sensible that | any allowance for Export Time error detection, it is sensible that | |||
| Collecting Processes may interpret Messages with seemingly | Collecting Processes may interpret Messages with seemingly | |||
| nonsensical Export Times as erroneous. Specific limits are | nonsensical Export Times as erroneous. Specific limits are | |||
| implementation-dependent, but such Export Times may cause interoperability | implementation-dependent, but such Export Times may cause interoperability | |||
| problems when anonymising the Export Time header field. | problems when anonymising the Export Time header field. | |||
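
As a rough illustration of keeping the Export Time consistent with anonymized timestamps, the Python sketch below rewrites the Export Time field of a single IPFIX Message. It assumes, purely for illustration, that flow timestamps were anonymized by adding a constant offset in seconds; the function name and that offset-based scheme are assumptions, not something this draft prescribes. The 16-octet Message Header layout (Version, Length, Export Time, Sequence Number, Observation Domain ID) follows [RFC5101].

```python
import struct

# IPFIX Message Header: version, length, export time, sequence number,
# observation domain ID (all in network byte order).
IPFIX_HEADER = struct.Struct("!HHIII")


def shift_export_time(message: bytes, offset_seconds: int) -> bytes:
    """Return a copy of message whose Export Time is shifted by the same
    offset that was (hypothetically) applied to the flow timestamps."""
    version, length, export_time, seq, odid = IPFIX_HEADER.unpack_from(message)
    if version != 10:
        raise ValueError("not an IPFIX Message")
    new_time = export_time + offset_seconds
    # Export Time is an unsigned 32-bit count of seconds since the epoch.
    if not 0 <= new_time <= 0xFFFFFFFF:
        raise ValueError("shifted Export Time does not fit in 32 bits")
    header = IPFIX_HEADER.pack(version, length, new_time, seq, odid)
    return header + message[IPFIX_HEADER.size:]
```

A real implementation would also need to bound the offset so that the resulting Export Time is not implausibly far in the past or future.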
| The similarity in size between an Observation Domain ID and an IPv4 | The similarity in size between an Observation Domain ID and an IPv4 | |||
| address (32 bits) may lead to a temptation to use an IPv4 interface | address (32 bits) may lead to a temptation to use an IPv4 interface | |||
| address on the Metering or Exporting Process as the Observation | address on the Metering or Exporting Process as the Observation | |||
| Domain ID. If this address bears some relation to the IP addresses | Domain ID. If this address bears some relation to the IP addresses | |||
| in the flow data (e.g., shares a network prefix with internal | in the flow data (e.g., shares a network prefix with internal | |||
| addresses) and the IP addresses in the flow data are anonymised in a | addresses) and the IP addresses in the flow data are anonymized in a | |||
| structure-preserving way, then the Observation Domain ID may be used | structure-preserving way, then the Observation Domain ID may be used | |||
| to break the IP address anonymisation. Use of an IPv4 interface | to break the IP address anonymization. Use of an IPv4 interface | |||
| address on the Metering or Exporting Process as the Observation | address on the Metering or Exporting Process as the Observation | |||
| Domain ID is NOT RECOMMENDED in this case. | Domain ID is NOT RECOMMENDED in this case. | |||
| 7.2.4. Anonymisation of Options Data | 7.2.4. Anonymization of Options Data | |||
| IPFIX uses the Options mechanism to export, among other things, | IPFIX uses the Options mechanism to export, among other things, | |||
| metadata about exported flows and the flow collection infrastructure. | metadata about exported flows and the flow collection infrastructure. | |||
| As with the IPFIX Message Header, certain Options recommended in | As with the IPFIX Message Header, certain Options recommended in | |||
| [RFC5101] and [RFC5655] containing flow timestamps and network | [RFC5101] and [RFC5655] containing flow timestamps and network | |||
| addresses of Exporting and Collecting Processes may be used to break | addresses of Exporting and Collecting Processes may be used to break | |||
| certain anonymisation techniques. When using these Options along with | certain anonymization techniques. When using these Options along with | |||
| anonymised data export and storage, values within the Options which | anonymized data export and storage, values within the Options which | |||
| could be used to break the anonymisation SHOULD themselves be | could be used to break the anonymization SHOULD themselves be | |||
| anonymised or omitted. | anonymized or omitted. | |||
| The Exporting Process Reliability Statistics Options Template, | The Exporting Process Reliability Statistics Options Template, | |||
| recommended in [RFC5101], contains an Exporting Process ID field, | recommended in [RFC5101], contains an Exporting Process ID field, | |||
| which may be an exportingProcessIPv4Address Information Element or an | which may be an exportingProcessIPv4Address Information Element or an | |||
| exportingProcessIPv6Address Information Element. If the Exporting | exportingProcessIPv6Address Information Element. If the Exporting | |||
| Process address bears some relation to the IP addresses in the flow | Process address bears some relation to the IP addresses in the flow | |||
| data (e.g., shares a network prefix with internal addresses) and the | data (e.g., shares a network prefix with internal addresses) and the | |||
| IP addresses in the flow data are anonymised in a structure- | IP addresses in the flow data are anonymized in a structure- | |||
| preserving way, then the Exporting Process address may be used to | preserving way, then the Exporting Process address may be used to | |||
| break the IP address anonymisation. Exporting Processes exporting | break the IP address anonymization. Exporting Processes exporting | |||
| anonymised data in this situation SHOULD mitigate the risk of attack | anonymized data in this situation SHOULD mitigate the risk of attack | |||
| either by omitting Options described by the Exporting Process | either by omitting Options described by the Exporting Process | |||
| Reliability Statistics Options Template, or by anonymising the | Reliability Statistics Options Template, or by anonymising the | |||
| Exporting Process address using a similar technique to that used to | Exporting Process address using a similar technique to that used to | |||
| anonymise the IP addresses in the exported data. | anonymize the IP addresses in the exported data. | |||
| Similarly, the Export Session Details Options Template and Message | Similarly, the Export Session Details Options Template and Message | |||
| Details Options Template specified for the IPFIX File Format | Details Options Template specified for the IPFIX File Format | |||
| [RFC5655] may contain the exportingProcessIPv4Address Information | [RFC5655] may contain the exportingProcessIPv4Address Information | |||
| Element or the exportingProcessIPv6Address Information Element to | Element or the exportingProcessIPv6Address Information Element to | |||
| identify an Exporting Process from which a flow record was received, | identify an Exporting Process from which a flow record was received, | |||
| and the collectingProcessIPv4Address Information Element or the | and the collectingProcessIPv4Address Information Element or the | |||
| collectingProcessIPv6Address Information Element to identify the | collectingProcessIPv6Address Information Element to identify the | |||
| Collecting Process which received it. If the Exporting Process or | Collecting Process which received it. If the Exporting Process or | |||
| Collecting Process address bears some relation to the IP addresses in | Collecting Process address bears some relation to the IP addresses in | |||
| the data set (e.g., shares a network prefix with internal addresses) | the data set (e.g., shares a network prefix with internal addresses) | |||
| and the IP addresses in the data set are anonymised in a structure- | and the IP addresses in the data set are anonymized in a structure- | |||
| preserving way, then the Exporting Process or Collecting Process | preserving way, then the Exporting Process or Collecting Process | |||
| address may be used to break the IP address anonymisation. Since | address may be used to break the IP address anonymization. Since | |||
| these Options Templates are primarily intended for storing IPFIX | these Options Templates are primarily intended for storing IPFIX | |||
| Transport Session data for auditing, replay, and testing purposes, it | Transport Session data for auditing, replay, and testing purposes, it | |||
| is NOT RECOMMENDED that storage of anonymised data include these | is NOT RECOMMENDED that storage of anonymized data include these | |||
| Options Templates in order to mitigate the risk of attack. | Options Templates in order to mitigate the risk of attack. | |||
| The Message Details Options Template specified for the IPFIX File | The Message Details Options Template specified for the IPFIX File | |||
| Format [RFC5655] also contains the collectionTimeMilliseconds | Format [RFC5655] also contains the collectionTimeMilliseconds | |||
| Information Element. As with the Export Time Message Header field, | Information Element. As with the Export Time Message Header field, | |||
| if the exported data set contains anonymised timestamp information, | if the exported data set contains anonymized timestamp information, | |||
| and the collectionTimeMilliseconds Information Element in a given | and the collectionTimeMilliseconds Information Element in a given | |||
| Message has some relationship to the anonymised timestamp | Message has some relationship to the anonymized timestamp | |||
| information, then this relationship can be exploited to reverse the | information, then this relationship can be exploited to reverse the | |||
| timestamp anonymisation. Since this Options Template is primarily | timestamp anonymization. Since this Options Template is primarily | |||
| intended for storing IPFIX Transport Session data for auditing, | intended for storing IPFIX Transport Session data for auditing, | |||
| replay, and testing purposes, it is NOT RECOMMENDED that storage of | replay, and testing purposes, it is NOT RECOMMENDED that storage of | |||
| anonymised data include this Options Template in order to mitigate | anonymized data include this Options Template in order to mitigate | |||
| the risk of attack. | the risk of attack. | |||
| Since the Time Window Options Template specified for the IPFIX File | Since the Time Window Options Template specified for the IPFIX File | |||
| Format [RFC5655] refers to the timestamps within the data set to | Format [RFC5655] refers to the timestamps within the data set to | |||
| provide partial table of contents information for an IPFIX File, | provide partial table of contents information for an IPFIX File, | |||
| Options described by this template SHOULD be written using the | Options described by this template SHOULD be written using the | |||
| anonymised timestamps instead of the original ones. | anonymized timestamps instead of the original ones. | |||
| 7.2.5. Special-Use Address Space Considerations | 7.2.5. Special-Use Address Space Considerations | |||
| When anonymising data for transport or storage using IPFIX containing | When anonymising data for transport or storage using IPFIX containing | |||
| anonymised IP addresses, and the analysis purpose permits doing so, | anonymized IP addresses, and the analysis purpose permits doing so, | |||
| it is RECOMMENDED to filter out or leave unanonymised data containing | it is RECOMMENDED to filter out or leave unanonymized data containing | |||
| the special-use IPv4 addresses enumerated in [RFC5735] or the | the special-use IPv4 addresses enumerated in [RFC5735] or the | |||
| special-use IPv6 addresses enumerated in [RFC5156]. Data containing | special-use IPv6 addresses enumerated in [RFC5156]. Data containing | |||
| these addresses (e.g. 0.0.0.0 and 169.254.0.0/16 for link-local | these addresses (e.g. 0.0.0.0 and 169.254.0.0/16 for link-local | |||
| autoconfiguration in IPv4 space) are often associated with specific, | autoconfiguration in IPv4 space) are often associated with specific, | |||
| well-known behavioral patterns. Detection of these patterns in | well-known behavioral patterns. Detection of these patterns in | |||
| anonymised data can lead to deanonymisation of these special-use | anonymized data can lead to deanonymization of these special-use | |||
| addresses, which increases the chance of a complete reversal of | addresses, which increases the chance of a complete reversal of | |||
| anonymisation by an attacker, especially of prefix-preserving | anonymization by an attacker, especially of prefix-preserving | |||
| techniques. | techniques. | |||
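
One way to apply this recommendation is to separate records that carry special-use addresses before anonymization, so they can be filtered out or passed through unchanged. The Python sketch below is illustrative only: the block list is a partial subset of the IPv4 ranges in [RFC5735], and the record layout (a dict with "sourceIPv4Address" and "destinationIPv4Address" string values) and function names are assumptions.

```python
import ipaddress

# Partial, illustrative subset of special-use IPv4 blocks; consult [RFC5735]
# for the complete list before relying on this.
SPECIAL_USE_V4 = [
    ipaddress.ip_network(block)
    for block in (
        "0.0.0.0/8",
        "10.0.0.0/8",
        "127.0.0.0/8",
        "169.254.0.0/16",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "224.0.0.0/4",
        "240.0.0.0/4",
    )
]


def is_special_use(addr: str) -> bool:
    """True if addr falls within one of the listed special-use blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in SPECIAL_USE_V4)


def partition_records(records):
    """Split records into (ordinary, special) lists; special records carry a
    special-use source or destination address."""
    ordinary, special = [], []
    for rec in records:
        if is_special_use(rec["sourceIPv4Address"]) or is_special_use(
            rec["destinationIPv4Address"]
        ):
            special.append(rec)
        else:
            ordinary.append(rec)
    return ordinary, special
```

Whether the special records are dropped or exported unanonymized depends on what the analysis purpose permits.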
| 7.2.6. Protecting Out-of-Band Configuration and Management Data | 7.2.6. Protecting Out-of-Band Configuration and Management Data | |||
| Special care should be taken when exporting or sharing anonymised | Special care should be taken when exporting or sharing anonymized | |||
| data to avoid information leakage via the configuration or management | data to avoid information leakage via the configuration or management | |||
| planes of the IPFIX Device containing the Exporting Process or the | planes of the IPFIX Device containing the Exporting Process or the | |||
| File Writer. For example, adding noise to counters is useless if the | File Writer. For example, adding noise to counters is useless if the | |||
| receiver can deduce the values in the counters from SNMP information, | receiver can deduce the values in the counters from SNMP information, | |||
| and concealing the network under test is similarly useless if such | and concealing the network under test is similarly useless if such | |||
| information is available in a configuration document. As the | information is available in a configuration document. As the | |||
| specifics of these concerns are largely implementation- and | specifics of these concerns are largely implementation- and | |||
| deployment-dependent, specific mitigation is out of scope for this | deployment-dependent, specific mitigation is out of scope for this | |||
| draft. The general ground rule is that information of similar type | draft. The general ground rule is that information of similar type | |||
| to that anonymised SHOULD NOT be made available to the receiver by | to that anonymized SHOULD NOT be made available to the receiver by | |||
| any means, whether in the Data Records, in IPFIX protocol structures | any means, whether in the Data Records, in IPFIX protocol structures | |||
| such as Message Headers, or out-of-band. | such as Message Headers, or out-of-band. | |||
| 8. Examples | 8. Examples | |||
| In this example, consider the export or storage of an anonymised IPv4 | In this example, consider the export or storage of an anonymized IPv4 | |||
| data set from a single network described by a simple template | data set from a single network described by a simple template | |||
| containing a timestamp in seconds, a five-tuple, and packet and octet | containing a timestamp in seconds, a five-tuple, and packet and octet | |||
| counters. The template describing each record in this data set is | counters. The template describing each record in this data set is | |||
| shown in Figure 4. | shown in Figure 4. | |||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Set ID = 2 | Length = 40 | | | Set ID = 2 | Length = 40 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| skipping to change at page 33, line 31 ¶ | skipping to change at page 35, line 31 ¶ | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| |0| packetDeltaCount 2 | Field Length = 4 | | |0| packetDeltaCount 2 | Field Length = 4 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| |0| octetDeltaCount 1 | Field Length = 4 | | |0| octetDeltaCount 1 | Field Length = 4 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| |0| protocolIdentifier 4 | Field Length = 1 | | |0| protocolIdentifier 4 | Field Length = 1 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Figure 4: Example Flow Template | Figure 4: Example Flow Template | |||
| Suppose that this data set is anonymised according to the following | Suppose that this data set is anonymized according to the following | |||
| policy: | policy: | |||
| o IP addresses within the network are protected by reverse | o IP addresses within the network are protected by reverse | |||
| truncation. | truncation. | |||
| o IP addresses outside the network are protected by prefix- | o IP addresses outside the network are protected by prefix- | |||
| preserving anonymisation. | preserving anonymization. | |||
| o Octet counts are exported using degraded precision in order to | o Octet counts are exported using degraded precision in order to | |||
| provide minimal protection against fingerprinting attacks. | provide minimal protection against fingerprinting attacks. | |||
| o All other fields are exported unanonymised. | o All other fields are exported unanonymized. | |||
| In order to export anonymisation records for this template and | In order to export anonymization records for this template and | |||
| policy, first, the Anonymisation Options Template shown in | policy, first, the Anonymization Options Template shown in | |||
| Figure 5 is exported. For this example, the optional | Figure 5 is exported. For this example, the optional | |||
| privateEnterpriseNumber and informationElementIndex Information | privateEnterpriseNumber and informationElementIndex Information | |||
| Elements are omitted, because they are not used. | Elements are omitted, because they are not used. | |||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Set ID = 3 | Length = 26 | | | Set ID = 3 | Length = 26 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Template ID = 257 | Field Count = 4 | | | Template ID = 257 | Field Count = 4 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Scope Field Count = 2 |0| templateID 145 | | | Scope Field Count = 2 |0| templateID 145 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Field Length = 2 |0| informationElementId 303 | | | Field Length = 2 |0| informationElementId 303 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Field Length = 2 |0| anonymisationFlags TBD1 | | | Field Length = 2 |0| anonymizationFlags TBD1 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Field Length = 2 |0| anonymisationTechnique TBD2 | | | Field Length = 2 |0| anonymizationTechnique TBD2 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Field Length = 2 | | | Field Length = 2 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Figure 5: Example Anonymisation Options Template | Figure 5: Example Anonymization Options Template | |||
| Following the Anonymisation Options Template comes a Data Set | Following the Anonymization Options Template comes a Data Set | |||
| containing Anonymisation Records. This data set has an entry for | containing Anonymization Records. This data set has an entry for | |||
| each Information Element Specifier in Template 256 describing the | each Information Element Specifier in Template 256 describing the | |||
| flow records. This Data Set is shown in Figure 6. Note that | flow records. This Data Set is shown in Figure 6. Note that | |||
| sourceIPv4Address and destinationIPv4Address have the Perimeter | sourceIPv4Address and destinationIPv4Address have the Perimeter | |||
| Anonymisation (0x0004) flag set in anonymisationFlags, meaning that | Anonymization (0x0004) flag set in anonymizationFlags, meaning that | |||
| the source address should be treated as network-external, and the | the source address should be treated as network-external, and the | |||
| destination address as network-internal. | destination address as network-internal. | |||
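
The record layout implied by the Options Template in Figure 5 (four 2-octet fields: templateId, informationElementId, flags, technique) can be sketched in Python as follows. The three records use the values given in this example for flowStartSeconds, sourceIPv4Address, and destinationIPv4Address; the constant and function names are illustrative assumptions, and the Set Header and remaining records are deliberately not encoded.

```python
import struct

# One Anonymization Record: templateId, informationElementId,
# anonymization flags, anonymization technique (2 octets each).
RECORD = struct.Struct("!HHHH")

PERIMETER = 0x0004  # Perimeter Anonymization bit (bit 2)

# (templateId, informationElementId, flags, technique)
EXAMPLE_RECORDS = [
    (256, 150, 0x0000, 1),  # flowStartSeconds: no flags, Not Anonymized
    (256, 8, 0x0005, 6),    # sourceIPv4Address: Perimeter, Structured Permutation
    (256, 12, 0x0007, 7),   # destinationIPv4Address: Perimeter, Reverse Truncation
]


def encode_records(records) -> bytes:
    """Pack records back to back, as they would appear inside a Data Set."""
    return b"".join(RECORD.pack(*rec) for rec in records)


def has_perimeter(flags: int) -> bool:
    """True if the Perimeter Anonymization bit is set in flags."""
    return bool(flags & PERIMETER)
```

Each record occupies 8 octets, so eight such records (one per field of the flow Template) plus a 4-octet Set Header would account for the Set Length of 68 in this example.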
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Set ID = 257 | Length = 68 | | | Set ID = 257 | Length = 68 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Template 256 | flowStartSeconds IE 150 | | | Template 256 | flowStartSeconds IE 150 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | no flags 0x0000 | Not Anonymised 1 | | | no flags 0x0000 | Not Anonymized 1 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Template 256 | sourceIPv4Address IE 8 | | | Template 256 | sourceIPv4Address IE 8 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Perimeter, Session SC 0x0005 | Structured Permutation 6 | | | Perimeter, Session SC 0x0005 | Structured Permutation 6 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Template 256 | destinationIPv4Address IE 12 | | | Template 256 | destinationIPv4Address IE 12 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Perimeter, Stable 0x0007 | Reverse Truncation 7 | | | Perimeter, Stable 0x0007 | Reverse Truncation 7 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Template 256 | sourceTransportPort IE 7 | | | Template 256 | sourceTransportPort IE 7 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | no flags 0x0000 | Not Anonymised 1 | | | no flags 0x0000 | Not Anonymized 1 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Template 256 | dest.TransportPort IE 11 | | | Template 256 | dest.TransportPort IE 11 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | no flags 0x0000 | Not Anonymised 1 | | | no flags 0x0000 | Not Anonymized 1 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Template 256 | packetDeltaCount IE 2 | | | Template 256 | packetDeltaCount IE 2 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | no flags 0x0000 | Not Anonymised 1 | | | no flags 0x0000 | Not Anonymized 1 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Template 256 | octetDeltaCount IE 1 | | | Template 256 | octetDeltaCount IE 1 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Stable 0x0003 | Precision Degradation 2 | | | Stable 0x0003 | Precision Degradation 2 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Template 256 | protocolIdentifier IE 4 | | | Template 256 | protocolIdentifier IE 4 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | no flags 0x0000 | Not Anonymised 1 | | | no flags 0x0000 | Not Anonymized 1 | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Figure 6: Example Anonymisation Records | Figure 6: Example Anonymization Records | |||
| Following the Anonymisation Records come the data sets containing the | Following the Anonymization Records come the data sets containing the | |||
| anonymised data, exported according to the template in | anonymized data, exported according to the template in | |||
| Figure 4. Bringing it all together, consider an IPFIX Message | Figure 4. Bringing it all together, consider an IPFIX Message | |||
| containing three real data records and the necessary templates to | containing three real data records and the necessary templates to | |||
| export them, shown in Figure 7. (Note that the scale of this message | export them, shown in Figure 7. (Note that the scale of this message | |||
| is 8 bytes per line, for compactness; lines of dots '. . . . . ' | is 8 bytes per line, for compactness; lines of dots '. . . . . ' | |||
| represent shifting of the example bit structure for clarity.) | represent shifting of the example bit structure for clarity.) | |||
| 1 2 3 4 5 6 | 1 2 3 4 5 6 | |||
| 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 | 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | 0x000a | length 135 | export time 1271227717 | msg | | 0x000a | length 135 | export time 1271227717 | msg | |||
| | sequence 0 | domain 1 | hdr | | sequence 0 | domain 1 | hdr | |||
| skipping to change at page 36, line 30 ¶ | skipping to change at page 38, line 30 ¶ | |||
| | packets 60 | bytes 2896 | | | packets 60 | bytes 2896 | | |||
| | prt 6 | . . . . . . . . . . . . . . . . . . . . . . . . . . . | | prt 6 | . . . . . . . . . . . . . . . . . . . . . . . . . . . | |||
| | time 1271227683 | sip 198.51.100.7 | | | time 1271227683 | sip 198.51.100.7 | | |||
| | dip 203.0.113.9 | sp 5092 | dp 80 | | | dip 203.0.113.9 | sp 5092 | dp 80 | | |||
| | packets 44 | bytes 2037 | | | packets 44 | bytes 2037 | | |||
| | prt 6 | | | prt 6 | | |||
| +---------+ | +---------+ | |||
| Figure 7: Example Real Message | Figure 7: Example Real Message | |||
| The corresponding anonymised message is then shown in Figure 8. The | The corresponding anonymized message is then shown in Figure 8. The | |||
| options template set describing Anonymisation Records and the | options template set describing Anonymization Records and the | |||
| Anonymisation Records themselves are added; IP addresses and byte | Anonymization Records themselves are added; IP addresses and byte | |||
| counts are anonymised as declared. | counts are anonymized as declared. | |||
| 1 2 3 4 5 6 | 1 2 3 4 5 6 | |||
| 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 | 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 4 6 8 0 2 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | 0x000a | length 233 | export time 1271227717 | msg | | 0x000a | length 233 | export time 1271227717 | msg | |||
| | sequence 0 | domain 1 | hdr | | sequence 0 | domain 1 | hdr | |||
| | SetID 2 | length 40 | tid 256 | fields 8 | tmpl | | SetID 2 | length 40 | tid 256 | fields 8 | tmpl | |||
| | IE 150 | length 4 | IE 8 | length 4 | set | | IE 150 | length 4 | IE 8 | length 4 | set | |||
| | IE 12 | length 4 | IE 7 | length 2 | | | IE 12 | length 4 | IE 7 | length 2 | | |||
| | IE 11 | length 2 | IE 2 | length 4 | | | IE 11 | length 2 | IE 2 | length 4 | | |||
| skipping to change at page 37, line 42 ¶ | skipping to change at page 39, line 42 ¶ | |||
| | time 1271227682 | sip 0.0.0.7 | | | time 1271227682 | sip 0.0.0.7 | | |||
| | dip 254.202.119.6 | sp 5091 | dp 80 | | | dip 254.202.119.6 | sp 5091 | dp 80 | | |||
| | packets 60 | bytes 2900 | | | packets 60 | bytes 2900 | | |||
| | prt 6 | . . . . . . . . . . . . . . . . . . . . . . . . . . . | | prt 6 | . . . . . . . . . . . . . . . . . . . . . . . . . . . | |||
| | time 1271227683 | sip 0.0.0.7 | | | time 1271227683 | sip 0.0.0.7 | | |||
| | dip 2.19.199.176 | sp 5092 | dp 80 | | | dip 2.19.199.176 | sp 5092 | dp 80 | | |||
| | packets 60 | bytes 2000 | | | packets 60 | bytes 2000 | | |||
| | prt 6 | | | prt 6 | | |||
| +---------+ | +---------+ | |||
| Figure 8: Corresponding Anonymised Message | Figure 8: Corresponding Anonymized Message | |||
| 9. Security Considerations | 9. Security Considerations | |||
| This document provides guidelines for exporting metadata about | This document provides guidelines for exporting metadata about | |||
| anonymised data in IPFIX, or storing metadata about anonymised data | anonymized data in IPFIX, or storing metadata about anonymized data | |||
| in IPFIX Files. It is not intended as a general statement on the | in IPFIX Files. It is not intended as a general statement on the | |||
| applicability of specific flow data anonymisation techniques. | applicability of specific flow data anonymization techniques. | |||
| Exporters or publishers of anonymised data must take care that the | Exporters or publishers of anonymized data must take care that the | |||
| applied anonymisation technique is appropriate for the data source, | applied anonymization technique is appropriate for the data source, | |||
| the purpose, and the risk of deanonymisation of a given application. | the purpose, and the risk of deanonymization of a given application. | |||
| Research in anonymization techniques, and techniques for | ||||
| deanonymization, is ongoing, and currently "safe" anonymization | ||||
| techniques may be rendered unsafe by future developments. | ||||
| We note specifically that anonymisation is not a replacement for | We note specifically that anonymization is not a replacement for | |||
| encryption for confidentiality. It is only appropriate for | encryption for confidentiality. It is only appropriate for | |||
| protecting identifying information in data to be used for purposes in | protecting identifying information in data to be used for purposes in | |||
| which the protected data is irrelevant. Confidentiality in export is | which the protected data is irrelevant. Confidentiality in export is | |||
| best served by using TLS or DTLS as in the Security Considerations | best served by using TLS [RFC5246] or DTLS [RFC4347] as in the | |||
| section of [RFC5101], and in long-term storage by implementation- | Security Considerations section of [RFC5101], and in long-term | |||
| specific protection applied as in the Security Considerations section | storage by implementation-specific protection applied as in the | |||
| of [RFC5655]. Indeed, confidentiality and anonymisation are not | Security Considerations section of [RFC5655]. Indeed, | |||
| mutually exclusive, as encryption for confidentiality may be applied | confidentiality and anonymization are not mutually exclusive, as | |||
| to anonymised data export or storage, as well, when the anonymised | encryption for confidentiality may be applied to anonymized data | |||
| data is not intended for public release. | export or storage, as well, when the anonymized data is not intended | |||
| for public release. | ||||
| When using pseudonymisation techniques that have a mutable mapping, | We note as well that care should be taken even with well-anonymized | |||
| data, and anonymized data should still be treated as privacy- | ||||
| sensitive. Anonymization reduces the risk of misuse, but is not a | ||||
| complete solution to the problem of protecting end-user privacy in | ||||
| network flow trace analysis. | ||||
| When using pseudonymization techniques that have a mutable mapping, | ||||
| there is an inherent tradeoff in the stability of the map between | there is an inherent tradeoff in the stability of the map between | |||
| long-term comparability and security of the data set against | long-term comparability and security of the data set against | |||
| deanonymisation. In general, deanonymisation attacks are more | deanonymization. In general, deanonymization attacks are more | |||
| effective given more information, so the longer a given mapping is | effective given more information, so the longer a given mapping is | |||
| valid, the more information can be applied to deanonymisation. The | valid, the more information can be applied to deanonymization. The | |||
| specific details of this are technique-dependent and therefore out of | specific details of this are technique-dependent and therefore out of | |||
| the scope of this document. | the scope of this document. | |||
| When releasing anonymised data, publishers need to ensure that data | When releasing anonymized data, publishers need to ensure that data | |||
| that could be used in deanonymisation is not leaked through the | that could be used in deanonymization is not leaked through a side | |||
| export protocol; guidelines for addressing this risk are provided in | channel. The entire workflow (hardware, software, operational | |||
| Section 7.2. | policies and procedures, etc.) for handling anonymized data must be | |||
| evaluated for risk of data leakage. While most of these possible | ||||
| side channels are out of scope for this document, guidelines for | ||||
| reducing the risk of information leakage specific to the IPFIX export | ||||
| protocol are provided in Section 7.2. | ||||
| Note that the Security Considerations section of [RFC5101] | Note that the Security Considerations section of [RFC5101] | |||
| applies as well to the export of anonymised data, and the Security | applies as well to the export of anonymized data, and the Security | |||
| Considerations section of [RFC5655] to the storage of anonymised | Considerations section of [RFC5655] to the storage of anonymized | |||
| data, or the publication of anonymised traces. | data, or the publication of anonymized traces. | |||
| 10. IANA Considerations | 10. IANA Considerations | |||
| This document specifies the creation of several new IPFIX Information | This document specifies the creation of several new IPFIX Information | |||
| Elements in the IPFIX Information Element registry located at | Elements in the IPFIX Information Element registry located at | |||
| http://www.iana.org/assignments/ipfix, as defined in Section 6.2 | http://www.iana.org/assignments/ipfix, as defined in Section 6.2 | |||
| above. IANA has assigned the following Information Element numbers | above. IANA has assigned the following Information Element numbers | |||
| for their respective Information Elements as specified below: | for their respective Information Elements as specified below: | |||
| o Information Element number TBD1 for the anonymisationFlags | o Information Element number TBD1 for the anonymizationFlags | |||
| Information Element. | Information Element. | |||
| o Information Element number TBD2 for the anonymisationTechnique | o Information Element number TBD2 for the anonymizationTechnique | |||
| Information Element. | Information Element. | |||
| o Information Element number TBD3 for the informationElementIndex | o Information Element number TBD3 for the informationElementIndex | |||
| Information Element. | Information Element. | |||
| [NOTE for IANA: The text TBDn should be replaced with the respective | [NOTE for IANA: The text TBDn should be replaced with the respective | |||
| assigned Information Element numbers where they appear in this | assigned Information Element numbers where they appear in this | |||
| document.] | document. Information Element numbers should be assigned outside the | |||
| NetFlow V9 compatibility range, as these Information Elements are not | ||||
| supported by NetFlow V9.] | ||||
| 11. Acknowledgments | 11. Acknowledgments | |||
| We thank Paul Aitken and John McHugh for their comments and insight, | We thank Paul Aitken and John McHugh for their comments and insight, | |||
| and Carsten Schmoll, Benoit Claise, Lothar Braun, and Dan Romascanu | and Carsten Schmoll, Benoit Claise, Lothar Braun, Dan Romascanu, | |||
| for their reviews. Special thanks to the ICT-PRISM project for its | Stewart Bryant, and Sean Turner for their reviews. Special thanks to | |||
| material support of this work. | the FP7 PRISM and DEMONS projects for their material support of this | |||
| work. | ||||
| 12. References | 12. References | |||
| 12.1. Normative References | 12.1. Normative References | |||
| [RFC5101] Claise, B., "Specification of the IP Flow Information | [RFC5101] Claise, B., "Specification of the IP Flow Information | |||
| Export (IPFIX) Protocol for the Exchange of IP Traffic | Export (IPFIX) Protocol for the Exchange of IP Traffic | |||
| Flow Information", RFC 5101, January 2008. | Flow Information", RFC 5101, January 2008. | |||
| [RFC5102] Quittek, J., Bryant, S., Claise, B., Aitken, P., and J. | [RFC5102] Quittek, J., Bryant, S., Claise, B., Aitken, P., and J. | |||
| skipping to change at page 40, line 18 ¶ | skipping to change at page 42, line 31 ¶ | |||
| "Architecture for IP Flow Information Export", RFC 5470, | "Architecture for IP Flow Information Export", RFC 5470, | |||
| March 2009. | March 2009. | |||
| [RFC5472] Zseby, T., Boschi, E., Brownlee, N., and B. Claise, "IP | [RFC5472] Zseby, T., Boschi, E., Brownlee, N., and B. Claise, "IP | |||
| Flow Information Export (IPFIX) Applicability", RFC 5472, | Flow Information Export (IPFIX) Applicability", RFC 5472, | |||
| March 2009. | March 2009. | |||
| [I-D.ietf-ipfix-mediators-framework] | [I-D.ietf-ipfix-mediators-framework] | |||
| Kobayashi, A., Claise, B., Muenz, G., and K. Ishibashi, | Kobayashi, A., Claise, B., Muenz, G., and K. Ishibashi, | |||
| "IPFIX Mediation: Framework", | "IPFIX Mediation: Framework", | |||
| draft-ietf-ipfix-mediators-framework-08 (work in | draft-ietf-ipfix-mediators-framework-09 (work in | |||
| progress), August 2010. | progress), October 2010. | |||
| [I-D.ietf-ipfix-export-per-sctp-stream] | [I-D.ietf-ipfix-export-per-sctp-stream] | |||
| Claise, B., Aitken, P., Johnson, A., and G. Muenz, "IPFIX | Claise, B., Aitken, P., Johnson, A., and G. Muenz, "IPFIX | |||
| Export per SCTP Stream", | Export per SCTP Stream", | |||
| draft-ietf-ipfix-export-per-sctp-stream-08 (work in | draft-ietf-ipfix-export-per-sctp-stream-08 (work in | |||
| progress), May 2010. | progress), May 2010. | |||
| [RFC5153] Boschi, E., Mark, L., Quittek, J., Stiemerling, M., and P. | [RFC5153] Boschi, E., Mark, L., Quittek, J., Stiemerling, M., and P. | |||
| Aitken, "IP Flow Information Export (IPFIX) Implementation | Aitken, "IP Flow Information Export (IPFIX) Implementation | |||
| Guidelines", RFC 5153, April 2008. | Guidelines", RFC 5153, April 2008. | |||
| [RFC3917] Quittek, J., Zseby, T., Claise, B., and S. Zander, | [RFC3917] Quittek, J., Zseby, T., Claise, B., and S. Zander, | |||
| "Requirements for IP Flow Information Export (IPFIX)", | "Requirements for IP Flow Information Export (IPFIX)", | |||
| RFC 3917, October 2004. | RFC 3917, October 2004. | |||
| [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing | [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing | |||
| Architecture", RFC 4291, February 2006. | Architecture", RFC 4291, February 2006. | |||
| [RFC4347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer | ||||
| Security", RFC 4347, April 2006. | ||||
| [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security | ||||
| (TLS) Protocol Version 1.2", RFC 5246, August 2008. | ||||
| [Bur10] Burkhart, M., Schatzmann, D., Trammell, B., and E. Boschi, | ||||
| "The Role of Network Trace Anonymization Under Attack", | ||||
| ACM Computer Communications Review, vol. 40, no. 1, pp. | ||||
| 6-11, January 2010. | ||||
| [Mur07] Murdoch, S. and P. Zielinski, "Sampled Traffic Analysis by | ||||
| Internet-Exchange-Level Adversaries", Proceedings of the | ||||
| 7th Workshop on Privacy Enhancing Technologies, Ottawa, | ||||
| | Canada, June 2007. | ||||
| Authors' Addresses | Authors' Addresses | |||
| Elisa Boschi | Elisa Boschi | |||
| Swiss Federal Institute of Technology Zurich | Swiss Federal Institute of Technology Zurich | |||
| Gloriastrasse 35 | Gloriastrasse 35 | |||
| 8092 Zurich | 8092 Zurich | |||
| Switzerland | Switzerland | |||
| Email: boschie@tik.ee.ethz.ch | Email: boschie@tik.ee.ethz.ch | |||
| Brian Trammell | Brian Trammell | |||
| Swiss Federal Institute of Technology Zurich | Swiss Federal Institute of Technology Zurich | |||
| Gloriastrasse 35 | Gloriastrasse 35 | |||
| 8092 Zurich | 8092 Zurich | |||
| Switzerland | Switzerland | |||
| Phone: +41 44 632 70 13 | Phone: +41 44 632 70 13 | |||
| Email: trammell@tik.ee.ethz.ch | Email: trammell@tik.ee.ethz.ch | |||
| End of changes. 285 change blocks. | ||||
| 484 lines changed or deleted | 606 lines changed or added | |||
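
The Anonymization Records in Figure 6 can be reproduced with a short Python sketch. It is illustrative only and not part of either draft revision: it assumes each record is four unsigned 16-bit fields in the order shown in Figure 6 (template ID, Information Element ID, flags, technique), and that the low two bits of anonymizationFlags carry the stability class while 0x0004 is the Perimeter Anonymization flag. The lookup tables list only the values that appear in Figure 6, and the function names are hypothetical.

```python
import struct

# Assumption: flag layout as it appears in Figure 6. 0x0004 is the Perimeter
# Anonymization flag; the low two bits carry the stability class.
PERIMETER = 0x0004
STABILITY_MASK = 0x0003
STABILITY = {1: "Session", 3: "Stable"}  # only the values used in Figure 6

# Technique codes, again limited to those shown in Figure 6.
TECHNIQUE = {
    1: "Not Anonymized",
    2: "Precision Degradation",
    6: "Structured Permutation",
    7: "Reverse Truncation",
}

# (templateId, informationElementId, anonymizationFlags, anonymizationTechnique)
RECORDS = [
    (256, 150, 0x0000, 1),  # flowStartSeconds
    (256, 8, 0x0005, 6),    # sourceIPv4Address
    (256, 12, 0x0007, 7),   # destinationIPv4Address
    (256, 7, 0x0000, 1),    # sourceTransportPort
    (256, 11, 0x0000, 1),   # destinationTransportPort
    (256, 2, 0x0000, 1),    # packetDeltaCount
    (256, 1, 0x0003, 2),    # octetDeltaCount
    (256, 4, 0x0000, 1),    # protocolIdentifier
]


def encode_data_set(set_id, records):
    """Pack the records behind a 4-byte set header (Set ID, Length)."""
    body = b"".join(struct.pack("!HHHH", *rec) for rec in records)
    return struct.pack("!HH", set_id, 4 + len(body)) + body


def decode_data_set(data):
    """Return (set_id, length, records) from an encoded Data Set."""
    set_id, length = struct.unpack_from("!HH", data, 0)
    records = [struct.unpack_from("!HHHH", data, off)
               for off in range(4, length, 8)]
    return set_id, length, records


def describe_flags(flags):
    """Render anonymizationFlags the way Figure 6 labels them."""
    parts = []
    if flags & PERIMETER:
        parts.append("Perimeter")
    stability = flags & STABILITY_MASK
    if stability:
        parts.append(STABILITY.get(stability, f"stability class {stability}"))
    return ", ".join(parts) if parts else "no flags"


if __name__ == "__main__":
    encoded = encode_data_set(257, RECORDS)
    for _tid, ie, flags, technique in decode_data_set(encoded)[2]:
        print(f"IE {ie:3d}: {describe_flags(flags)} / {TECHNIQUE[technique]}")
```

Eight records of 8 bytes each plus the 4-byte set header give the Length = 68 shown in the Figure 6 set header.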