IPFIX Working Group                                            E. Boschi
Internet-Draft                                               B. Trammell
Intended status: Experimental                             Hitachi Europe
Expires: January 8, 2009                                    July 7, 2008


                     IP Flow Anonymisation Support
                     draft-boschi-ipfix-anon-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 8, 2009.

Abstract

   This document describes anonymisation techniques for IP flow data.
   It provides a categorisation of common anonymisation schemes and
   defines the parameters needed to describe them.  It describes support
   for anonymisation within the IPFIX protocol, providing the basis for
   the definition of information models for configuring anonymisation
   techniques within an IPFIX Metering or Exporting Process, and for
   reporting the technique in use to an IPFIX Collecting Process.

Table of Contents

   1.  Introduction
     1.1.  IPFIX Protocol Overview
     1.2.  IPFIX Documents Overview
   2.  Terminology
   3.  Categorisation of Anonymisation Techniques
   4.  Anonymisation of IP Flow Data
     4.1.  IP Address Anonymisation
     4.2.  Timestamp Anonymisation
     4.3.  Anonymisation of Other Flow Fields
   5.  Parameters for the Description of Anonymisation Techniques
   6.  Anonymisation Support in IPFIX
   7.  Security Considerations
   8.  IANA Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   The standardisation of an IP flow information export protocol
   [RFC5101] and associated representations removes a technical barrier
   to the sharing of IP flow data across organizational boundaries and
   with network operations, security, and research communities for a
   wide variety of purposes.  However, with wider dissemination come
   greater risks to the privacy of the users of networks under
   measurement, and to the security of those networks.  While it is not
   a complete solution to the issues posed by distribution of IP flow
   information, anonymisation is an important tool for the protection of
   privacy within network measurement infrastructures.  Additionally,
   various jurisdictions define data protection laws and regulations
   that flow measurement activities must comply with, and anonymisation
   may be a part of such compliance [IMC07, FloCon08].

   This document presents a mechanism for representing anonymised data
   within IPFIX and guidelines for using it.  It begins with a
   categorisation of anonymisation techniques.  It then describes the
   applicability of each technique to commonly anonymisable fields of IP
   flow data, organized by information element data type and semantics
   as in [RFC5102]; enumerates the parameters required by each of the
   applicable anonymisation techniques; and provides guidelines for the
   use of each of these techniques in accordance with best practices in
   data protection.  Finally, it specifies a mechanism for exporting
   anonymised data and binding anonymisation metadata to templates using
   IPFIX Options.

1.1.  IPFIX Protocol Overview

   In the IPFIX protocol, { type, length, value } tuples are expressed
   in Templates containing { type, length } pairs, specifying which
   { value } fields are present in Data Records conforming to the
   Template, giving great flexibility as to what data is transmitted.

   Since Templates are sent very infrequently compared with Data
   Records, this results in significant bandwidth savings.

   Different Data Records may be transmitted simply by sending new
   Templates specifying the { type, length } pairs for the new data
   format.  See [RFC5101] for more information.

   The IPFIX information model [RFC5102] defines a large number of
   standard Information Elements which provide the necessary { type }
   information for Templates.

   The use of standard elements enables interoperability among different
   vendors' implementations.  Additionally, non-standard enterprise-
   specific elements may be defined for private use.

1.2.  IPFIX Documents Overview

   "Specification of the IPFIX Protocol for the Exchange of IP Traffic
   Flow Information" [RFC5101] (informally, the IPFIX Protocol document)
   and its associated documents define the IPFIX Protocol, which
   provides network engineers and administrators with access to IP
   traffic flow information.

   "Architecture for IP Flow Information Export" [I-D.ietf-ipfix-arch]
   (the IPFIX Architecture document) defines the architecture for the
   export of measured IP flow information out of an IPFIX Exporting
   Process to an IPFIX Collecting Process, and the basic terminology
   used to describe the elements of this architecture, per the
   requirements defined in "Requirements for IP Flow Information Export"
   [RFC3917].  The IPFIX Protocol document [RFC5101] then covers the
   details of the method for transporting IPFIX Data Records and
   Templates via a congestion-aware transport protocol from an IPFIX
   Exporting Process to an IPFIX Collecting Process.
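
   As an aside on the Template mechanism outlined in Section 1.1, the
   following Python sketch shows how a Template Record carries only
   { type, length } pairs while a Data Record carries only the
   corresponding { value } fields.  It is a simplified illustration, not
   a complete encoder: the Message Header and Set Headers defined in
   [RFC5101] are omitted, and the Template ID, addresses, and function
   names are hypothetical choices made for this example.

```python
import ipaddress
import struct

# (Information Element identifier, length in octets), as registered in
# the IPFIX Information Model [RFC5102].
SOURCE_IPV4_ADDRESS = (8, 4)
DESTINATION_IPV4_ADDRESS = (12, 4)
OCTET_DELTA_COUNT = (1, 8)

# Hypothetical Template ID; Template IDs are taken from 256 upwards.
TEMPLATE_ID = 256
FIELDS = [SOURCE_IPV4_ADDRESS, DESTINATION_IPV4_ADDRESS, OCTET_DELTA_COUNT]


def encode_template_record(template_id, fields):
    """Encode Template ID, Field Count, then one { type, length } pair per field."""
    out = struct.pack("!HH", template_id, len(fields))
    for ie_id, length in fields:
        out += struct.pack("!HH", ie_id, length)
    return out


def encode_data_record(src, dst, octets):
    """Encode only the { value } fields, in the order the Template lists them."""
    return (
        ipaddress.IPv4Address(src).packed
        + ipaddress.IPv4Address(dst).packed
        + struct.pack("!Q", octets)
    )


template = encode_template_record(TEMPLATE_ID, FIELDS)
record = encode_data_record("192.0.2.1", "198.51.100.7", 1500)
print(len(template), len(record))
```

   In this sketch the Template Record is sent once, and each subsequent
   Data Record carries no type or length information of its own, which
   is where the bandwidth savings come from.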

   "Information Model for IP Flow Information Export" [RFC5102]
   (informally, the IPFIX Information Model document) describes the
   Information Elements used by IPFIX, including details on Information
   Element naming, numbering, and data type encoding.  Finally, "IPFIX
   Applicability" [I-D.ietf-ipfix-as] describes the various applications
   of the IPFIX protocol and their use of information exported via
   IPFIX, and relates the IPFIX architecture to other measurement
   architectures and frameworks.

   This document references the Protocol and Architecture documents for
   terminology and extends the IPFIX Information Model to provide new
   Information Elements for anonymisation metadata.

2.  Terminology

   The terminology used in this document is fully aligned with the
   terminology defined in [RFC5101].  Therefore, the terms defined in
   the IPFIX terminology are capitalized in this document, as in other
   IPFIX documents ([RFC5101], [RFC5102]).

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Categorisation of Anonymisation Techniques

   Anonymisation modifies a data set in order to protect the identity of
   the people or entities described by the data set from disclosure.
   With respect to network traffic data, anonymisation generally
   attempts to preserve some set of properties of the network traffic
   useful for a given application or applications, while ensuring the
   data cannot be traced back to the specific networks, hosts, or users
   generating the traffic.

   Anonymisation may be broadly split into three categories:
   generalisation, reversible substitution, and irreversible
   substitution.  When generalisation is used, identifying information
   is grouped into sets, and a single value is used to represent every
   element of each set.  Note that this may cause multiple records to
   become indistinguishable, thereby aggregating them into a single
   record.  Generalisation is an irreversible operation, in that the
   information needed to identify a single record from its "generalised
   value" is lost.

   Substitution (or pseudonymisation) substitutes a false identifier for
   a real one, and can be reversible or irreversible.  Reversible
   substitution uses an invertible or otherwise reversible function, so
   that the real identifier may be recovered later.  Irreversible
   substitution, in contrast, uses a one-way or randomising function, so
   that the real identifier cannot be recovered.

   While anonymisation is generally applied at the resolution of single
   fields within a record, attacks against anonymisation use entire
   records and relationships between records within a data set.
   Therefore, fields which may not necessarily be identifying by
   themselves may be anonymised in order to increase the anonymity of
   the data set as a whole.

4.  Anonymisation of IP Flow Data

   Due to the restricted semantics of IP flow data, there is a
   relatively limited set of specific anonymisation techniques
   applicable to flow data, though each falls into one of the broad
   categories above.  Each type of field that may commonly appear in a
   flow record may have its own applicable specific techniques.

   Of all the fields in an IP flow record, the most attention in the
   literature has been paid to IP addresses [TODO: cite].  IP addresses
   are structured identifiers; that is, partial IP address prefixes may
   be used to identify networks just as full IP addresses identify
   hosts.  This leads to the application of prefix-preserving
   anonymisation of IP address information [TODO: cite].
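
   The three categories from Section 3 can be sketched for IPv4
   addresses as follows.  This is a minimal Python illustration under
   stated assumptions, not a recommended implementation: the function
   names are hypothetical, the XOR-based mapping merely stands in for
   "an invertible function" and is far too weak for real use, and the
   keyed hash is truncated to 32 bits, so distinct addresses may
   collide.

```python
import hashlib
import hmac
import ipaddress


def truncate(addr, keep_bits):
    """Generalisation: keep the leading keep_bits bits, zero the rest."""
    value = int(ipaddress.IPv4Address(addr))
    mask = (0xFFFFFFFF << (32 - keep_bits)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(value & mask))


def reversible_substitute(addr, key):
    """Reversible substitution: XOR with a secret 32-bit key.

    Applying the function again with the same key recovers the input.
    """
    value = int(ipaddress.IPv4Address(addr))
    return str(ipaddress.IPv4Address(value ^ key))


def irreversible_substitute(addr, secret):
    """Irreversible substitution: keyed one-way hash, cut down to 32 bits."""
    packed = ipaddress.IPv4Address(addr).packed
    digest = hmac.new(secret, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))


# Two hosts in the same /24 become indistinguishable after truncation.
print(truncate("192.0.2.77", 24))   # → 192.0.2.0
print(truncate("192.0.2.200", 24))  # → 192.0.2.0

key = 0x5A5A5A5A  # hypothetical secret key
pseudonym = reversible_substitute("192.0.2.77", key)
print(reversible_substitute(pseudonym, key))  # → 192.0.2.77
```

   Truncation keeps the network structure of an address but discards
   the host, whereas the keyed hash in this sketch discards the
   structure entirely, since addresses sharing a prefix map to
   unrelated pseudonyms.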
   Prefix-preserving anonymisation is a (generally irreversible)
   substitution technique which has the additional property that the
   structure of the IP address space is maintained in the anonymised
   data.

   While not identifiers in and of themselves, timestamps are vulnerable
   to fingerprinting attacks, wherein relationships between the start
   and end timestamps of flows within a data set can be used to identify
   hosts or networks [TODO: cite].  Therefore, a variety of
   anonymisation techniques are available, including precision loss (a
   form of generalisation) and noise addition (a form of substitution),
   which may or may not preserve the sequencing of flows in the data
   set.

   Counters and other flow values can also be used to break
   anonymisation in fingerprinting attacks, so the same techniques,
   precision loss and noise addition, are available for these fields as
   well.

   Of course, the simplest form of anonymisation and the most extreme
   form of generalisation is black-marker anonymisation, or full
   deletion of a field from each record of the flow data.  The black-
   marker technique can be applied to any type of field in a flow
   record.

   [TODO: This section is incomplete; the set of techniques should be
   more exhaustive.]

4.1.  IP Address Anonymisation

   The following table gives an overview of the schemes for IP address
   anonymisation described in this document and their categorisation.

   +-----------------------+----------------+---------------+
   | Scheme                | Action         | Reversibility |
   +-----------------------+----------------+---------------+
   | Truncation            | Generalisation | N             |
   | Scrambling            | Substitution   | Y             |
   | Prefix-preserving     | Substitution   | Y             |
   | Random noise addition | Substitution   | N             |
   +-----------------------+----------------+---------------+

   [TODO: This section is incomplete; text here should expand on the
   table.]

4.2.  Timestamp Anonymisation

   [TODO: as section 4.1]

   [EDITOR'S NOTE: Counters might go here, since they are subject to the
   same techniques for largely the same reasons.]

4.3.  Anonymisation of Other Flow Fields

   [TODO: as section 4.1]

   [EDITOR'S NOTE: Port Numbers go here.  Counters might, if not above.
   It might make sense to split this into flow key anonymisation versus
   flow value anonymisation.]

5.  Parameters for the Description of Anonymisation Techniques

   [TODO: see corresponding section of draft-ietf-psamp-sample-tech for
   the proposed structure of this section.]

6.  Anonymisation Support in IPFIX

   [TODO: Here we'll describe how the information specified above can be
   transmitted on the wire using an option template.  The idea is to
   scope the option to the Template ID and for each field specify which
   are anonymised, providing info on the output characteristics of the
   technique, and which ones aren't.]

   [EDITOR'S NOTE: Multiple anon. techniques applied on an IE at the
   same time is indicated with multiple elements of the same type (in
   application order as in PSAMP)]

   [EDITOR'S NOTE: for blackmarking we'll recommend not to export the
   information at all following the data protection law principle that
   only necessary information should be exported.]

7.  Security Considerations

   [TODO: write this section.]

8.  IANA Considerations

   This document contains no actions for IANA.

9.  References

9.1.  Normative References

   [RFC5101]  Claise, B., "Specification of the IP Flow Information
              Export (IPFIX) Protocol for the Exchange of IP Traffic
              Flow Information", RFC 5101, January 2008.

   [RFC5102]  Quittek, J., Bryant, S., Claise, B., Aitken, P., and J.
              Meyer, "Information Model for IP Flow Information Export",
              RFC 5102, January 2008.

9.2.  Informative References

   [I-D.ietf-ipfix-arch]
              Sadasivan, G. and N. Brownlee, "Architecture Model for IP
              Flow Information Export", draft-ietf-ipfix-arch-02 (work
              in progress), October 2003.

   [I-D.ietf-ipfix-as]
              Zseby, T., "IPFIX Applicability", draft-ietf-ipfix-as-12
              (work in progress), July 2007.

   [I-D.ietf-ipfix-architecture]
              Sadasivan, G., "Architecture for IP Flow Information
              Export", draft-ietf-ipfix-architecture-12 (work in
              progress), September 2006.

   [I-D.ietf-ipfix-reducing-redundancy]
              Boschi, E., "Reducing Redundancy in IP Flow Information
              Export (IPFIX) and Packet Sampling (PSAMP) Reports",
              draft-ietf-ipfix-reducing-redundancy-04 (work in
              progress), May 2007.

   [RFC3917]  Quittek, J., Zseby, T., Claise, B., and S. Zander,
              "Requirements for IP Flow Information Export (IPFIX)",
              RFC 3917, October 2004.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

Authors' Addresses

   Elisa Boschi
   Hitachi Europe
   c/o ETH Zurich
   Gloriastrasse 35
   8092 Zurich
   Switzerland

   Phone: +41 44 632 70 57
   Email: elisa.boschi@hitachi-eu.com

   Brian Trammell
   Hitachi Europe
   c/o ETH Zurich
   Gloriastrasse 35
   8092 Zurich
   Switzerland

   Phone: +41 44 632 70 13
   Email: trammell@tik.ee.ethz.ch

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.