idnits 2.17.1

draft-dulaunoy-dnsop-passive-dns-cof-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (June 25, 2020) is 1373 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'WEIMERPDNS' is mentioned on line 459, but not defined

  == Missing Reference: 'REST' is mentioned on line 455, but not defined

  == Missing Reference: 'DNSDB' is mentioned on line 430, but not defined

  == Missing Reference: 'DNSDBQ' is mentioned on line 433, but not defined

  == Missing Reference: 'PDNSCERTAT' is mentioned on line 436, but not defined

  == Missing Reference: 'PDNSCIRCL' is mentioned on line 442, but not defined

  == Missing Reference: 'PDNSCOF' is mentioned on line 451, but not defined

  == Missing Reference: 'PDNSCLIENT' is mentioned on line 446, but not defined

  == Missing Reference: 'BAILIWICK' is mentioned on line 420, but not defined

  == Missing Reference: 'CACHEPOISONING' is mentioned on line 425, but not
     defined

  == Unused Reference: 'RFC5001' is defined on line 402, but no explicit
     reference was found in the text

  == Unused Reference: 'I-D.narten-iana-considerations-rfc2434bis' is defined
     on line 466, but no explicit reference was found in the text

  == Unused Reference: 'RFC3552' is defined on line 472, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2234 (Obsoleted by RFC 4234)

  ** Obsolete normative reference: RFC 4627 (Obsoleted by RFC 7158, RFC 7159)

     Summary: 2 errors (**), 0 flaws (~~), 14 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2   Domain Name System Operations                                A. Dulaunoy
3   Internet-Draft                                                     CIRCL
4   Intended status: Informational                                 A. Kaplan
5   Expires: December 27, 2020                                       CERT.at
6                                                                   P. Vixie
7                                                                   H. Stern
8                                                    Farsight Security, Inc.
9                                                              June 25, 2020

11                     Passive DNS - Common Output Format
12                  draft-dulaunoy-dnsop-passive-dns-cof-07

14  Abstract

16     This document describes a common output format of Passive DNS Servers
17     which clients can query. The output format description also includes
18     a common semantic for each Passive DNS system. By having
19     multiple Passive DNS Systems adhere to the same output format for
20     queries, users of multiple Passive DNS servers will be able to
21     combine result sets easily.

23  Status of This Memo

25     This Internet-Draft is submitted in full conformance with the
26     provisions of BCP 78 and BCP 79.

28     Internet-Drafts are working documents of the Internet Engineering
29     Task Force (IETF). Note that other groups may also distribute
30     working documents as Internet-Drafts. The list of current Internet-
31     Drafts is at https://datatracker.ietf.org/drafts/current/.

33     Internet-Drafts are draft documents valid for a maximum of six months
34     and may be updated, replaced, or obsoleted by other documents at any
35     time. It is inappropriate to use Internet-Drafts as reference
36     material or to cite them other than as "work in progress."

38     This Internet-Draft will expire on December 27, 2020.

40  Copyright Notice

42     Copyright (c) 2020 IETF Trust and the persons identified as the
43     document authors. All rights reserved.

45     This document is subject to BCP 78 and the IETF Trust's Legal
46     Provisions Relating to IETF Documents
47     (https://trustee.ietf.org/license-info) in effect on the date of
48     publication of this document. Please review these documents
49     carefully, as they describe your rights and restrictions with respect
50     to this document. Code Components extracted from this document must
51     include Simplified BSD License text as described in Section 4.e of
52     the Trust Legal Provisions and are provided without warranty as
53     described in the Simplified BSD License.

55  Table of Contents

57     1.  Introduction
58       1.1.  Requirements Language
59     2.  Limitation
60     3.  Common Output Format
61       3.1.  Overview
62       3.2.  ABNF grammar
63       3.3.  Mandatory Fields
64         3.3.1.  rrname
65         3.3.2.  rrtype
66         3.3.3.  rdata
67         3.3.4.  time_first
68         3.3.5.  time_last
69       3.4.  Optional Fields
70         3.4.1.  count
71         3.4.2.  bailiwick
72       3.5.  Additional Fields
73         3.5.1.  sensor_id
74         3.5.2.  zone_time_first
75         3.5.3.  zone_time_last
76         3.5.4.  origin
77         3.5.5.  time_first_ms
78         3.5.6.  time_last_ms
79       3.6.  Additional Fields Registry
80     4.  Acknowledgements
81     5.  IANA Considerations
82     6.  Privacy Considerations
83     7.  Security Considerations
84     8.  References
85       8.1.  Normative References
86       8.2.  References
87       8.3.  Informative References
88     Appendix A.  Examples
89     Authors' Addresses

91  1. Introduction

93     Passive DNS is a technique described by Florian Weimer in 2005 in
94     "Passive DNS Replication", presented at the 17th Annual FIRST
95     Conference on Computer Security [WEIMERPDNS]. Since then, multiple
96     Passive DNS implementations have been created and have evolved over
97     time. Users of these Passive DNS servers may query a server (often via
98     WHOIS [RFC3912] or HTTP REST [REST]), parse the results, and process
99     them in other applications.

101    There are multiple implementations of Passive DNS software. Users of
102    passive DNS query each implementation and aggregate the results for
103    their search. This document describes the output format of four
104    Passive DNS Systems ([DNSDB], [DNSDBQ], [PDNSCERTAT], [PDNSCIRCL] and
105    [PDNSCOF]) which are in use today and which already share a nearly
106    identical output format. As the format and the meaning of the output
107    fields from each Passive DNS system need to be consistent, this
108    document proposes a common name for each field along with its
109    corresponding interpretation.
       The format follows a simple key-value
110    structure in JSON [RFC4627] format. The benefit of having a
111    consistent Passive DNS output format is that multiple client
112    implementations can query different servers without needing a
113    separate parser for each individual server. passivedns-client
114    [PDNSCLIENT] currently implements multiple parsers due to a lack of
115    standardization. The document does not describe the protocol (e.g.
116    WHOIS [RFC3912], HTTP REST [REST]) nor the query format used to query
117    the Passive DNS. Neither does this document describe "pre-recursor"
118    Passive DNS Systems. Both of these are separate topics and deserve
119    their own RFC document. The document describes the current best
120    practices implemented in various Passive DNS server implementations.

122 1.1. Requirements Language

124    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
125    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
126    document are to be interpreted as described in RFC 2119 [RFC2119].

128 2. Limitation

130    As Passive DNS servers can include protection mechanisms for their
131    operation, results might differ due to those protection
132    measures. These mechanisms filter out DNS answers if they fail some
133    criteria. The bailiwick algorithm [BAILIWICK] protects the Passive
134    DNS Database from cache poisoning attacks [CACHEPOISONING]. Another
135    limitation that clients querying the database need to be aware of is
136    that each query simply gets a snapshot-answer of the time of
137    querying. Clients MUST NOT rely on consistent answers. Nor may
138    they assume that answers are identical across multiple Passive
139    DNS Servers.

141 3. Common Output Format

143 3.1. Overview

145    The formatting of the answer follows the JSON [RFC4627] format. In
146    fact, it is a subset of the full JSON language. The notable difference
147    is the modified definition of whitespace ("ws").
       The order of the
148    fields is not significant for the same resource type.

150    The intent of this output format is to be easily parsable by scripts.
151    Each JSON object is expressed on a single line to be processed by the
152    client line-by-line. Every implementation MUST support the JSON
153    output format.

155    Examples of JSON output are given in Appendix A.

157 3.2. ABNF grammar

159    Formal grammar as defined in ABNF [RFC2234]:

161    answer          = entries
162    entries         = *( entry CR )
163    entry           = "{" keyvallist "}"
164    keyvallist      = [ member *( value-separator member ) ]
165    member          = qm field qm name-separator value
166    name-separator  = ws %x3A ws   ; a ":" colon
167                                   ; "value" is imported from JSON [RFC4627]
168    value-separator = ws %x2C ws   ; a "," comma, as defined in JSON
169    field           = "rrname" / "rrtype" / "rdata" / "time_first" /
170                      "time_last" / "count" / "bailiwick" / "sensor_id" /
171                      "zone_time_first" / "zone_time_last" / "origin" /
172                      futureField
173    futureField     = string       ; "string" is imported from JSON
174    CR              = %x0D
175    qm              = %x22         ; " a quotation mark
176    ws              = *(
177                        %x20 /     ; Space
178                        %x09       ; Horizontal tab
179                      )

181    Note that value is defined in JSON [RFC4627] and has the exact same
182    specification as there. The same goes for the definition of string.

184 3.3. Mandatory Fields

186    Implementations MUST support all the mandatory fields.

188    Uniqueness property: the tuple (rrname,rrtype,rdata) will always be
189    unique within one answer per server. While rrname and rrtype are
190    always individual JSON primitive types (strings, numbers, booleans or
191    null), rdata MAY return multiple resource records or a single record.
192    When multiple resource records are returned, rdata MUST be a JSON
193    array. When a single resource record is returned, rdata
194    MUST be a JSON string or a JSON array containing one JSON string.
195    Senders SHOULD send an array for rdata, but receivers MUST be able to
196    accept a single-string result for rdata.

198 3.3.1. rrname

200    This field returns the name of the queried resource.

202 3.3.2. rrtype

204    This field returns the resource record type as seen by the passive
205    DNS. The key is rrtype and the value is the interpreted record
206    type represented as a JSON [RFC4627] string. If the value cannot be
207    interpreted, the decimal value is returned following the principle of
208    transparency as described in RFC 3597 [RFC3597]. In that case the
209    decimal value is represented as a JSON [RFC4627] number. The resource
210    record type can be any value described by IANA in the DNS parameters
211    document in the section 'Resource Record (RR) TYPEs'
212    (http://www.iana.org/assignments/dns-parameters). Supported textual
213    descriptions of rrtypes include: A, AAAA, CNAME, etc. A client MUST
214    be able to understand these textual rrtype values represented as a
215    JSON [RFC4627] string. In addition, a client MUST be able to handle
216    a decimal value (as mentioned above) represented as a JSON
217    [RFC4627] number.

219 3.3.3. rdata

221    This field returns the resource records of the queried resource.
222    When multiple resource records are returned, rdata MUST be a JSON
223    array containing JSON strings. When a single resource
224    record is returned, rdata MUST be a JSON string or a JSON array
225    containing one JSON string. Each resource record is represented as a
226    JSON [RFC4627] string. Each resource record MUST be escaped as
227    defined in section 2.5 of RFC4627 [RFC4627]. Depending on the
228    rrtype, this can be an IPv4 or IPv6 address, a domain name (as in the
229    case of CNAMEs), an SPF record, etc. A client MUST be able to
230    interpret any value which is legal as the right hand side in a DNS
231    master file RFC 1035 [RFC1035] and RFC 1034 [RFC1034]. If the rdata
232    came from an unknown DNS resource record type, the server must follow
233    the transparency principle as described in RFC 3597 [RFC3597].

235 3.3.4. time_first

237    This field returns the first time that the unique tuple
238    (rrname, rrtype, rdata) has been seen by the passive DNS. The date
239    is expressed in seconds (decimal) since 1st of January 1970 (Unix
240    timestamp). The time zone MUST be UTC. This field is represented as
241    a JSON [RFC4627] number.

243 3.3.5. time_last

245    This field returns the last time that the unique tuple (rrname,
246    rrtype, rdata) has been seen by the passive DNS. The date is
247    expressed in seconds (decimal) since 1st of January 1970 (Unix
248    timestamp). The time zone MUST be UTC. This field is represented as
249    a JSON [RFC4627] number.

251 3.4. Optional Fields

253    Implementations SHOULD support one or more of the following fields.

255 3.4.1. count

257    Specifies how many authoritative DNS answers were received at the
258    Passive DNS Server's collectors with exactly the given set of values
259    as answers (i.e. same data in the answer set - compare with the
260    uniqueness property in "Mandatory Fields"). This count
261    is expressed as a decimal value. This field is represented as a JSON
262    [RFC4627] number.

264 3.4.2. bailiwick

266    The bailiwick is the best estimate of the apex of the zone where this
267    data is authoritative.

269 3.5. Additional Fields

271    Implementations MAY support the following fields:

273 3.5.1. sensor_id

275    This field returns the sensor information where the record was seen.
276    It is represented as a JSON [RFC4627] string.

278    If the data originate from sensors or probes which are part of a
279    publicly-known gathering or measurement system (e.g. RIPE Atlas), a
280    JSON [RFC4627] string SHOULD be prefixed.

282 3.5.2. zone_time_first

284    This field returns the first time that the unique tuple (rrname,
285    rrtype, rdata) has been seen via master file import. The date
286    is expressed in seconds (decimal) since 1st of January 1970 (Unix
287    timestamp). The time zone MUST be UTC. This field is represented as
288    a JSON [RFC4627] number.
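
   As an illustration of the rdata and timestamp rules in the sections
   above, the following is a minimal Python sketch, not a reference
   implementation. It parses one line of common output format,
   normalises rdata to a list (receivers must accept both a single
   string and an array), and converts the second-resolution timestamp
   fields to UTC datetimes. The names parse_cof_line and SECOND_FIELDS
   are hypothetical and are not defined by this document; the sample
   record is taken from Appendix A.

```python
import json
from datetime import datetime, timezone

# Fields this document describes as Unix timestamps in seconds (UTC).
# SECOND_FIELDS is a hypothetical helper name, not part of the format.
SECOND_FIELDS = ("time_first", "time_last",
                 "zone_time_first", "zone_time_last")


def parse_cof_line(line):
    """Parse one common-output-format line into a normalised dict."""
    record = json.loads(line)
    # rdata may be a single JSON string or an array of strings;
    # always expose it as a list so callers handle one shape only.
    rdata = record["rdata"]
    record["rdata"] = [rdata] if isinstance(rdata, str) else list(rdata)
    # Convert the timestamps that are present into timezone-aware
    # UTC datetime objects.
    for name in SECOND_FIELDS:
        if name in record:
            record[name] = datetime.fromtimestamp(record[name],
                                                  tz=timezone.utc)
    return record


example = ('{"count": 59, "time_first": 1384865833, "rrtype": "A", '
           '"rrname": "www.ietf.org", "rdata": "4.31.198.44", '
           '"time_last": 1389022219}')
parsed = parse_cof_line(example)
print(parsed["rrname"], parsed["rdata"], parsed["time_first"].isoformat())
```

   Normalising rdata to a list on receipt keeps later processing, such
   as combining results from several servers, independent of which of
   the two permitted encodings a given server chose.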

290 3.5.3. zone_time_last

292    This field returns the last time that the unique tuple (rrname,
293    rrtype, rdata) has been seen via master file import. The date
294    is expressed in seconds (decimal) since 1st of January 1970 (Unix
295    timestamp). The time zone MUST be UTC. This field is represented as
296    a JSON [RFC4627] number.

298 3.5.4. origin

300    Specifies the resource origin of the Passive DNS response. This
301    field is represented as a Uniform Resource Identifier [RFC3986]
302    (URI).

304 3.5.5. time_first_ms

306    Same meaning as the field "time_first", with the only difference
307    that the resolution is in milliseconds since 1st of January 1970
308    (UTC).

310 3.5.6. time_last_ms

312    Same meaning as the field "time_last", with the only difference that
313    the resolution is in milliseconds since 1st of January 1970 (UTC).

315 3.6. Additional Fields Registry

317    In accordance with [RFC6648], designers of new passive DNS
318    applications that need additional fields can request and
319    register new field names at
320    https://github.com/adulau/pdns-qof/wiki/Additional-Fields.

322 4. Acknowledgements

324    Thanks to the Passive DNS developers who contributed to the document.

326 5. IANA Considerations

328    This memo includes no request to IANA.

330 6. Privacy Considerations

332    Passive DNS Servers capture DNS answers from multiple collecting
333    points ("sensors") which are located on the Internet-facing side of
334    DNS recursors ("post-recursor passive DNS"). In this process, they
335    intentionally omit the source IP, source port, destination IP and
336    destination port from the captured packets. Because the data is
337    captured "post-recursor", the timing information (who queries what)
338    is lost, since the recursor will cache the results.
       Furthermore,
339    since multiple sensors feed into a passive DNS server, the resulting
340    data gets mixed together, reducing the likelihood that Passive DNS
341    Servers can find out much about the actual person querying
342    the DNS records or about who actually sent the query. In this sense,
343    Passive DNS Servers are similar to an archive of all previous
344    phone books, if public DNS records are compared to phone numbers,
345    as they often are. Nevertheless, the authors strongly encourage
346    Passive DNS implementors to take special care of privacy issues.
347    bortzmeyer-dnsop-dns-privacy is an excellent starting point for this.
348    Finally, the overall recommendations in RFC6973 [RFC6973] should be
349    taken into consideration when designing any application which uses
350    Passive DNS data.

352    In the scope of the General Data Protection Regulation (GDPR,
353    Regulation (EU) 2016/679, which replaced Directive 95/46/EC), operators
354    of Passive DNS Servers need to ensure the legal ground and lawfulness
       of their operation.

356 7. Security Considerations

358    In some cases, Passive DNS output might contain confidential
359    information and its access might be restricted. When a user is
360    querying multiple Passive DNS servers and aggregating the data, the
361    sensitivity of the data must be considered.

363 8. References

365 8.1. Normative References

367    [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
368               STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987.

371    [RFC1035]  Mockapetris, P., "Domain names - implementation and
372               specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
373               November 1987.

375    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
376               Requirement Levels", BCP 14, RFC 2119,
377               DOI 10.17487/RFC2119, March 1997.

380    [RFC2234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
381               Specifications: ABNF", RFC 2234, DOI 10.17487/RFC2234,
382               November 1997.

384    [RFC3597]  Gustafsson, A., "Handling of Unknown DNS Resource Record
385               (RR) Types", RFC 3597, DOI 10.17487/RFC3597, September
386               2003.

388    [RFC3912]  Daigle, L., "WHOIS Protocol Specification", RFC 3912,
389               DOI 10.17487/RFC3912, September 2004.

392    [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
393               Resource Identifier (URI): Generic Syntax", STD 66,
394               RFC 3986, DOI 10.17487/RFC3986, January 2005.

397    [RFC4627]  Crockford, D., "The application/json Media Type for
398               JavaScript Object Notation (JSON)", RFC 4627,
399               DOI 10.17487/RFC4627, July 2006.

402    [RFC5001]  Austein, R., "DNS Name Server Identifier (NSID) Option",
403               RFC 5001, DOI 10.17487/RFC5001, August 2007.

406    [RFC6648]  Saint-Andre, P., Crocker, D., and M. Nottingham,
407               "Deprecating the "X-" Prefix and Similar Constructs in
408               Application Protocols", BCP 178, RFC 6648,
409               DOI 10.17487/RFC6648, June 2012.

412    [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
413               Morris, J., Hansen, M., and R. Smith, "Privacy
414               Considerations for Internet Protocols", RFC 6973,
415               DOI 10.17487/RFC6973, July 2013.

418 8.2. References

420    [BAILIWICK]
421               Edmonds, R., "Passive DNS Hardening", 2010.

425    [CACHEPOISONING]
426               Kaminsky, D., "Black ops 2008: It's the end of the cache
427               as we know it.", 2008.

430    [DNSDB]    Security, F., "DNSDB API", 2013.

433    [DNSDBQ]   Vixie, P., "DNSDB API Client, C Version", 2018.

436    [PDNSCERTAT]
437               CERT.at, "pDNS presentation at 4th Centr R&D workshop
438               Frankfurt Jun 5th 2012", 2012.

442    [PDNSCIRCL]
443               Luxembourg, C. -. I. R. C., "CIRCL Passive DNS", 2012.

446    [PDNSCLIENT]
447               Lee, C., "Queries 5 major Passive DNS databases: BFK,
448               CERTEE, DNSParse, ISC, and VirusTotal.", 2013.

451    [PDNSCOF]  Dulaunoy, D. P. A., "Passive DNS server interface using
452               the common output format", 2013.

455    [REST]     Fielding, R. T., "Representational State Transfer (REST)",
456               2000.

459    [WEIMERPDNS]
460               Weimer, F., "Passive DNS Replication", 2005.

464 8.3. Informative References

466    [I-D.narten-iana-considerations-rfc2434bis]
467               Narten, T. and H. Alvestrand, "Guidelines for Writing an
468               IANA Considerations Section in RFCs", draft-narten-iana-
469               considerations-rfc2434bis-09 (work in progress), March
470               2008.

472    [RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC
473               Text on Security Considerations", BCP 72, RFC 3552,
474               DOI 10.17487/RFC3552, July 2003.

477 Appendix A. Examples

479    The JSON output is shown on multiple lines for readability, but
480    each JSON object should be on a single line.

482    If you query a passive DNS for the rrname www.ietf.org, the passive
483    dns common output format can be:

485    {"count": 102, "time_first": 1298412391, "rrtype": "AAAA",
486     "rrname": "www.ietf.org", "rdata": "2001:1890:1112:1::20",
487     "time_last": 1302506851}
488    {"count": 59, "time_first": 1384865833, "rrtype": "A",
489     "rrname": "www.ietf.org", "rdata": "4.31.198.44",
490     "time_last": 1389022219}

492    If you query a passive DNS for the rrname ietf.org, the passive dns
493    common output format can be:

495    {"count": 109877, "time_first": 1298398002, "rrtype": "NS",
496     "rrname": "ietf.org", "rdata": "ns1.yyz1.afilias-nst.info",
497     "time_last": 1389095375}
498    {"count": 4, "time_first": 1298495035, "rrtype": "A",
499     "rrname": "ietf.org", "rdata": "64.170.98.32",
500     "time_last": 1298495035}
501    {"count": 9, "time_first": 1317037550, "rrtype": "AAAA",
502     "rrname": "ietf.org", "rdata": "2001:1890:123a::1:1e",
503     "time_last": 1330209752}

505    Please note that the examples imply that a single query returns a
506    single set of JSON objects. For example, two queries were made; one
507    query returned a set of two JSON objects and the other query returned
508    a set of three JSON objects. This specification requires that each
509    JSON object individually conform to the common output format, but
510    it does not require that a query return a set of
511    JSON objects.

513    Please note that in the examples above, any backslashes "\" can be
514    ignored and are an artifact of the tools which produced this
515    document.

517 Authors' Addresses

519    Alexandre Dulaunoy
520    CIRCL
521    16, bd d'Avranches
522    Luxembourg L-1160
523    Luxembourg

525    Phone: (+352) 247 88444
526    Email: alexandre.dulaunoy@circl.lu
527    URI:   http://www.circl.lu/

529    L. Aaron Kaplan
530    CERT.at
531    Karlsplatz 1/2/9
532    Vienna A-1010
533    Austria

535    Phone: +43 1 5056416 78
536    Email: kaplan@cert.at
537    URI:   http://www.cert.at/

539    Paul Vixie
540    Farsight Security, Inc.
541    11400 La Honda Road
542    Woodside, California 94062
543    U.S.A.

545    Email: paul@redbarn.org
546    URI:   https://www.farsightsecurity.com/

547    Henry Stern
548    Farsight Security, Inc.
549    11400 La Honda Road
550    Woodside, California 94062
551    U.S.A.

553    Phone: +1 650 542-7836
554    Email: henry@stern.ca
555    URI:   https://www.farsightsecurity.com/