idnits 2.17.1

draft-dulaunoy-dnsop-passive-dns-cof-06.txt:

Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------

No issues found here.

Miscellaneous warnings:
----------------------------------------------------------------------------

== The copyright year in the IETF Trust and authors Copyright Line does not match the current year
-- The document date (April 8, 2019) is 1846 days in the past. Is this intentional?

Checking references for intended status: Informational
----------------------------------------------------------------------------

== Missing Reference: 'WEIMERPDNS' is mentioned on line 445, but not defined
== Missing Reference: 'REST' is mentioned on line 441, but not defined
== Missing Reference: 'DNSDB' is mentioned on line 416, but not defined
== Missing Reference: 'DNSDBQ' is mentioned on line 419, but not defined
== Missing Reference: 'PDNSCERTAT' is mentioned on line 422, but not defined
== Missing Reference: 'PDNSCIRCL' is mentioned on line 428, but not defined
== Missing Reference: 'PDNSCOF' is mentioned on line 437, but not defined
== Missing Reference: 'PDNSCLIENT' is mentioned on line 432, but not defined
== Missing Reference: 'BAILIWICK' is mentioned on line 406, but not defined
== Missing Reference: 'CACHEPOISONING' is mentioned on line 411, but not defined
== Unused Reference: 'RFC5001' is defined on line 388, but no explicit reference was found in the text
== Unused Reference: 'I-D.narten-iana-considerations-rfc2434bis' is defined on line 452, but no explicit reference was
found in the text
== Unused Reference: 'RFC3552' is defined on line 458, but no explicit reference was found in the text
** Obsolete normative reference: RFC 2234 (Obsoleted by RFC 4234)
** Obsolete normative reference: RFC 4627 (Obsoleted by RFC 7158, RFC 7159)

Summary: 2 errors (**), 0 flaws (~~), 14 warnings (==), 1 comment (--).

Run idnits with the --verbose option for more detailed information about
the items above.

--------------------------------------------------------------------------------

2 Domain Name System Operations                              A. Dulaunoy
3 Internet-Draft                                                   CIRCL
4 Intended status: Informational                               A. Kaplan
5 Expires: October 10, 2019                                      CERT.at
6                                                               P. Vixie
7                                                               H. Stern
8                                                Farsight Security, Inc.
9                                                          April 8, 2019

11 Passive DNS - Common Output Format
12 draft-dulaunoy-dnsop-passive-dns-cof-06

14 Abstract

16 This document describes a common output format of Passive DNS Servers
17 which clients can query. The output format description also includes
18 a common semantic for each Passive DNS system. By having
19 multiple Passive DNS Systems adhere to the same output format for
20 queries, users of multiple Passive DNS servers will be able to
21 combine result sets easily.

23 Status of This Memo

25 This Internet-Draft is submitted in full conformance with the
26 provisions of BCP 78 and BCP 79.

28 Internet-Drafts are working documents of the Internet Engineering
29 Task Force (IETF). Note that other groups may also distribute
30 working documents as Internet-Drafts. The list of current
31 Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

33 Internet-Drafts are draft documents valid for a maximum of six months
34 and may be updated, replaced, or obsoleted by other documents at any
35 time. It is inappropriate to use Internet-Drafts as reference
36 material or to cite them other than as "work in progress."

38 This Internet-Draft will expire on October 10, 2019.
40 Copyright Notice

42 Copyright (c) 2019 IETF Trust and the persons identified as the
43 document authors. All rights reserved.

45 This document is subject to BCP 78 and the IETF Trust's Legal
46 Provisions Relating to IETF Documents
47 (https://trustee.ietf.org/license-info) in effect on the date of
48 publication of this document. Please review these documents
49 carefully, as they describe your rights and restrictions with respect
50 to this document. Code Components extracted from this document must
51 include Simplified BSD License text as described in Section 4.e of
52 the Trust Legal Provisions and are provided without warranty as
53 described in the Simplified BSD License.

55 Table of Contents

57 1. Introduction
58   1.1. Requirements Language
59 2. Limitation
60 3. Common Output Format
61   3.1. Overview
62   3.2. ABNF grammar
63   3.3. Mandatory Fields
64     3.3.1. rrname
65     3.3.2. rrtype
66     3.3.3. rdata
67     3.3.4. time_first
68     3.3.5. time_last
69   3.4. Optional Fields
70     3.4.1. count
71     3.4.2. bailiwick
72   3.5. Additional Fields
73     3.5.1. sensor_id
74     3.5.2. zone_time_first
75     3.5.3. zone_time_last
76     3.5.4. origin
77   3.6. Additional Fields Registry
78 4. Acknowledgements
79 5. IANA Considerations
80 6. Privacy Considerations
81 7. Security Considerations
82 8. References
83   8.1. Normative References
84   8.2. References
85   8.3. Informative References
86 Appendix A. Examples
87 Authors' Addresses

89 1. Introduction

91 Passive DNS is a technique described by Florian Weimer in 2005 in
92 "Passive DNS Replication", presented at the 17th Annual FIRST
93 Conference on Computer Security [WEIMERPDNS]. Since then, multiple
94 Passive DNS implementations have been created and have evolved over
95 time. Users of these Passive DNS servers may query a server (often via
96 WHOIS [RFC3912] or HTTP REST [REST]), parse the results and process
97 them in other applications.

99 There are multiple implementations of Passive DNS software. Users of
100 passive DNS query each implementation and aggregate the results for
101 their search. This document describes the output format of four
102 Passive DNS Systems ([DNSDB], [DNSDBQ], [PDNSCERTAT], [PDNSCIRCL] and
103 [PDNSCOF]) which are in use today and which already share a nearly
104 identical output format. As the format and the meaning of output
105 fields from each Passive DNS need to be consistent, we propose in
106 this document a common name for each field along with its
107 corresponding interpretation. The format follows a simple key-value
108 structure in JSON [RFC4627] format.
The benefit of having a
109 consistent Passive DNS output format is that multiple client
110 implementations can query different servers without needing a
111 separate parser for each individual server. passivedns-client
112 [PDNSCLIENT] currently implements multiple parsers due to a lack of
113 standardization. The document describes neither the protocol (e.g.
114 WHOIS [RFC3912], HTTP REST [REST]) nor the query format used to query
115 the Passive DNS. Neither does this document describe "pre-recursor"
116 Passive DNS Systems. Both are separate topics that deserve
117 their own RFC document. The document describes the current best
118 practices implemented in various Passive DNS server implementations.

120 1.1. Requirements Language

122 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
123 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
124 document are to be interpreted as described in RFC 2119 [RFC2119].

126 2. Limitation

128 Because Passive DNS servers can include protection mechanisms for their
129 operation, results might differ because of those protection
130 measures. These mechanisms filter out DNS answers that fail certain
131 criteria. For example, the bailiwick algorithm [BAILIWICK] protects the
132 Passive DNS Database from cache poisoning attacks [CACHEPOISONING].
133 Another limitation that clients querying the database need to be aware
134 of is that each query returns only a snapshot of the data at the time
135 of the query. Clients MUST NOT rely on consistent answers. Nor must
136 they assume that answers are identical across multiple Passive
137 DNS Servers.

139 3. Common Output Format

140 3.1. Overview

142 The formatting of the answer follows the JSON [RFC4627] format. In
143 fact, it is a subset of the full JSON language. The notable difference
144 is the modified definition of whitespace ("ws"). The order of the
145 fields is not significant for the same resource type.
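
As a small illustration of the statement that field order is not significant, the following Python sketch parses two serializations of the same record and compares them. The record values are borrowed from the examples in Appendix A; the variable names are illustrative only and are not defined by this format.

```python
import json

# Two serializations of the same record with the fields in a different order.
line_a = ('{"rrname": "www.ietf.org", "rrtype": "A", "rdata": "4.31.198.44", '
          '"time_first": 1384865833, "time_last": 1389022219}')
line_b = ('{"time_last": 1389022219, "rdata": "4.31.198.44", "rrtype": "A", '
          '"time_first": 1384865833, "rrname": "www.ietf.org"}')

first = json.loads(line_a)
second = json.loads(line_b)

# Python dictionaries compare by key/value pairs, not by insertion order,
# so both lines are treated as the same record.
print(first == second)
```

A client that compares parsed objects rather than raw lines is therefore unaffected by the order in which a server emits the fields.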
147 The intent of this output format is to be easily parsable by scripts.
148 Each JSON object is expressed on a single line to be processed by the
149 client line-by-line. Every implementation MUST support the JSON
150 output format.

152 Examples of JSON output are given in Appendix A.

154 3.2. ABNF grammar

156 Formal grammar as defined in ABNF [RFC2234]

158 answer          = entries
159 entries         = *( entry CR )
160 entry           = "{" keyvallist "}"
161 keyvallist      = [ member *( value-separator member ) ]
162 member          = qm field qm name-separator value
163 name-separator  = ws %x3A ws  ; a ":" colon
164                               ; value is as defined in the JSON RFC
165 value-separator = ws %x2C ws  ; a "," comma, as defined in JSON
166 field           = "rrname" / "rrtype" / "rdata" / "time_first" /
167                   "time_last" / "count" / "bailiwick" / "sensor_id" /
168                   "zone_time_first" / "zone_time_last" / "origin" /
169                   futureField
170 futureField     = string
171 CR              = %x0D
172 qm              = %x22        ; a quotation mark
173 ws              = *(
174                     %x20 /    ; Space
175                     %x09      ; Horizontal tab
176                   )

178 Note that value is defined in JSON [RFC4627] and has the exact same
179 specification as there. The same goes for the definition of string.

181 3.3. Mandatory Fields

183 Implementations MUST support all the mandatory fields.

185 Uniqueness property: the tuple (rrname,rrtype,rdata) will always be
186 unique within one answer per server. While rrname and rrtype are
187 always individual JSON primitive types (strings, numbers, booleans or
188 null), rdata MAY return multiple resource records or a single record.
189 When multiple resource records are returned, rdata MUST be a JSON
190 array. When a single resource record is returned, rdata
191 MUST be a JSON string or a JSON array containing one JSON string.
192 Senders SHOULD send an array for rdata, but receivers MUST be able to
193 accept a single-string result for rdata.

195 3.3.1. rrname

197 This field returns the name of the queried resource.

199 3.3.2.
rrtype

201 This field returns the resource record type as seen by the passive
202 DNS. The key is rrtype and the value is the interpreted record
203 type, represented as a JSON [RFC4627] string. If the value cannot be
204 interpreted, the decimal value is returned following the principle of
205 transparency as described in RFC 3597 [RFC3597]. In that case, the
206 decimal value is represented as a JSON [RFC4627] number. The resource
207 record type can be any value described by IANA in the DNS parameters
208 document in the section 'Resource Record (RR) TYPEs'
209 (http://www.iana.org/assignments/dns-parameters). Supported textual
210 descriptions of rrtypes include: A, AAAA, CNAME, etc. A client MUST
211 be able to understand these textual rrtype values represented as a
212 JSON [RFC4627] string. In addition, a client MUST be able to handle
213 a decimal value (as mentioned above) represented as a JSON
214 [RFC4627] number.

216 3.3.3. rdata

218 This field returns the resource records of the queried resource.
219 When multiple resource records are returned, rdata MUST be a JSON
220 array containing JSON strings. When a single resource
221 record is returned, rdata MUST be a JSON string or a JSON array
222 containing one JSON string. Each resource record is represented as a
223 JSON [RFC4627] string. Each resource record MUST be escaped as
224 defined in section 2.5 of RFC4627 [RFC4627]. Depending on the
225 rrtype, this can be an IPv4 or IPv6 address, a domain name (as in the
226 case of CNAMEs), an SPF record, etc. A client MUST be able to
227 interpret any value which is legal as the right-hand side in a DNS
228 master file (RFC 1035 [RFC1035] and RFC 1034 [RFC1034]). If the rdata
229 came from an unknown DNS resource record type, the server must follow
230 the transparency principle as described in RFC 3597 [RFC3597].

232 3.3.4.
time_first

234 This field returns the first time that the unique tuple
235 (rrname, rrtype, rdata) has been seen by the passive DNS. The date
236 is expressed in seconds (decimal) since 1st of January 1970 (Unix
237 timestamp). The time zone MUST be UTC. This field is represented as
238 a JSON [RFC4627] number.

240 3.3.5. time_last

242 This field returns the last time that the unique tuple (rrname,
243 rrtype, rdata) has been seen by the passive DNS. The date is
244 expressed in seconds (decimal) since 1st of January 1970 (Unix
245 timestamp). The time zone MUST be UTC. This field is represented as
246 a JSON [RFC4627] number.

248 3.4. Optional Fields

250 Implementations SHOULD support one or more of the following fields.

252 3.4.1. count

254 Specifies how many authoritative DNS answers were received at the
255 Passive DNS Server's collectors with exactly the given set of values
256 as answers (i.e. same data in the answer set - compare with the
257 uniqueness property in "Mandatory Fields"). The count
258 is expressed as a decimal value. This field is represented as a JSON
259 [RFC4627] number.

261 3.4.2. bailiwick

263 The bailiwick is the best estimate of the apex of the zone where this
264 data is authoritative.

266 3.5. Additional Fields

268 Implementations MAY support the following fields:

270 3.5.1. sensor_id

272 This field returns the sensor information where the record was seen.
273 It is represented as a JSON [RFC4627] string.

275 If the data originate from sensors or probes which are part of a
276 publicly-known gathering or measurement system (e.g. RIPE Atlas), a
277 JSON [RFC4627] string SHOULD be prefixed.

279 3.5.2. zone_time_first

281 This field returns the first time that the unique tuple (rrname,
282 rrtype, rdata) has been seen via master file import. The date
283 is expressed in seconds (decimal) since 1st of January 1970 (Unix
284 timestamp). The time zone MUST be UTC. This field is represented as
285 a JSON [RFC4627] number.
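
The time fields in this format (time_first, time_last, zone_time_first and zone_time_last) are all Unix timestamps in UTC. A minimal Python sketch of how a client might convert them into timezone-aware values is shown here. The helper name with_utc_times is hypothetical and not part of this format, and the sample values are borrowed from Appendix A.

```python
from datetime import datetime, timezone

# Field names taken from this document; all hold seconds since 1970-01-01 UTC.
TIME_FIELDS = ("time_first", "time_last", "zone_time_first", "zone_time_last")

def with_utc_times(record):
    """Return a copy of record whose time fields are aware UTC datetimes."""
    converted = dict(record)
    for field in TIME_FIELDS:
        if field in converted:  # optional fields may be absent
            converted[field] = datetime.fromtimestamp(
                converted[field], tz=timezone.utc
            )
    return converted

record = {
    "rrname": "www.ietf.org",
    "rrtype": "A",
    "rdata": ["4.31.198.44"],
    "time_first": 1384865833,
    "time_last": 1389022219,
}
converted = with_utc_times(record)
print(converted["time_first"].isoformat())
print(converted["time_last"].isoformat())
```

Passing tz=timezone.utc explicitly avoids an accidental conversion to the local time zone of the client, which would contradict the requirement that the time zone be UTC.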
287 3.5.3. zone_time_last

289 This field returns the last time that the unique tuple (rrname,
290 rrtype, rdata) has been seen via master file import. The date
291 is expressed in seconds (decimal) since 1st of January 1970 (Unix
292 timestamp). The time zone MUST be UTC. This field is represented as
293 a JSON [RFC4627] number.

295 3.5.4. origin

297 Specifies the resource origin of the Passive DNS response. This
298 field is represented as a Uniform Resource Identifier [RFC3986]
299 (URI).

301 3.6. Additional Fields Registry

303 In accordance with [RFC6648], designers of new passive DNS
304 applications that need additional fields can request and
305 register new field names at
306 https://github.com/adulau/pdns-qof/wiki/Additional-Fields.

308 4. Acknowledgements

310 Thanks to the Passive DNS developers who contributed to the document.

312 5. IANA Considerations

314 This memo includes no request to IANA.

316 6. Privacy Considerations

318 Passive DNS Servers capture DNS answers from multiple collecting
319 points ("sensors") which are located on the Internet-facing side of
320 DNS recursors ("post-recursor passive DNS"). In this process, they
321 intentionally omit the source IP, source port, destination IP and
322 destination port from the captured packets. Because the data is
323 captured "post-recursor", the timing information (who queries what)
324 is lost, as the recursor caches the results. Furthermore,
325 since multiple sensors feed into a passive DNS server, the resulting
326 data gets mixed together, reducing the likelihood that Passive DNS
327 Servers are able to find out much about the actual person querying
328 the DNS records or who actually sent the query. In this sense,
329 operating a Passive DNS Server is similar to keeping an archive of all
330 previous phone books, if public DNS records can be compared to phone
331 numbers, as they often are.
Nevertheless, the authors strongly encourage
332 Passive DNS implementors to take special care of privacy issues.
333 bortzmeyer-dnsop-dns-privacy is an excellent starting point for this.
334 Finally, the overall recommendations in RFC6973 [RFC6973] should be
335 taken into consideration when designing any application which uses
336 Passive DNS data.

338 In the scope of the General Data Protection Regulation (GDPR,
339 Regulation (EU) 2016/679, which replaced Directive 95/46/EC), operators
340 of Passive DNS Servers need to ensure the legal ground and lawfulness
    of their operation.

342 7. Security Considerations

344 In some cases, Passive DNS output might contain confidential
345 information and its access might be restricted. When a user is
346 querying multiple Passive DNS servers and aggregating the data, the
347 sensitivity of the data must be considered.

349 8. References

351 8.1. Normative References

353 [RFC1034] Mockapetris, P., "Domain names - concepts and facilities",
354           STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987,
355           .

357 [RFC1035] Mockapetris, P., "Domain names - implementation and
358           specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
359           November 1987, .

361 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
362           Requirement Levels", BCP 14, RFC 2119,
363           DOI 10.17487/RFC2119, March 1997,
364           .

366 [RFC2234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
367           Specifications: ABNF", RFC 2234, DOI 10.17487/RFC2234,
368           November 1997, .

370 [RFC3597] Gustafsson, A., "Handling of Unknown DNS Resource Record
371           (RR) Types", RFC 3597, DOI 10.17487/RFC3597, September
372           2003, .

374 [RFC3912] Daigle, L., "WHOIS Protocol Specification", RFC 3912,
375           DOI 10.17487/RFC3912, September 2004,
376           .

378 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
379           Resource Identifier (URI): Generic Syntax", STD 66,
380           RFC 3986, DOI 10.17487/RFC3986, January 2005,
381           .
383 [RFC4627] Crockford, D., "The application/json Media Type for
384           JavaScript Object Notation (JSON)", RFC 4627,
385           DOI 10.17487/RFC4627, July 2006,
386           .

388 [RFC5001] Austein, R., "DNS Name Server Identifier (NSID) Option",
389           RFC 5001, DOI 10.17487/RFC5001, August 2007,
390           .

392 [RFC6648] Saint-Andre, P., Crocker, D., and M. Nottingham,
393           "Deprecating the "X-" Prefix and Similar Constructs in
394           Application Protocols", BCP 178, RFC 6648,
395           DOI 10.17487/RFC6648, June 2012,
396           .

398 [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
399           Morris, J., Hansen, M., and R. Smith, "Privacy
400           Considerations for Internet Protocols", RFC 6973,
401           DOI 10.17487/RFC6973, July 2013,
402           .

404 8.2. References

406 [BAILIWICK]
407           Edmonds, R., "Passive DNS Hardening", 2010,
408           .

411 [CACHEPOISONING]
412           Kaminsky, D., "Black ops 2008: It's the end of the cache
413           as we know it.", 2008,
414           .

416 [DNSDB]   Security, F., "DNSDB API", 2013,
417           .

419 [DNSDBQ]  Vixie, P., "DNSDB API Client, C Version", 2018,
420           .

422 [PDNSCERTAT]
423           CERT.at, "pDNS presentation at 4th Centr R&D workshop
424           Frankfurt Jun 5th 2012", 2012,
425           .

428 [PDNSCIRCL]
429           Luxembourg, C. -. I. R. C., "CIRCL Passive DNS", 2012,
430           .

432 [PDNSCLIENT]
433           Lee, C., "Queries 5 major Passive DNS databases: BFK,
434           CERTEE, DNSParse, ISC, and VirusTotal.", 2013,
435           .

437 [PDNSCOF] Dulaunoy, D. P. A., "Passive DNS server interface using
438           the common output format", 2013,
439           .

441 [REST]    Fielding, R. T., "Representational State Transfer (REST)",
442           2000, .

445 [WEIMERPDNS]
446           Weimer, F., "Passive DNS Replication", 2005,
447           .

450 8.3. Informative References

452 [I-D.narten-iana-considerations-rfc2434bis]
453           Narten, T. and H. Alvestrand, "Guidelines for Writing an
454           IANA Considerations Section in RFCs", draft-narten-iana-
455           considerations-rfc2434bis-09 (work in progress), March
456           2008.

458 [RFC3552] Rescorla, E. and B.
Korver, "Guidelines for Writing RFC
459           Text on Security Considerations", BCP 72, RFC 3552,
460           DOI 10.17487/RFC3552, July 2003,
461           .

463 Appendix A. Examples

465 The JSON output is shown on multiple lines for readability, but
466 each JSON object should be on a single line.

468 If you query a passive DNS for the rrname www.ietf.org, the Passive
469 DNS common output format can be:

471 {"count": 102, "time_first": 1298412391, "rrtype": "AAAA",
472  "rrname": "www.ietf.org", "rdata": "2001:1890:1112:1::20",
473  "time_last": 1302506851}
474 {"count": 59, "time_first": 1384865833, "rrtype": "A",
475  "rrname": "www.ietf.org", "rdata": "4.31.198.44",
476  "time_last": 1389022219}

478 If you query a passive DNS for the rrname ietf.org, the Passive DNS
479 common output format can be:

481 {"count": 109877, "time_first": 1298398002, "rrtype": "NS",
482  "rrname": "ietf.org", "rdata": "ns1.yyz1.afilias-nst.info",
483  "time_last": 1389095375}
484 {"count": 4, "time_first": 1298495035, "rrtype": "A",
485  "rrname": "ietf.org", "rdata": "64.170.98.32",
486  "time_last": 1298495035}
487 {"count": 9, "time_first": 1317037550, "rrtype": "AAAA",
488  "rrname": "ietf.org", "rdata": "2001:1890:123a::1:1e",
489  "time_last": 1330209752}

491 Please note that the examples imply that a single query returns a
492 single set of JSON objects. For example, two queries were made; one
493 query returned a set of two JSON objects and the other query returned
494 a set of three JSON objects. Each JSON object individually MUST
495 conform to the common output format, but
496 this specification does not require that a query return a set of
497 JSON objects.

499 Please note that in the examples above, any backslashes "\" can be
500 ignored and are an artifact of the tools which produced this
501 document.
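
To illustrate how a client might consume output such as the examples above, here is a minimal Python sketch, not a reference implementation. It reads one JSON object per line, checks that the five mandatory fields are present, and normalizes rdata to a list so that the single-string and array forms can be handled uniformly. The function name parse_cof and the choice to raise ValueError are assumptions made for this sketch; the sample data is the www.ietf.org example, with each object placed on a single line.

```python
import json

# Mandatory field names as listed in the "Mandatory Fields" section.
MANDATORY_FIELDS = ("rrname", "rrtype", "rdata", "time_first", "time_last")

def parse_cof(text):
    """Parse line-delimited common output format text into a list of dicts."""
    records = []
    # splitlines() accepts CR, LF and CRLF as line terminators.
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore empty lines between objects
        record = json.loads(line)
        missing = [name for name in MANDATORY_FIELDS if name not in record]
        if missing:
            raise ValueError("missing mandatory fields: " + ", ".join(missing))
        # Receivers must accept a single string; store it as a one-item list.
        if isinstance(record["rdata"], str):
            record["rdata"] = [record["rdata"]]
        records.append(record)
    return records

sample = (
    '{"count": 102, "time_first": 1298412391, "rrtype": "AAAA", '
    '"rrname": "www.ietf.org", "rdata": "2001:1890:1112:1::20", '
    '"time_last": 1302506851}\n'
    '{"count": 59, "time_first": 1384865833, "rrtype": "A", '
    '"rrname": "www.ietf.org", "rdata": "4.31.198.44", '
    '"time_last": 1389022219}\n'
)

records = parse_cof(sample)
for record in records:
    print(record["rrname"], record["rrtype"], record["rdata"])
```

Normalizing rdata at parse time keeps the rest of a client free of type checks, at the cost of no longer knowing which form the server actually sent.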
503 Authors' Addresses

505 Alexandre Dulaunoy
506 CIRCL
507 16, bd d'Avranches
508 Luxembourg L-1160
509 Luxembourg

511 Phone: (+352) 247 88444
512 Email: alexandre.dulaunoy@circl.lu
513 URI: http://www.circl.lu/

514 L. Aaron Kaplan
515 CERT.at
516 Karlsplatz 1/2/9
517 Vienna A-1010
518 Austria

520 Phone: +43 1 5056416 78
521 Email: kaplan@cert.at
522 URI: http://www.cert.at/

524 Paul Vixie
525 Farsight Security, Inc.
526 11400 La Honda Road
527 Woodside, California 94062
528 U.S.A.

530 Email: paul@redbarn.org
531 URI: https://www.farsightsecurity.com/

533 Henry Stern
534 Farsight Security, Inc.
535 11400 La Honda Road
536 Woodside, California 94062
537 U.S.A.

539 Phone: +1 650 542-7836
540 Email: henry@stern.ca
541 URI: https://www.farsightsecurity.com/