idnits 2.17.1

draft-ietf-dnsop-dns-capture-format-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 3, 2017) is 2489 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 1527
  -- Looks like a reference, but probably isn't: '2' on line 1530
  -- Looks like a reference, but probably isn't: '3' on line 1533
  -- Looks like a reference, but probably isn't: '4' on line 1536
  -- Looks like a reference, but probably isn't: '5' on line 1539
  -- Looks like a reference, but probably isn't: '6' on line 1542
  -- Looks like a reference, but probably isn't: '7' on line 1545
  -- Looks like a reference, but probably isn't: '8' on line 1547
  -- Looks like a reference, but probably isn't: '9' on line 1549
  -- Looks like a reference, but probably isn't: '10' on line 1551
  -- Looks like a reference, but probably isn't: '11' on line 1554
  -- Looks like a reference, but probably isn't: '12' on line 1953
  -- Looks like a reference, but probably isn't: '13' on line 1959
  -- Looks like a reference, but probably isn't: '14' on line 2002
  -- Looks like a reference, but probably isn't: '15' on line 2012
  -- Looks like a reference, but probably isn't: '16' on line 2025
  -- Looks like a reference, but probably isn't: '17' on line 2051
  -- Looks like a reference, but probably isn't: '18' on line 2052
  -- Looks like a reference, but probably isn't: '19' on line 2054
  -- Looks like a reference, but probably isn't: '20' on line 2057
  -- Looks like a reference, but probably isn't: '21' on line 2059
  -- Looks like a reference, but probably isn't: '22' on line 2061
  -- Looks like a reference, but probably isn't: '23' on line 2246
  -- Looks like a reference, but probably isn't: '24' on line 2248

  ** Obsolete normative reference: RFC 7049 (Obsoleted by RFC 8949)

  == Outdated reference: A later version (-11) exists of
     draft-greevenbosch-appsawg-cbor-cddl-10

  == Outdated reference: A later version (-16) exists of
     draft-hoffman-dns-in-json-12

  -- Obsolete informational reference (is this intentional?): RFC 7159
     (Obsoleted by RFC 8259)

     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 26 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    dnsop                                                       J. Dickinson
3    Internet-Draft                                                  J. Hague
4    Intended status: Standards Track                            S. Dickinson
5    Expires: January 4, 2018                                      Sinodun IT
6                                                                T. Manderson
7                                                                     J. Bond
8                                                                       ICANN
9                                                                July 3, 2017

11                      C-DNS: A DNS Packet Capture Format
12                    draft-ietf-dnsop-dns-capture-format-03

14   Abstract

16      This document describes a data representation for collections of DNS
17      messages.  The format is designed for efficient storage and
18      transmission of large packet captures of DNS traffic; it attempts to
19      minimize the size of such packet capture files but retain the full
20      DNS message contents along with the most useful transport metadata.
21      It is intended to assist with the development of DNS traffic
22      monitoring applications.
24   Status of This Memo

26      This Internet-Draft is submitted in full conformance with the
27      provisions of BCP 78 and BCP 79.

29      Internet-Drafts are working documents of the Internet Engineering
30      Task Force (IETF).  Note that other groups may also distribute
31      working documents as Internet-Drafts.  The list of current Internet-
32      Drafts is at http://datatracker.ietf.org/drafts/current/.

34      Internet-Drafts are draft documents valid for a maximum of six months
35      and may be updated, replaced, or obsoleted by other documents at any
36      time.  It is inappropriate to use Internet-Drafts as reference
37      material or to cite them other than as "work in progress."

39      This Internet-Draft will expire on January 4, 2018.

41   Copyright Notice

43      Copyright (c) 2017 IETF Trust and the persons identified as the
44      document authors.  All rights reserved.

46      This document is subject to BCP 78 and the IETF Trust's Legal
47      Provisions Relating to IETF Documents
48      (http://trustee.ietf.org/license-info) in effect on the date of
49      publication of this document.  Please review these documents
50      carefully, as they describe your rights and restrictions with respect
51      to this document.  Code Components extracted from this document must
52      include Simplified BSD License text as described in Section 4.e of
53      the Trust Legal Provisions and are provided without warranty as
54      described in the Simplified BSD License.

56   Table of Contents

58      1.  Introduction
59      2.  Terminology
60      3.  Data Collection Use Cases
61      4.  Design Considerations
62      5.  Conceptual Overview
63      6.  Choice of CBOR
64      7.  The C-DNS format
65        7.1.  CDDL definition
66        7.2.  Format overview
67        7.3.  File header contents
68        7.4.  File preamble contents
69        7.5.  Configuration contents
70        7.6.  Block contents
71        7.7.  Block preamble map
72        7.8.  Block statistics
73        7.9.  Block table map
74        7.10. IP address table
75        7.11. Class/Type table
76        7.12. Name/RDATA table
77        7.13. Query Signature table
78        7.14. Question table
79        7.15. Resource Record (RR) table
80        7.16. Question list table
81        7.17. Resource Record list table
82        7.18. Query/Response data
83        7.19. Address Event counts
84        7.20. Malformed packet records
85      8.  Malformed Packets
86      9.  C-DNS to PCAP
87        9.1.  Name Compression
88      10. Data Collection
89        10.1.  Matching algorithm
90        10.2.  Message identifiers
91          10.2.1.  Primary ID (required)
92          10.2.2.  Secondary ID (optional)
93        10.3.  Algorithm Parameters
94        10.4.  Algorithm Requirements
95        10.5.  Algorithm Limitations
96        10.6.  Workspace
97        10.7.  Output
98        10.8.  Post Processing
99      11. Implementation Status
100       11.1.  DNS-STATS Compactor
101     12. IANA Considerations
102     13. Security Considerations
103     14. Acknowledgements
104     15. Changelog
105     16. References
106       16.1.  Normative References
107       16.2.  Informative References
108       16.3.  URIs
109     Appendix A.  CDDL
110     Appendix B.  DNS Name compression example
111       B.1.  NSD compression algorithm
112       B.2.  Knot Authoritative compression algorithm
113       B.3.  Observed differences
114     Appendix C.  Comparison of Binary Formats
115       C.1.  Comparison with full PCAP files
116       C.2.  Simple versus block coding
117       C.3.  Binary versus text formats
118       C.4.  Performance
119       C.5.  Conclusions
120       C.6.  Block size choice
121     Authors' Addresses

123  1.  Introduction

125     There has long been a need to collect DNS queries and responses on
126     authoritative and recursive name servers for monitoring and analysis.
127     This data is used in a number of ways including traffic monitoring,
128     analyzing network attacks and "day in the life" (DITL) [ditl]
129     analysis.

131     A wide variety of tools already exist that facilitate the collection
132     of DNS traffic data, such as DSC [dsc], packetq [packetq], dnscap
133     [dnscap] and dnstap [dnstap].  However, there is no standard exchange
134     format for large DNS packet captures.  The PCAP [pcap] or PCAP-NG
135     [pcapng] formats are typically used in practice for packet captures,
136     but these file formats can contain a great deal of additional
137     information that is not directly pertinent to DNS traffic analysis
138     and thus unnecessarily increases the capture file size.

140     There has also been work on using text-based formats to describe DNS
141     packets, such as [I-D.daley-dnsxml] and [I-D.hoffman-dns-in-json], but
142     these are largely aimed at producing convenient representations of
143     single messages.

145     Many DNS operators may receive hundreds of thousands of queries per
146     second on a single name server instance, so a mechanism to minimize
147     the storage size (and therefore upload overhead) of the data
148     collected is highly desirable.

150     The format described in this document, C-DNS (Compacted-DNS),
151     focusses on the problem of capturing and storing large packet capture
152     files of DNS traffic, with the following goals in mind:

154     o  Minimize the file size for storage and transmission

156     o  Minimize the overhead of producing the packet capture file and
157        the cost of any further (general purpose) compression of the file

159     This document contains:

161     o  A discussion of some common use cases in which such DNS data
162        is collected (Section 3)

164     o  A discussion of the major design considerations in developing an
165        efficient data representation for collections of DNS messages
166        (Section 4)

168     o  A conceptual overview of the C-DNS format (Section 5)

170     o  A description of why CBOR [RFC7049] was chosen for this format
171        (Section 6)

173     o  The definition of the C-DNS format for the collection of DNS
174        messages (Section 7)

176     o  Notes on converting C-DNS data to PCAP format (Section 9)

178     o  Some high-level implementation considerations for applications
179        designed to produce C-DNS (Section 10)

181  2.  Terminology

183     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
184     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
185     document are to be interpreted as described in [RFC2119].

187     "Packet" refers to individual IPv4 or IPv6 packets.  Typically these
188     are UDP, but may be constructed from a TCP packet.  "Message", unless
189     otherwise qualified, refers to a DNS payload extracted from a UDP or
190     TCP data stream.

192     The parts of DNS messages are named as they are in [RFC1035].
193     Specifically, a DNS message has five sections: Header, Question,
194     Answer, Authority, and Additional.

196     Pairs of DNS messages are called a Query and a Response.

198  3.  Data Collection Use Cases

200     In an ideal world, it would be optimal to collect full packet
201     captures of all packets going in or out of a name server.  However,
202     there are several design choices or other limitations that are common
203     to many DNS installations and operators.
205     o  DNS servers are hosted in a variety of situations

207        *  Self-hosted servers

209        *  Third party hosting (including multiple third parties)

211        *  Third party hardware (including multiple third parties)

213     o  Data is collected under different conditions

215        *  On well-provisioned servers running in a steady state

217        *  On heavily loaded servers

219        *  On virtualized servers

221        *  On servers that are under DoS attack

223        *  On servers that are unwitting intermediaries in DoS attacks

225     o  Traffic can be collected via a variety of mechanisms

227        *  On the same hardware as the name server itself

229        *  Using a network tap on an adjacent host to listen to DNS
230           traffic

232        *  Using port mirroring to listen from another host

234     o  The capabilities of data collection (and upload) networks vary

236        *  Out-of-band networks with the same capacity as the in-band
237           network

239        *  Out-of-band networks with less capacity than the in-band
240           network

242        *  Everything being on the in-band network

244     Thus, there is a wide range of use cases from very limited data
245     collection environments (third party hardware, servers that are under
246     attack, packet capture on the name server itself and no out-of-band
247     network) to "limitless" environments (self-hosted, well-provisioned
248     servers, using a network tap or port mirroring with an out-of-band
249     network with the same capacity as the in-band network).  In the
250     former, it is infeasible to reliably collect full packet captures,
251     especially if the server is under attack.  In the latter case,
252     collection of full packet captures may be reasonable.
254     As a result of these restrictions, the C-DNS data format was designed
255     with the most limited use case in mind such that:

257     o  data collection will occur on the same hardware as the name server
258        itself

260     o  collected data will be stored on the same hardware as the name
261        server itself, at least temporarily

263     o  collected data being returned to some central analysis system will
264        use the same network interface as the DNS queries and responses

266     o  there can be multiple third party servers involved

268     Because of these considerations, a major factor in the design of the
269     format is minimal storage size of the capture files.

271     Another significant consideration for any application that records
272     DNS traffic is that the running of the name server software and the
273     transmission of DNS queries and responses are the most important jobs
274     of a name server; capturing data is not.  Any data collection system
275     co-located with the name server needs to be intelligent enough to
276     carefully manage its CPU, disk, memory and network utilization.  This
277     leads to designing a format that requires a relatively low overhead
278     to produce and minimizes the requirement for further potentially
279     costly compression.

281     However, it was also essential that interoperability with less
282     restricted infrastructure was maintained.  In particular, it is
283     highly desirable that the collection format should facilitate the re-
284     creation of common formats (such as PCAP) that are as close to the
285     original as is realistic given the restrictions above.

287  4.  Design Considerations

289     This section presents some of the major design considerations used in
290     the development of the C-DNS format.

292     1.  The basic unit of data is a combined DNS Query and the associated
293         Response (a "Q/R data item").  The same structure will be used
294         for unmatched Queries and Responses.  Queries without Responses
295         will be captured omitting the Response data.  Responses without
296         Queries will be captured omitting the Query data (but using the
297         Question section from the Response, if present, as an identifying
298         QNAME).

300         *  Rationale: A Query and Response represent the basic level of
301            a client's interaction with the server.  Also, combining the
302            Query and Response into one item often reduces storage
303            requirements due to commonality in the data of the two
304            messages.

306     2.  Each Q/R data item will comprise a default Q/R data description
307         and a set of optional sections.  Inclusion of optional sections
308         shall be configurable.

310         *  Rationale: Different users will have different requirements
311            for data to be available for analysis.  Users with minimal
312            requirements should not have to pay the cost of recording full
313            data; however, this will limit the ability to reconstruct
314            packet captures.  For example, omitting the resource records
315            from a Response will reduce the file size, and in principle
316            responses can be synthesized if there is enough context.

318     3.  Multiple Q/R data items will be collected into blocks in the
319         format.  Common data in a block will be abstracted and referenced
320         from individual Q/R data items by indexing.  The maximum number
321         of Q/R data items in a block will be configurable.

323         *  Rationale: This blocking and indexing provides a significant
324            reduction in the volume of file data generated.  Although this
325            introduces complexity, it provides compression of the data
326            that makes use of knowledge of the DNS message structure.

328         *  It is anticipated that the files produced can be subject to
329            further compression using general purpose compression tools.
330            Measurements show that blocking significantly reduces the CPU
331            required to perform such strong compression.  See
332            Appendix C.2.

334         *  [TODO: Further discussion of commonality between DNS messages
335            e.g. common query signatures, a finite set of valid responses
336            from authoritatives]

338     4.  Metadata about other packets received can optionally be included
339         in each block.  For example, counts of malformed DNS packets and
340         non-DNS packets (e.g.  ICMP, TCP resets) sent to the server may
341         be of interest.

343     5.  The wire format content of malformed DNS packets can optionally
344         be recorded.

346         *  Rationale: Any structured capture format that does not capture
347            the DNS payload byte for byte will be limited to some extent
348            in that it cannot represent "malformed" DNS packets (see
349            Section 8).  Only those packets that can be transformed
350            reasonably into the structured format can be represented by
351            the format.  However, this can result in rather misleading
352            statistics.  For example, a malformed query which cannot be
353            represented in the C-DNS format will lead to the (well formed)
354            DNS responses with error code FORMERR appearing as
355            'unmatched'.  Therefore it can greatly aid downstream analysis
356            to have the wire format of the malformed DNS packets available
357            directly in the C-DNS file.

359  5.  Conceptual Overview

361     The following figures show purely schematic representations of the
362     C-DNS format to convey its high-level structure.
363     Section 7 provides a detailed discussion of the CBOR representation
364     and individual elements.

366     Figure showing the C-DNS format (PNG) [1]

368     Figure showing the C-DNS format (SVG) [2]

370     Figure showing the Q/R data item and Block tables format (PNG) [3]

372     Figure showing the Q/R data item and Block tables format (SVG) [4]

374  6.  Choice of CBOR

376     This document presents a detailed format description using CBOR, the
377     Concise Binary Object Representation defined in [RFC7049].

379     The choice of CBOR was made taking a number of factors into account.

381     o  CBOR is a binary representation, and thus is economical in storage
382        space.

384     o  Other binary representations were investigated, and whilst all had
385        attractive features, none had a significant advantage over CBOR.
386        See Appendix C for some discussion of this.

388     o  CBOR is an IETF standard and familiar to IETF participants.  It is
389        based on the now-common ideas of lists and objects, and thus
390        requires very little familiarization for those in the wider
391        industry.

393     o  CBOR is a simple format, and can easily be implemented from
394        scratch if necessary.  More complex formats require library
395        support which may present problems on unusual platforms.

397     o  CBOR can also be easily converted to text formats such as JSON
398        ([RFC7159]) for debugging and other human inspection requirements.

400     o  CBOR data schemas can be described using CDDL
401        [I-D.greevenbosch-appsawg-cbor-cddl].

403  7.  The C-DNS format

405  7.1.  CDDL definition

407     The CDDL definition for the C-DNS format is given in Appendix A.

409  7.2.  Format overview

411     A C-DNS file begins with a file header containing a file type
412     identifier and a preamble.  The preamble contains information on the
413     collection settings.

415     The file header is followed by a series of data blocks.

417     A block consists of a block header, containing various tables of
418     common data, and some statistics for the traffic received over the
419     block.  The block header is then followed by a list of the Q/R data
420     items detailing the queries and responses received during processing
421     of the block input.  The list of Q/R data items is in turn followed
422     by a list of per-client counts of particular IP events that occurred
423     during collection of the block data.

425     The exact nature of the DNS data will affect what block size is the
426     best fit; however, sample data for a root server indicated that block
427     sizes up to 10,000 Q/R data items give good results.  See
428     Appendix C.6 for more details.

430     If no field type is specified, then the field is unsigned.

432     In all quantities that contain bit flags, bit 0 indicates the least
433     significant bit.
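The bit numbering convention can be illustrated with a minimal Python sketch. It is not part of the format definition: the function and constant names are hypothetical, and the bit assignments are taken from the transport-flags field described in Section 7.13.

```python
# Illustrative sketch only: these names are assumptions and are not
# defined by the C-DNS format. Bit positions follow the transport-flags
# field in Section 7.13.
TRANSPORT_TCP_BIT = 0       # bit 0: 0 = UDP, 1 = TCP
TRANSPORT_IPV6_BIT = 1      # bit 1: 0 = IPv4, 1 = IPv6
TRANSPORT_TRAILING_BIT = 2  # bit 2: trailing bytes followed the query


def bit_is_set(value: int, bit: int) -> bool:
    """Test one flag, counting bit 0 as the least significant bit."""
    return (value >> bit) & 1 == 1


def describe_transport(flags: int) -> dict:
    """Decode a transport-flags value into a readable dictionary."""
    return {
        "transport": "TCP" if bit_is_set(flags, TRANSPORT_TCP_BIT) else "UDP",
        "ip": "IPv6" if bit_is_set(flags, TRANSPORT_IPV6_BIT) else "IPv4",
        "trailing_bytes": bit_is_set(flags, TRANSPORT_TRAILING_BIT),
    }
```

Under these assumptions, a transport-flags value of 3 (binary 011) would decode as TCP over IPv6 with no trailing bytes.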
        An item described as an index is the index of the
434     Q/R data item in the referenced table.  Indexes are 1-based.  An
435     index value of 0 is reserved to mean "not present".

437  7.3.  File header contents

439     The file header contains the following:

441     +---------------+---------------+-----------------------------------+
442     | Field         | Type          | Description                       |
443     +---------------+---------------+-----------------------------------+
444     | file-type-id  | Text string   | String "C-DNS" identifying the    |
445     |               |               | file type.                        |
446     |               |               |                                   |
447     | file-preamble | Map of items  | Collection information for the    |
448     |               |               | whole file.                       |
449     |               |               |                                   |
450     | file-blocks   | Array of      | The data blocks.                  |
451     |               | Blocks        |                                   |
452     +---------------+---------------+-----------------------------------+

454  7.4.  File preamble contents

456     The file preamble contains the following:

458     +----------------------+----------+---------------------------------+
459     | Field                | Type     | Description                     |
460     +----------------------+----------+---------------------------------+
461     | major-format-version | Unsigned | Unsigned integer '1'. The major |
462     |                      |          | version of format used in file. |
463     |                      |          |                                 |
464     | minor-format-version | Unsigned | Unsigned integer '0'. The minor |
465     |                      |          | version of format used in file. |
466     |                      |          |                                 |
467     | private-version      | Unsigned | Version indicator available for |
468     |                      |          | private use by applications.    |
469     |                      |          | Optional.                       |
470     |                      |          |                                 |
471     | configuration        | Map of   | The collection configuration.   |
472     |                      | items    | Optional.                       |
473     |                      |          |                                 |
474     | generator-id         | Text     | String identifying the          |
475     |                      | string   | collection program. Optional.   |
476     |                      |          |                                 |
477     | host-id              | Text     | String identifying the          |
478     |                      | string   | collecting host. Empty if       |
479     |                      |          | converting an existing packet   |
480     |                      |          | capture file. Optional.         |
481     +----------------------+----------+---------------------------------+

483  7.5.  Configuration contents

485     The collection configuration contains the following items.  All are
486     optional.

488     +--------------------+----------+-----------------------------------+
489     | Field              | Type     | Description                       |
490     +--------------------+----------+-----------------------------------+
491     | query-timeout      | Unsigned | To be matched with a query, a     |
492     |                    |          | response must arrive within this  |
493     |                    |          | number of seconds.                |
494     |                    |          |                                   |
495     | skew-timeout       | Unsigned | The network stack may report a    |
496     |                    |          | response before the corresponding |
497     |                    |          | query. A response is not          |
498     |                    |          | considered to be missing a query  |
499     |                    |          | until after this many             |
500     |                    |          | microseconds.                     |
501     |                    |          |                                   |
502     | snaplen            | Unsigned | Collect up to this many bytes per |
503     |                    |          | packet.                           |
504     |                    |          |                                   |
505     | promisc            | Unsigned | 1 if promiscuous mode was enabled |
506     |                    |          | on the interface, 0 otherwise.    |
507     |                    |          |                                   |
508     | interfaces         | Array of | Identifiers of the interfaces     |
509     |                    | text     | used for collection.              |
510     |                    | strings  |                                   |
511     |                    |          |                                   |
512     | server-addresses   | Array of | Server collection IP addresses.   |
513     |                    | byte     | Hint for downstream analysers;    |
514     |                    | strings  | does not affect collection.       |
515     |                    |          |                                   |
516     | vlan-ids           | Array of | Identifiers of VLANs selected for |
517     |                    | unsigned | collection.                       |
518     |                    |          |                                   |
519     | filter             | Text     | 'tcpdump' [pcap] style filter for |
520     |                    | string   | input.                            |
521     |                    |          |                                   |
522     | query-options      | Unsigned | Bit flags indicating sections in  |
523     |                    |          | Query messages to be collected.   |
524     |                    |          | Bit 0. Collect second and         |
525     |                    |          | subsequent Questions in the       |
526     |                    |          | Question section.                 |
527     |                    |          | Bit 1. Collect Answer sections.   |
528     |                    |          | Bit 2. Collect Authority          |
529     |                    |          | sections.                         |
530     |                    |          | Bit 3. Collect Additional         |
531     |                    |          | sections.                         |
532     |                    |          |                                   |
533     | response-options   | Unsigned | Bit flags indicating sections in  |
534     |                    |          | Response messages to be           |
535     |                    |          | collected.                        |
536     |                    |          | Bit 0. Collect second and         |
537     |                    |          | subsequent Questions in the       |
538     |                    |          | Question section.                 |
539     |                    |          | Bit 1. Collect Answer sections.   |
540     |                    |          | Bit 2. Collect Authority          |
541     |                    |          | sections.                         |
542     |                    |          | Bit 3. Collect Additional         |
543     |                    |          | sections.                         |
544     |                    |          |                                   |
545     | accept-rr-types    | Array of | A set of RR type names [rrtypes]. |
546     |                    | text     | If not empty, only the nominated  |
547     |                    | strings  | RR types are collected.           |
548     |                    |          |                                   |
549     | ignore-rr-types    | Array of | A set of RR type names [rrtypes]. |
550     |                    | text     | If not empty, all RR types are    |
551     |                    | strings  | collected except those listed. If |
552     |                    |          | present, this item must be empty  |
553     |                    |          | if a non-empty list of Accept RR  |
554     |                    |          | types is present.                 |
555     |                    |          |                                   |
556     | max-block-qr-items | Unsigned | Maximum number of Q/R data items  |
557     |                    |          | in a block.                       |
558     |                    |          |                                   |
559     | collect-malformed  | Unsigned | 1 if malformed packet contents    |
560     |                    |          | are collected, 0 otherwise.       |
561     +--------------------+----------+-----------------------------------+

563  7.6.  Block contents

565     Each block contains the following:

567     +-----------------------+--------------+----------------------------+
568     | Field                 | Type         | Description                |
569     +-----------------------+--------------+----------------------------+
570     | preamble              | Map of items | Overall information for    |
571     |                       |              | the block.                 |
572     |                       |              |                            |
573     | statistics            | Map of       | Statistics about the       |
574     |                       | statistics   | block. Optional.           |
575     |                       |              |                            |
576     | tables                | Map of       | The tables containing data |
577     |                       | tables       | referenced by individual   |
578     |                       |              | Q/R data items.            |
579     |                       |              |                            |
580     | queries               | Array of Q/R | Details of individual Q/R  |
581     |                       | data items   | data items.                |
582     |                       |              |                            |
583     | address-event-counts  | Array of     | Per client counts of ICMP  |
584     |                       | Address      | messages and TCP resets.   |
585     |                       | Event counts | Optional.                  |
586     |                       |              |                            |
587     | malformed-packet-data | Array of     | Wire contents of malformed |
588     |                       | malformed    | packets. Optional.         |
589     |                       | packets      |                            |
590     +-----------------------+--------------+----------------------------+

592  7.7.  Block preamble map

594     The block preamble map contains overall information for the block.

596     +---------------+----------+----------------------------------------+
597     | Field         | Type     | Description                            |
598     +---------------+----------+----------------------------------------+
599     | earliest-time | Array of | A timestamp for the earliest record in |
600     |               | unsigned | the block. The timestamp is specified  |
601     |               |          | as a CBOR array with two or three      |
602     |               |          | elements. The first two elements are   |
603     |               |          | as in POSIX struct timeval. The first  |
604     |               |          | element is an unsigned integer time_t  |
605     |               |          | and the second is an unsigned integer  |
606     |               |          | number of microseconds. The third, if  |
607     |               |          | present, is an unsigned integer number |
608     |               |          | of picoseconds. The microsecond and    |
609     |               |          | picosecond items always have a value   |
610     |               |          | between 0 and 999,999.                 |
611     +---------------+----------+----------------------------------------+

613  7.8.  Block statistics

615     The block statistics section contains some basic statistical
616     information about the block.  All are optional.

618     +---------------------+----------+----------------------------------+
619     | Field               | Type     | Description                      |
620     +---------------------+----------+----------------------------------+
621     | total-packets       | Unsigned | Total number of packets          |
622     |                     |          | processed from the input traffic |
623     |                     |          | stream during collection of the  |
624     |                     |          | block data.                      |
625     | total-pairs         | Unsigned | Total number of Q/R data items   |
626     |                     |          | in the block.                    |
627     | unmatched-queries   | Unsigned | Number of unmatched queries in   |
628     |                     |          | the block.                       |
629     | unmatched-responses | Unsigned | Number of unmatched responses in |
630     |                     |          | the block.                       |
631     | malformed-packets   | Unsigned | Number of malformed packets      |
632     |                     |          | found in input for the block.    |
633     +---------------------+----------+----------------------------------+

635     Implementations may choose to add additional implementation-specific
636     fields to the statistics.

638  7.9.  Block table map

640     The block table map contains the block tables.  Each element, or
641     table, is an array.  The following tables detail the contents of each
642     block table.

644     The Present column in the following tables indicates the
645     circumstances when an optional field will be present.  A Q/R data
646     item may be:

648     o  A Query plus a Response.

650     o  A Query without a Response.

652     o  A Response without a Query.

654     Also:

656     o  A Query and/or a Response may contain an OPT section.

658     o  A Question may or may not be present.  If the Query is available,
659        the Question section of the Query is used.  If no Query is
660        available, the Question section of the Response is used.  Unless
661        otherwise noted, a Question refers to the first Question in the
662        Question section.

664     So, for example, a field listed with a Present value of QUERY is
665     present whenever the Q/R data item contains a Query.  If the pair
666     contains a Response only, the field will not be present.

668  7.10.  IP address table

670     The table "ip-address" holds all client and server IP addresses in
671     the block.  Each item in the table is a single IP address.

673     +------------+--------+---------------------------------------------+
674     | Field      | Type   | Description                                 |
675     +------------+--------+---------------------------------------------+
676     | ip-address | Byte   | The IP address, in network byte order. The  |
677     |            | string | string is 4 bytes long for an IPv4 address, |
678     |            |        | 16 bytes long for an IPv6 address.          |
679     +------------+--------+---------------------------------------------+

681  7.11.  Class/Type table

683     The table "classtype" holds pairs of RR CLASS and TYPE values.  Each
684     item in the table is a CBOR map.
686 +-------+--------------+ 687 | Field | Description | 688 +-------+--------------+ 689 | type | TYPE value. | 690 | | | 691 | class | CLASS value. | 692 +-------+--------------+ 694 7.12. Name/RDATA table 696 The table "name-rdata" holds the contents of all NAME or RDATA items 697 in the block. Each item in the table is the content of a single NAME 698 or RDATA. 700 +------------+--------+---------------------------------------------+ 701 | Field | Type | Description | 702 +------------+--------+---------------------------------------------+ 703 | name-rdata | Byte | The NAME or RDATA contents. NAMEs, and | 704 | | string | labels within RDATA contents, are in | 705 | | | uncompressed label format. | 706 +------------+--------+---------------------------------------------+ 708 7.13. Query Signature table 710 The table "query-sig" holds elements of the Q/R data item that are 711 often common between multiple individual Q/R data items. Each item 712 in the table is a CBOR map. Each item in the map has an unsigned 713 value and an unsigned key. 715 The following abbreviations are used in the Present (P) column 717 o Q = QUERY 719 o A = Always 721 o QT = QUESTION 723 o QO = QUERY, OPT 725 o QR = QUERY & RESPONSE 727 o R = RESPONSE 729 +-----------------------+----+--------------------------------------+ 730 | Field | P | Description | 731 +-----------------------+----+--------------------------------------+ 732 | server-address-index | A | The index in the IP address table of | 733 | | | the server IP address. | 734 | | | | 735 | server-port | A | The server port. | 736 | | | | 737 | transport-flags | A | Bit flags describing the transport | 738 | | | used to service the query. Bit 0 is | 739 | | | the least significant bit. | 740 | | | Bit 0. Transport type. 0 = UDP, 1 = | 741 | | | TCP. | 742 | | | Bit 1. IP type. 0 = IPv4, 1 = IPv6. | 743 | | | Bit 2. Trailing bytes in query | 744 | | | payload. 
The DNS query message in | 745 | | | the UDP payload was followed by some | 746 | | | additional bytes, which were | 747 | | | discarded. | 748 | | | | 749 | qr-sig-flags | A | Bit flags indicating information | 750 | | | present in this Q/R data item. Bit 0 | 751 | | | is the least significant bit. | 752 | | | Bit 0. 1 if a Query is present. | 753 | | | Bit 1. 1 if a Response is present. | 754 | | | Bit 2. 1 if one or more Question is | 755 | | | present. | 756 | | | Bit 3. 1 if a Query is present and | 757 | | | it has an OPT Resource Record. | 758 | | | Bit 4. 1 if a Response is present | 759 | | | and it has an OPT Resource Record. | 760 | | | Bit 5. 1 if a Response is present | 761 | | | but has no Question. | 762 | | | | 763 | query-opcode | Q | Query OPCODE. Optional. | 764 | | | | 765 | qr-dns-flags | A | Bit flags with values from the Query | 766 | | | and Response DNS flags. Bit 0 is the | 767 | | | least significant bit. Flag values | 768 | | | are 0 if the Query or Response is | 769 | | | not present. | 770 | | | Bit 0. Query Checking Disabled (CD). | 771 | | | Bit 1. Query Authenticated Data | 772 | | | (AD). | 773 | | | Bit 2. Query reserved (Z). | 774 | | | Bit 3. Query Recursion Available | 775 | | | (RA). | 776 | | | Bit 4. Query Recursion Desired (RD). | 777 | | | Bit 5. Query TrunCation (TC). | 778 | | | Bit 6. Query Authoritative Answer | 779 | | | (AA). | 780 | | | Bit 7. Query DNSSEC answer OK (DO). | 781 | | | Bit 8. Response Checking Disabled | 782 | | | (CD). | 783 | | | Bit 9. Response Authenticated Data | 784 | | | (AD). | 785 | | | Bit 10. Response reserved (Z). | 786 | | | Bit 11. Response Recursion Available | 787 | | | (RA). | 788 | | | Bit 12. Response Recursion Desired | 789 | | | (RD). | 790 | | | Bit 13. Response TrunCation (TC). | 791 | | | Bit 14. Response Authoritative | 792 | | | Answer (AA). | 793 | | | | 794 | query-rcode | Q | Query RCODE. 
If the Query contains | 795 | | | OPT, this value incorporates any | 796 | | | EXTENDED_RCODE_VALUE. Optional. | 797 | | | | 798 | query-classtype-index | QT | The index in the Class/Type table of | 799 | | | the CLASS and TYPE of the first | 800 | | | Question. Optional. | 801 | | | | 802 | query-qd-count | QT | The QDCOUNT in the Query, or | 803 | | | Response if no Query present. | 804 | | | Optional. | 805 | | | | 806 | query-an-count | Q | Query ANCOUNT. Optional. | 807 | | | | 808 | query-ar-count | Q | Query ARCOUNT. Optional. | 809 | | | | 810 | query-ns-count | Q | Query NSCOUNT. Optional. | 811 | | | | 812 | edns-version | QO | The Query EDNS version. Optional. | 813 | | | | 814 | udp-buf-size | QO | The Query EDNS sender's UDP payload | 815 | | | size. Optional. | 816 | | | | 817 | opt-rdata-index | QO | The index in the NAME/RDATA table of | 818 | | | the OPT RDATA. Optional. | 819 | | | | 820 | response-rcode | R | Response RCODE. If the Response | 821 | | | contains OPT, this value | 822 | | | incorporates any | 823 | | | EXTENDED_RCODE_VALUE. Optional. | 824 +-----------------------+----+--------------------------------------+ 826 7.14. Question table 828 The table "qrr" holds details on individual Questions in a Question 829 section. Each item in the table is a CBOR map containing a single 830 Question. Each item in the map has an unsigned value and an unsigned 831 key. This data is optionally collected. 833 +-----------------+-------------------------------------------------+ 834 | Field | Description | 835 +-----------------+-------------------------------------------------+ 836 | name-index | The index in the NAME/RDATA table of the QNAME. | 837 | | | 838 | classtype-index | The index in the Class/Type table of the CLASS | 839 | | and TYPE of the Question. | 840 +-----------------+-------------------------------------------------+ 842 7.15. 
Resource Record (RR) table

   The table "rr" holds details on individual Resource Records in RR
   sections. Each item in the table is a CBOR map containing a single
   Resource Record. This data is optionally collected.

   +-----------------+-------------------------------------------------+
   | Field           | Description                                     |
   +-----------------+-------------------------------------------------+
   | name-index      | The index in the NAME/RDATA table of the NAME.  |
   |                 |                                                 |
   | classtype-index | The index in the Class/Type table of the CLASS  |
   |                 | and TYPE of the RR.                             |
   |                 |                                                 |
   | ttl             | The RR Time to Live.                            |
   |                 |                                                 |
   | rdata-index     | The index in the NAME/RDATA table of the RR     |
   |                 | RDATA.                                          |
   +-----------------+-------------------------------------------------+

7.16.  Question list table

   The table "qlist" holds a list of second and subsequent individual
   Questions in a Question section. Each item in the table is a CBOR
   unsigned integer. This data is optionally collected.

   +----------+--------------------------------------------------------+
   | Field    | Description                                            |
   +----------+--------------------------------------------------------+
   | question | The index in the Question table of the individual      |
   |          | Question.                                              |
   +----------+--------------------------------------------------------+

7.17.  Resource Record list table

   The table "rrlist" holds a list of individual Resource Records in an
   Answer, Authority or Additional section. Each item in the table is a
   CBOR unsigned integer. This data is optionally collected.

   +-------+-----------------------------------------------------------+
   | Field | Description                                               |
   +-------+-----------------------------------------------------------+
   | rr    | The index in the Resource Record table of the individual  |
   |       | Resource Record.                                          |
   +-------+-----------------------------------------------------------+

7.18.
Query/Response data

   The block Q/R data is a CBOR array of individual Q/R data items.
   Each item in the array is a CBOR map containing details on the
   individual Q/R data item.

   Note that there is no requirement that the elements of the Q/R array
   are presented in strict chronological order.

   The following abbreviations are used in the Present (P) column:

   o  Q = QUERY

   o  A = Always

   o  QT = QUESTION

   o  QO = QUERY, OPT

   o  QR = QUERY & RESPONSE

   o  R = RESPONSE

   Each item in the map has an unsigned value (with the exception of
   those listed below) and an unsigned key.

   o  query-extended and response-extended, which are of type Extended
      Information.

   o  delay-useconds and delay-pseconds, which are integers. (The delay
      can be negative if the network stack/capture library returns
      packets out of order.)

   +-----------------------+----+--------------------------------------+
   | Field                 | P  | Description                          |
   +-----------------------+----+--------------------------------------+
   | time-useconds         | A  | Q/R timestamp as an offset in        |
   |                       |    | microseconds from the Block preamble |
   |                       |    | Timestamp. The timestamp is the      |
   |                       |    | timestamp of the Query, or the       |
   |                       |    | Response if there is no Query.       |
   |                       |    |                                      |
   | time-pseconds         | A  | Picosecond component of the          |
   |                       |    | timestamp. Optional.                 |
   |                       |    |                                      |
   | client-address-index  | A  | The index in the IP address table of |
   |                       |    | the client IP address.               |
   |                       |    |                                      |
   | client-port           | A  | The client port.                     |
   |                       |    |                                      |
   | transaction-id        | A  | DNS transaction identifier.          |
   |                       |    |                                      |
   | query-signature-index | A  | The index of the Query Signature     |
   |                       |    | table record for this data item.     |
   |                       |    |                                      |
   | client-hoplimit       | Q  | The IPv4 TTL or IPv6 Hoplimit from   |
   |                       |    | the Query packet. Optional.          |
   |                       |    |                                      |
   | delay-useconds        | QR | The time difference between Query    |
   |                       |    | and Response, in microseconds.
Only |
   |                       |    | present if there is a query and a    |
   |                       |    | response.                            |
   |                       |    |                                      |
   | delay-pseconds        | QR | Picosecond component of the time     |
   |                       |    | difference between Query and         |
   |                       |    | Response. If delay-useconds is       |
   |                       |    | non-zero then delay-pseconds (if     |
   |                       |    | present) MUST be of the same sign as |
   |                       |    | delay-useconds, or be 0. Optional.   |
   |                       |    |                                      |
   | query-name-index      | QT | The index in the NAME/RDATA table of |
   |                       |    | the QNAME for the first Question.    |
   |                       |    | Optional.                            |
   |                       |    |                                      |
   | query-size            | R  | DNS query message size (see below).  |
   |                       |    | Optional.                            |
   |                       |    |                                      |
   | response-size         | R  | DNS response message size (see       |
   |                       |    | below). Optional.                    |
   |                       |    |                                      |
   | query-extended        | Q  | Extended Query information. This     |
   |                       |    | item is only present if collection   |
   |                       |    | of extra Query information is        |
   |                       |    | configured. Optional.                |
   |                       |    |                                      |
   | response-extended     | R  | Extended Response information. This  |
   |                       |    | item is only present if collection   |
   |                       |    | of extra Response information is     |
   |                       |    | configured. Optional.                |
   +-----------------------+----+--------------------------------------+

   An implementation must always collect basic Q/R information. It may
   be configured to collect details on Question, Answer, Authority and
   Additional sections of the Query, the Response or both. Note that
   only the second and subsequent Questions of any Question section are
   collected (the details of the first are in the basic information),
   and that OPT Records are not collected in the Additional section.

   The query-size and response-size fields hold the DNS message size.
   For UDP this is the size of the UDP payload that contained the DNS
   message and will therefore include any trailing bytes if present.
   Trailing bytes with queries are routinely observed in traffic to
   authoritative servers and this value allows a calculation of how many
   trailing bytes were present. For TCP it is the size of the DNS
   message as specified in the two-byte message length header.

   The Extended information is a CBOR map as follows. Each item in the
   map is present only if collection of the relevant details is
   configured. Each item in the map has an unsigned value and an
   unsigned key.

   +------------------+------------------------------------------------+
   | Field            | Description                                    |
   +------------------+------------------------------------------------+
   | question-index   | The index in the Question list table of the    |
   |                  | entry listing any second and subsequent        |
   |                  | Questions in the Question section for the      |
   |                  | Query or Response.                             |
   |                  |                                                |
   | answer-index     | The index in the RR list table of the entry    |
   |                  | listing the Answer Resource Record sections    |
   |                  | for the Query or Response.                     |
   |                  |                                                |
   | authority-index  | The index in the RR list table of the entry    |
   |                  | listing the Authority Resource Record sections |
   |                  | for the Query or Response.                     |
   |                  |                                                |
   | additional-index | The index in the RR list table of the entry    |
   |                  | listing the Additional Resource Record         |
   |                  | sections for the Query or Response.            |
   +------------------+------------------------------------------------+

7.19.  Address Event counts

   This table holds counts of various IP related events relating to
   traffic with individual client addresses.

   +------------------+----------+-------------------------------------+
   | Field            | Type     | Description                         |
   +------------------+----------+-------------------------------------+
   | ae-type          | Unsigned | The type of event. The following    |
   |                  |          | event types are currently defined:  |
   |                  |          | 0. TCP reset.                       |
   |                  |          | 1. ICMP time exceeded.
|
   |                  |          | 2. ICMP destination unreachable.    |
   |                  |          | 3. ICMPv6 time exceeded.            |
   |                  |          | 4. ICMPv6 destination unreachable.  |
   |                  |          | 5. ICMPv6 packet too big.           |
   |                  |          |                                     |
   | ae-code          | Unsigned | A code relating to the event.       |
   |                  |          | Optional.                           |
   |                  |          |                                     |
   | ae-address-index | Unsigned | The index in the IP address table   |
   |                  |          | of the client address.              |
   |                  |          |                                     |
   | ae-count         | Unsigned | The number of occurrences of this   |
   |                  |          | event during the block collection   |
   |                  |          | period.                             |
   +------------------+----------+-------------------------------------+

7.20.  Malformed packet records

   This optional table records the original wire format content of
   malformed packets (see Section 8).

   +----------------+--------+-----------------------------------------+
   | Field          | Type   | Description                             |
   +----------------+--------+-----------------------------------------+
   | time-useconds  | A      | Packet timestamp as an offset in        |
   |                |        | microseconds from the Block preamble    |
   |                |        | Timestamp.                              |
   |                |        |                                         |
   | time-pseconds  | A      | Picosecond component of the timestamp.  |
   |                |        | Optional.                               |
   |                |        |                                         |
   | packet-content | Byte   | The packet content in wire format.      |
   |                | string |                                         |
   +----------------+--------+-----------------------------------------+

8.  Malformed Packets

   In the context of generating a C-DNS file it is assumed that only
   those packets which can be parsed to produce a well-formed DNS
   message are stored in the C-DNS format.
   This means as a minimum:

   o  The packet has a well-formed 12-byte DNS Header

   o  The section counts are consistent with the section contents

   o  All of the resource records can be parsed

   In principle, packets that do not meet these criteria could be
   classified into two categories:

   o  Partially malformed: those packets which can be decoded
      sufficiently to extract

      *  a DNS header (and therefore a DNS transaction ID)

      *  a QDCOUNT

      *  the first Question in the Question section if QDCOUNT is
         greater than 0

      but suffer other issues while parsing. This is the minimum
      information required to attempt Query/Response matching as
      described in Section 10.1.

   o  Completely malformed: those packets that cannot be decoded to this
      extent.

   An open question is whether there is value in attempting to process
   partially malformed packets in an analogous manner to well-formed
   packets in terms of attempting to match them with the corresponding
   query or response. This could be done by creating 'placeholder'
   records during Query/Response matching with just the information
   extracted as above. If the packet were then matched the resulting
   C-DNS Q/R data item would include a flag to indicate a malformed
   record (in addition to capturing the wire format of the packet).

   An advantage of this would be that it would result in more meaningful
   statistics about matched packets because, for example, some partially
   malformed queries could be matched to responses. However, it would
   only apply to those queries where the first Question is well formed.
   It could also simplify the downstream analysis of C-DNS files and the
   reconstruction of packet streams from C-DNS.

   A disadvantage is that this adds complexity to the Query/Response
   matching and data representation, could potentially lead to false
   matches and some additional statistics would be required (e.g.
   counts for matched-partially-malformed, unmatched-partially-malformed,
   completely-malformed).

9.  C-DNS to PCAP

   It is possible to reconstruct PCAP files from the C-DNS format in a
   lossy fashion. Some of the issues with reconstructing both the DNS
   payload and the full packet stream are outlined here.

   The reconstruction depends on whether or not all the optional
   sections of both the query and response were captured in the C-DNS
   file. Clearly, if they were not all captured, the reconstruction
   will be imperfect.

   Even if all sections of the response were captured, one cannot
   reconstruct the DNS response payload exactly due to the fact that
   some DNS names in the message on the wire may have been compressed.
   Section 9.1 discusses this in more detail.

   Some transport information is not captured in the C-DNS format. For
   example, the following aspects of the original packet stream cannot
   be reconstructed from the C-DNS format:

   o  IP fragmentation

   o  TCP stream information:

      *  Multiple DNS messages may have been sent in a single TCP
         segment

      *  A DNS payload may have been split across multiple TCP segments

      *  Multiple DNS messages may have been sent on a single TCP
         session

   o  Malformed DNS messages if the wire format is not recorded

   o  Any non-DNS messages that were in the original packet stream, e.g.
      ICMP

   Simple assumptions can be made on the reconstruction: fragmented and
   DNS-over-TCP messages can be reconstructed into single packets and a
   single TCP session can be constructed for each TCP packet.

   Additionally, if malformed packets and non-DNS packets are captured
   separately, they can be merged with packet captures reconstructed
   from C-DNS to produce a more complete packet stream.

9.1.
Name Compression

   All the names stored in the C-DNS format are full domain names; no
   DNS-style name compression is used on the individual names within the
   format. Therefore when reconstructing a packet, name compression
   must be used in order to reproduce the on-the-wire representation of
   the packet.

   [RFC1035] name compression works by substituting trailing sections of
   a name with a reference back to the occurrence of those sections
   earlier in the message. Not all name server software uses the same
   algorithm when compressing domain names within the responses. Some
   attempt maximum recompression at the expense of runtime resources,
   others use heuristics to balance compression and speed and others use
   different rules for what is a valid compression target.

   This means that responses to the same question from different name
   server software which match in terms of DNS payload content (header,
   counts, RRs with name compression removed) do not necessarily match
   byte-for-byte on the wire.

   Therefore, it is not possible to ensure that the DNS response payload
   is reconstructed byte-for-byte from C-DNS data. However, it can at
   least, in principle, be reconstructed to have the correct payload
   length (since the original response length is captured) if there is
   enough knowledge of the commonly implemented name compression
   algorithms. For example, a simplistic approach would be to try each
   algorithm in turn to see if it reproduces the original length,
   stopping at the first match. This would not guarantee the correct
   algorithm has been used as it is possible to match the length whilst
   still not matching the on-the-wire bytes but, without further
   information added to the C-DNS data, this is the best that can be
   achieved.

   Appendix B presents an example of two different compression
   algorithms used by well-known name server software.
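
   As a rough, non-normative illustration of the length-matching
   approach described in Section 9.1, the following Python sketch tries
   a small set of compression strategies in turn and returns the first
   one that reproduces a recorded payload length. Everything in it is
   an assumption made for illustration: the three strategies ("none",
   "whole", "suffix") are hypothetical and are not the algorithms of any
   particular name server, the function names are invented here, and a
   message is modelled simply as a list in which strings are names to be
   encoded and byte strings are opaque fixed-size fields.

```python
from typing import Dict, List, Optional, Tuple, Union

# Model of a message: str items are domain names to encode, bytes items are
# opaque fixed-size fields (header, TYPE/CLASS/TTL/RDLENGTH and so on).
Item = Union[str, bytes]
Targets = Dict[Tuple[str, ...], int]

# Hypothetical strategies for illustration only; they are NOT the algorithms
# of any particular name server implementation.
STRATEGIES = ("none", "whole", "suffix")


def encode_name(name: str, offset: int, targets: Targets, strategy: str) -> bytes:
    """Encode one name starting at message position `offset`."""
    labels = [label for label in name.split(".") if label]
    out = bytearray()
    for i, label in enumerate(labels):
        suffix = tuple(part.lower() for part in labels[i:])
        may_point = strategy == "suffix" or (strategy == "whole" and i == 0)
        if may_point and suffix in targets:
            # RFC 1035 pointer: two bytes, top two bits set, 14-bit offset.
            out += (0xC000 | targets[suffix]).to_bytes(2, "big")
            return bytes(out)
        position = offset + len(out)
        if position < 0x4000:
            targets.setdefault(suffix, position)  # remember first occurrence
        raw = label.encode("ascii")
        out.append(len(raw))
        out += raw
    out.append(0)  # root label terminates an uncompressed name
    return bytes(out)


def build(items: List[Item], strategy: str) -> bytes:
    """Serialise the modelled message using one compression strategy."""
    message = bytearray()
    targets: Targets = {}
    for item in items:
        if isinstance(item, str):
            message += encode_name(item, len(message), targets, strategy)
        else:
            message += item
    return bytes(message)


def guess_strategy(items: List[Item], original_length: int) -> Optional[str]:
    """Return the first strategy whose output length matches, else None."""
    for strategy in STRATEGIES:
        if len(build(items, strategy)) == original_length:
            return strategy
    return None


EXAMPLE: List[Item] = [
    bytes(12), "www.example.com", bytes(4),  # header, QNAME, QTYPE/QCLASS
    "www.example.com", bytes(10),            # owner name, fixed RR fields
    "host.example.com",                      # CNAME RDATA
]
print(guess_strategy(EXAMPLE, 52))  # prints: suffix
```

   In this toy message the three strategies yield lengths of 78, 63 and
   52 bytes respectively, so a recorded length of 52 selects "suffix".
   As the text above notes, a length match does not prove that the
   original bytes have been reproduced; two strategies can produce the
   same length with different pointer placement.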

10.  Data Collection

   This section describes a non-normative proposed algorithm for the
   processing of a captured stream of DNS queries and responses and
   matching queries/responses where possible.

   For the purposes of this discussion, it is assumed that the input has
   been pre-processed such that:

   1.  All IP fragmentation reassembly, TCP stream reassembly, and so
       on, has already been performed

   2.  Each message is associated with transport metadata required to
       generate the Primary ID (see Section 10.2.1)

   3.  Each message has a well-formed DNS header of 12 bytes and (if
       present) the first Question in the Question section can be parsed
       to generate the Secondary ID (see below). As noted earlier, this
       requirement can result in a malformed query being removed in the
       pre-processing stage, but the correctly formed response with
       RCODE of FORMERR being present.

   DNS messages are processed in the order they are delivered to the
   application. It should be noted that packet capture libraries do not
   necessarily provide packets in strict chronological order.

   TODO: Discuss the corner cases resulting from this in more detail.

10.1.  Matching algorithm

   A schematic representation of the algorithm for matching Q/R data
   items is shown in the following diagram:

   Figure showing the Query/Response matching algorithm format (PNG) [5]

   Figure showing the Query/Response matching algorithm format (SVG) [6]

   Further details of the algorithm are given in the following sections.

10.2.  Message identifiers

10.2.1.  Primary ID (required)

   A Primary ID is constructed for each message. It is composed of the
   following data:

   1.  Source IP Address

   2.  Destination IP Address

   3.  Source Port

   4.  Destination Port

   5.  Transport

   6.  DNS Message ID

10.2.2.
Secondary ID (optional)

   If present, the first Question in the Question section is used as a
   secondary ID for each message. Note that there may be well-formed
   DNS queries that have a QDCOUNT of 0, and some responses may have a
   QDCOUNT of 0 (for example, responses with RCODE=FORMERR or NOTIMP).
   In this case the secondary ID is not used in matching.

10.3.  Algorithm Parameters

   1.  Query timeout

   2.  Skew timeout

10.4.  Algorithm Requirements

   The algorithm is designed to handle the following input data:

   1.  Multiple queries with the same Primary ID (but different
       Secondary ID) arriving before any responses for these queries are
       seen.

   2.  Multiple queries with the same Primary and Secondary ID arriving
       before any responses for these queries are seen.

   3.  Queries for which no later response can be found within the
       specified timeout.

   4.  Responses for which no previous query can be found within the
       specified timeout.

10.5.  Algorithm Limitations

   For cases 1 and 2 listed in the above requirements, it is not
   possible to unambiguously match queries with responses. This
   algorithm chooses to match to the earliest query with the correct
   Primary and Secondary ID.

10.6.  Workspace

   A FIFO structure is used to hold the Q/R data items during
   processing.

10.7.  Output

   The output is a list of Q/R data items. Both the Query and Response
   elements are optional in these items, therefore Q/R data items have
   one of three types of content:

   1.  A matched pair of query and response messages

   2.  A query message with no response

   3.  A response message with no query

   The timestamp of a list item is that of the query for cases 1 and 2
   and that of the response for case 3.

10.8.  Post Processing

   When ending capture, all remaining entries in the Q/R data item FIFO
   should be treated as timed out queries.

11.
Implementation Status

   [Note to RFC Editor: please remove this section and reference to
   [RFC7942] prior to publication.]

   This section records the status of known implementations of the
   protocol defined by this specification at the time of posting of this
   Internet-Draft, and is based on a proposal described in [RFC7942].
   The description of implementations in this section is intended to
   assist the IETF in its decision processes in progressing drafts to
   RFCs. Please note that the listing of any individual implementation
   here does not imply endorsement by the IETF. Furthermore, no effort
   has been spent to verify the information presented here that was
   supplied by IETF contributors. This is not intended as, and must not
   be construed to be, a catalog of available implementations or their
   features. Readers are advised to note that other implementations may
   exist.

   According to [RFC7942], "this will allow reviewers and working groups
   to assign due consideration to documents that have the benefit of
   running code, which may serve as evidence of valuable experimentation
   and feedback that have made the implemented protocols more mature.
   It is up to the individual working groups to use this information as
   they see fit".

11.1.  DNS-STATS Compactor

   ICANN/Sinodun IT have developed an open source implementation called
   DNS-STATS Compactor. The Compactor is a suite of tools which can
   capture DNS traffic (from either a network interface or a PCAP file)
   and store it in the Compacted-DNS (C-DNS) file format. PCAP files
   for the captured traffic can also be reconstructed. See Compactor
   [7].

   This implementation:

   o  is mature but has only been deployed for testing in a single
      environment so is not yet classified as production ready.

   o  covers the whole of the specification described in the -03 draft
      with the exception of support for malformed packets (Section 8)
      and picosecond time resolution. (Note: this implementation does
      allow malformed packets to be dumped to a PCAP file.)

   o  is released under the Mozilla Public License Version 2.0.

   o  has a users mailing list available, see dns-stats-users [8].

   There is also some discussion of issues encountered during
   development available at Compressing Pcap Files [9] and Packet
   Capture [10].

   This information was last updated on 29th of June 2017.

12.  IANA Considerations

   None

13.  Security Considerations

   Any control interface MUST perform authentication and encryption.

   Any data upload MUST be authenticated and encrypted.

14.  Acknowledgements

   The authors wish to thank CZ.NIC, in particular Tomas Gavenciak, for
   many useful discussions on binary formats, compression and packet
   matching. Also Jan Vcelak and Wouter Wijngaards for discussions on
   name compression and Paul Hoffman for a detailed review of the
   document and the C-DNS CDDL.

   Thanks also to Robert Edmonds and Jerry Lundstroem for review.

   Also, Miek Gieben for mmark [11].

15.
Changelog 1400 draft-ietf-dnsop-dns-capture-format-03 1402 o Added an Implementation Status section 1404 draft-ietf-dnsop-dns-capture-format-02 1406 o Update qr_data_format.png to match CDDL 1408 o Editorial clarifications and improvements 1410 draft-ietf-dnsop-dns-capture-format-01 1412 o Many editorial improvements by Paul Hoffman 1414 o Included discussion of malformed packet handling 1416 o Improved Appendix C on Comparison of Binary Formats 1418 o Now using C-DNS field names in the tables in section 8 1420 o A handful of new fields included (CDDL updated) 1422 o Timestamps now include optional picoseconds 1424 o Added details of block statistics 1426 draft-ietf-dnsop-dns-capture-format-00 1428 o Changed dnstap.io to dnstap.info 1430 o qr_data_format.png was cut off at the bottom 1432 o Update authors address 1434 o Improve wording in Abstract 1436 o Changed DNS-STAT to C-DNS in CDDL 1438 o Set the format version in the CDDL 1440 o Added a TODO: Add block statistics 1442 o Added a TODO: Add extend to support pico/nano. Also do this for 1443 Time offset and Response delay 1445 o Added a TODO: Need to develop optional representation of malformed 1446 packets within C-DNS and what this means for packet matching. 1447 This may influence which fields are optional in the rest of the 1448 representation. 1450 o Added section on design goals to Introduction 1452 o Added a TODO: Can Class be optimised? Should a class of IN be 1453 inferred if not present? 1455 draft-dickinson-dnsop-dns-capture-format-00 1457 o Initial commit 1459 16. References 1461 16.1. Normative References 1463 [RFC1035] Mockapetris, P., "Domain names - implementation and 1464 specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, 1465 November 1987, . 1467 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1468 Requirement Levels", BCP 14, RFC 2119, 1469 DOI 10.17487/RFC2119, March 1997, 1470 . 1472 [RFC7049] Bormann, C. and P. 
Hoffman, "Concise Binary Object 1473 Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049, 1474 October 2013, . 1476 16.2. Informative References 1478 [ditl] DNS-OARC, "DITL", 2016, . 1481 [dnscap] DNS-OARC, "DNSCAP", 2016, . 1484 [dnstap] dnstap.info, "dnstap", 2016, . 1486 [dsc] Wessels, D. and J. Lundstrom, "DSC", 2016, 1487 . 1489 [I-D.daley-dnsxml] 1490 Daley, J., Morris, S., and J. Dickinson, "dnsxml - A 1491 standard XML representation of DNS data", draft-daley- 1492 dnsxml-00 (work in progress), July 2013. 1494 [I-D.greevenbosch-appsawg-cbor-cddl] 1495 Birkholz, H., Vigano, C., and C. Bormann, "CBOR data 1496 definition language (CDDL): a notational convention to 1497 express CBOR data structures", draft-greevenbosch-appsawg- 1498 cbor-cddl-10 (work in progress), March 2017. 1500 [I-D.hoffman-dns-in-json] 1501 Hoffman, P., "Representing DNS Messages in JSON", draft- 1502 hoffman-dns-in-json-12 (work in progress), May 2017. 1504 [packetq] .SE - The Internet Infrastructure Foundation, "PacketQ", 1505 2014, . 1507 [pcap] tcpdump.org, "PCAP", 2016, . 1509 [pcapng] Tuexen, M., Risso, F., Bongertz, J., Combs, G., and G. 1510 Harris, "pcap-ng", 2016, . 1513 [RFC7159] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data 1514 Interchange Format", RFC 7159, DOI 10.17487/RFC7159, March 1515 2014, . 1517 [RFC7942] Sheffer, Y. and A. Farrel, "Improving Awareness of Running 1518 Code: The Implementation Status Section", BCP 205, 1519 RFC 7942, DOI 10.17487/RFC7942, July 2016, 1520 . 1522 [rrtypes] IANA, "RR types", 2016, . 1525 16.3. 
URIs 1527 [1] https://github.com/dns-stats/draft-dns-capture- 1528 format/blob/master/draft-03/cdns_format.png 1530 [2] https://github.com/dns-stats/draft-dns-capture- 1531 format/blob/master/draft-03/cdns_format.svg 1533 [3] https://github.com/dns-stats/draft-dns-capture- 1534 format/blob/master/draft-03/qr_data_format.png 1536 [4] https://github.com/dns-stats/draft-dns-capture- 1537 format/blob/master/draft-03/qr_data_format.svg 1539 [5] https://github.com/dns-stats/draft-dns-capture- 1540 format/blob/master/draft-03/packet_matching.png 1542 [6] https://github.com/dns-stats/draft-dns-capture- 1543 format/blob/master/draft-03/packet_matching.svg 1545 [7] https://github.com/dns-stats/compactor/wiki 1547 [8] https://mm.dns-stats.org/mailman/listinfo/dns-stats-users 1549 [9] https://www.sinodun.com/2017/06/compressing-pcap-files/ 1551 [10] https://www.sinodun.com/2017/06/more-on-debian-jessieubuntu- 1552 trusty-packet-capture-woes/ 1554 [11] https://github.com/miekg/mmark 1556 [12] https://www.nlnetlabs.nl/projects/nsd/ 1558 [13] https://www.knot-dns.cz/ 1560 [14] https://avro.apache.org/ 1562 [15] https://developers.google.com/protocol-buffers/ 1564 [16] http://cbor.io 1566 [17] https://github.com/kubo/snzip 1568 [18] http://google.github.io/snappy/ 1570 [19] http://lz4.github.io/lz4/ 1572 [20] http://www.gzip.org/ 1574 [21] http://facebook.github.io/zstd/ 1576 [22] http://tukaani.org/xz/ 1578 [23] https://github.com/dns-stats/draft-dns-capture- 1579 format/blob/master/file-size-versus-block-size.png 1581 [24] https://github.com/dns-stats/draft-dns-capture- 1582 format/blob/master/file-size-versus-block-size.svg 1584 Appendix A. CDDL 1586 ; CDDL specification of the file format for C-DNS, 1587 ; which describes a collection of DNS messages and 1588 ; traffic meta-data. 

   File = [
       file-type-id  : tstr,          ; = "C-DNS"
       file-preamble : FilePreamble,
       file-blocks   : [* Block],
   ]

   FilePreamble = {
       major-format-version => uint,  ; = 1
       minor-format-version => uint,  ; = 0
       ? private-version    => uint,
       ? configuration      => Configuration,
       ? generator-id       => tstr,
       ? host-id            => tstr,
   }

   major-format-version = 0
   minor-format-version = 1
   private-version      = 2
   configuration        = 3
   generator-id         = 4
   host-id              = 5

   Configuration = {
       ? query-timeout      => uint,
       ? skew-timeout       => uint,
       ? snaplen            => uint,
       ? promisc            => uint,
       ? interfaces         => [* tstr],
       ? server-addresses   => [* IPAddress], ; Hint for later analysis
       ? vlan-ids           => [* uint],
       ? filter             => tstr,
       ? query-options      => QRCollectionSections,
       ? response-options   => QRCollectionSections,
       ? accept-rr-types    => [* uint],
       ? ignore-rr-types    => [* uint],
       ? max-block-qr-items => uint,
       ? collect-malformed  => uint,
   }

   QRCollectionSectionValues = &(
       question  : 0, ; Second & subsequent questions
       answer    : 1,
       authority : 2,
       additional: 3,
   )
   QRCollectionSections = uint .bits QRCollectionSectionValues

   query-timeout      = 0
   skew-timeout       = 1
   snaplen            = 2
   promisc            = 3
   interfaces         = 4
   vlan-ids           = 5
   filter             = 6
   query-options      = 7
   response-options   = 8
   accept-rr-types    = 9
   ignore-rr-types    = 10
   server-addresses   = 11
   max-block-qr-items = 12
   collect-malformed  = 13

   Block = {
       preamble                => BlockPreamble,
       ? statistics            => BlockStatistics,
       tables                  => BlockTables,
       queries                 => [* QueryResponse],
       ? address-event-counts  => [* AddressEventCount],
       ? malformed-packet-data => [* MalformedPacket],
   }

   preamble              = 0
   statistics            = 1
   tables                = 2
   queries               = 3
   address-event-counts  = 4
   malformed-packet-data = 5

   BlockPreamble = {
       earliest-time => Timeval
   }

   earliest-time = 1

   Timeval = [
       seconds       : uint,
       microseconds  : uint,
       ? picoseconds : uint,
   ]

   BlockStatistics = {
       ? total-packets       => uint,
       ? total-pairs         => uint,
       ? unmatched-queries   => uint,
       ? unmatched-responses => uint,
       ? malformed-packets   => uint,
   }

   total-packets       = 0
   total-pairs         = 1
   unmatched-queries   = 2
   unmatched-responses = 3
   malformed-packets   = 4

   BlockTables = {
       ip-address => [* IPAddress],
       classtype  => [* ClassType],
       name-rdata => [* bstr],      ; Holds both Name RDATA and RDATA
       query-sig  => [* QuerySignature],
       ? qlist    => [* QuestionList],
       ? qrr      => [* Question],
       ? rrlist   => [* RRList],
       ? rr       => [* RR],
   }

   ip-address = 0
   classtype  = 1
   name-rdata = 2
   query-sig  = 3
   qlist      = 4
   qrr        = 5
   rrlist     = 6
   rr         = 7

   QueryResponse = {
       time-useconds         => uint, ; Time offset from start of block
       ? time-pseconds       => uint, ; in microseconds and picoseconds
       client-address-index  => uint,
       client-port           => uint,
       transaction-id        => uint,
       query-signature-index => uint,
       ? client-hoplimit     => uint,
       ? delay-useconds      => int,
       ? delay-pseconds      => int,  ; Has same sign as delay-useconds
       ? query-name-index    => uint,
       ? query-size          => uint, ; DNS size of query
       ? response-size       => uint, ; DNS size of response
       ? query-extended      => QueryResponseExtended,
       ? response-extended   => QueryResponseExtended,
   }

   time-useconds         = 0
   time-pseconds         = 1
   client-address-index  = 2
   client-port           = 3
   transaction-id        = 4
   query-signature-index = 5
   client-hoplimit       = 6
   delay-useconds        = 7
   delay-pseconds        = 8
   query-name-index      = 9
   query-size            = 10
   response-size         = 11
   query-extended        = 12
   response-extended     = 13

   ClassType = {
       type  => uint,
       class => uint,
   }

   type  = 0
   class = 1

   DNSFlagValues = &(
       query-cd   : 0,
       query-ad   : 1,
       query-z    : 2,
       query-ra   : 3,
       query-rd   : 4,
       query-tc   : 5,
       query-aa   : 6,
       query-d0   : 7,
       response-cd: 8,
       response-ad: 9,
       response-z : 10,
       response-ra: 11,
       response-rd: 12,
       response-tc: 13,
       response-aa: 14,
   )
   DNSFlags = uint .bits DNSFlagValues

   QueryResponseFlagValues = &(
       has-query               : 0,
       has-response            : 1,
       query-has-question      : 2,
       query-has-opt           : 3,
       response-has-opt        : 4,
       response-has-no-question: 5,
   )
   QueryResponseFlags = uint .bits QueryResponseFlagValues

   TransportFlagValues = &(
       tcp               : 0,
       ipv6              : 1,
       query-trailingdata: 2,
   )
   TransportFlags = uint .bits TransportFlagValues

   QuerySignature = {
       server-address-index    => uint,
       server-port             => uint,
       transport-flags         => TransportFlags,
       qr-sig-flags            => QueryResponseFlags,
       ? query-opcode          => uint,
       qr-dns-flags            => DNSFlags,
       ? query-rcode           => uint,
       ? query-classtype-index => uint,
       ? query-qd-count        => uint,
       ? query-an-count        => uint,
       ? query-ar-count        => uint,
       ? query-ns-count        => uint,
       ? edns-version          => uint,
       ? udp-buf-size          => uint,
       ? opt-rdata-index       => uint,
       ? response-rcode        => uint,
   }

   server-address-index  = 0
   server-port           = 1
   transport-flags       = 2
   qr-sig-flags          = 3
   query-opcode          = 4
   qr-dns-flags          = 5
   query-rcode           = 6
   query-classtype-index = 7
   query-qd-count        = 8
   query-an-count        = 9
   query-ar-count        = 10
   query-ns-count        = 11
   edns-version          = 12
   udp-buf-size          = 13
   opt-rdata-index       = 14
   response-rcode        = 15

   QuestionList = [
       * uint,                   ; Index of Question
   ]

   Question = {                  ; Second and subsequent questions
       name-index      => uint,  ; Index to a name in the name-rdata table
       classtype-index => uint,
   }

   name-index      = 0
   classtype-index = 1

   RRList = [
       * uint,                   ; Index of RR
   ]

   RR = {
       name-index      => uint,  ; Index to a name in the name-rdata table
       classtype-index => uint,
       ttl             => uint,
       rdata-index     => uint,  ; Index to RDATA in the name-rdata table
   }

   ttl         = 2
   rdata-index = 3

   QueryResponseExtended = {
       ? question-index   => uint, ; Index of QuestionList
       ? answer-index     => uint, ; Index of RRList
       ? authority-index  => uint,
       ? additional-index => uint,
   }

   question-index   = 0
   answer-index     = 1
   authority-index  = 2
   additional-index = 3

   AddressEventCount = {
       ae-type          => &AddressEventType,
       ? ae-code        => uint,
       ae-address-index => uint,
       ae-count         => uint,
   }

   ae-type          = 0
   ae-code          = 1
   ae-address-index = 2
   ae-count         = 3

   AddressEventType = (
       tcp-reset              : 0,
       icmp-time-exceeded     : 1,
       icmp-dest-unreachable  : 2,
       icmpv6-time-exceeded   : 3,
       icmpv6-dest-unreachable: 4,
       icmpv6-packet-too-big  : 5,
   )

   MalformedPacket = {
       time-useconds   => uint, ; Time offset from start of block
       ? time-pseconds => uint, ; in microseconds and picoseconds
       packet-content  => bstr, ; Raw packet contents
   }

   time-useconds  = 0
   time-pseconds  = 1
   packet-content = 2

   IPv4Address = bstr .size 4
   IPv6Address = bstr .size 16
   IPAddress   = IPv4Address / IPv6Address

Appendix B.  DNS Name compression example

   The basic algorithm, which follows the guidance in [RFC1035], is
   simply to collect each name, and the offset in the packet at which it
   starts, during packet construction.  As each name is added, it is
   compared against each of the previously collected names in order of
   collection, starting from the first name.  If labels at the end of
   the name can be replaced with a reference back to part (or all) of
   the earlier name, and if the uncompressed part of the name is shorter
   than any compression already found, the earlier name is noted as the
   compression target for the name.

   The following tables illustrate the process.  In an example packet,
   the first name is example.com.

   +---+-------------+--------------+--------------------+
   | N | Name        | Uncompressed | Compression Target |
   +---+-------------+--------------+--------------------+
   | 1 | example.com |              |                    |
   +---+-------------+--------------+--------------------+

   The next name added is bar.com.  This is matched against example.com.
   The com part of this can be used as a compression target, with the
   remaining uncompressed part of the name being bar.

   +---+-------------+--------------+--------------------+
   | N | Name        | Uncompressed | Compression Target |
   +---+-------------+--------------+--------------------+
   | 1 | example.com |              |                    |
   | 2 | bar.com     | bar          | 1 + offset to com  |
   +---+-------------+--------------+--------------------+

   The third name added is www.bar.com.
   This is first matched against example.com, and as before this is
   recorded as a compression target, with the remaining uncompressed
   part of the name being www.bar.  It is then matched against the
   second name, which again can be a compression target.  Because the
   remaining uncompressed part of the name is www, this is an improved
   compression, and so it is adopted.

   +---+-------------+--------------+--------------------+
   | N | Name        | Uncompressed | Compression Target |
   +---+-------------+--------------+--------------------+
   | 1 | example.com |              |                    |
   | 2 | bar.com     | bar          | 1 + offset to com  |
   | 3 | www.bar.com | www          | 2                  |
   +---+-------------+--------------+--------------------+

   As an optimization, if a name is already perfectly compressed (in
   other words, the uncompressed part of the name is empty), then no
   further names are considered as compression targets for it.

B.1.  NSD compression algorithm

   Using the above basic algorithm, the packet lengths of responses
   generated by NSD [12] can be matched almost exactly.  At the time of
   writing, a tiny number (<0.01%) of the reconstructed packets had
   incorrect lengths.

B.2.  Knot Authoritative compression algorithm

   The Knot Authoritative [13] name server uses different compression
   behavior, which is the result of internal optimization designed to
   balance runtime speed with compression size gains.  In brief, and
   omitting complications, Knot Authoritative will only consider the
   QNAME and names in the immediately preceding RR section in an RRSET
   as compression targets.

   The heuristics described below can be implemented to mimic this
   behavior.  While not perfect, they produce output that matches
   nearly, but not quite, as well as with NSD.  The heuristics are:

   1.  A match is only perfect if the name is completely compressed AND
       the TYPE of the section in which the name occurs matches the TYPE
       of the name used as the compression target.

   2.  If the name occurs in RDATA:

       *  If the compression target name is in a query, then only the
          first RR in an RRSET can use that name as a compression
          target.

       *  The compression target name MUST be in RDATA.

       *  The name section TYPE must match the compression target name
          section TYPE.

       *  The compression target name MUST be in the immediately
          preceding RR in the RRSET.

   Using this algorithm, less than 0.1% of the reconstructed packets had
   incorrect lengths.

B.3.  Observed differences

   In sample traffic collected on a root name server, around 2-4% of
   responses generated by Knot had different packet lengths from those
   produced by NSD.

Appendix C.  Comparison of Binary Formats

   Several binary serialisation formats were considered, and for
   completeness were also compared to JSON.

   o  Apache Avro [14].  Data is stored according to a pre-defined
      schema.  The schema itself is always included in the data file.
      Data can therefore be stored untagged, for a smaller serialisation
      size, and be written and read by an Avro library.

      *  At the time of writing, Avro libraries are available for C,
         C++, C#, Java, Python, Ruby and PHP.  Optionally, tools are
         available for C++, Java and C# to generate code for encoding
         and decoding.

   o  Google Protocol Buffers [15].  Data is stored according to a
      pre-defined schema.  The schema is used by a generator to generate
      code for encoding and decoding the data.  Data can therefore be
      stored untagged, for a smaller serialisation size.  The schema is
      not stored with the data, so, unlike Avro, the data cannot be read
      with a generic library.

      *  Code must be generated for a particular data schema to read
         and write data using that schema.  At the time of writing, the
         Google code generator can generate code for encoding and
         decoding a schema for C++, Go, Java, Python, Ruby, C#,
         Objective-C, Javascript and PHP.

   o  CBOR [16].  Defined in [RFC7049], this serialisation format is
      comparable to JSON but with a binary representation.  It does not
      use a pre-defined schema, so data is always stored tagged.
      However, CBOR data schemas can be described using CDDL
      [I-D.greevenbosch-appsawg-cbor-cddl] and tools exist to verify
      that data files conform to the schema.

      *  CBOR is a simple format, and simple to implement.  At the time
         of writing, the CBOR website lists implementations for 16
         languages.

   Avro and Protocol Buffers both allow storage of untagged data, but
   because they rely on the data schema for this, their implementation
   is considerably more complex than CBOR.  Using Avro or Protocol
   Buffers in an unsupported environment would require notably greater
   development effort compared to CBOR.

   A test program was written which reads input from a PCAP file and
   writes output using one of two basic structures: either a simple
   structure, where each query/response pair is represented in a single
   record entry, or the C-DNS block structure.

   The resulting output files were then compressed using a variety of
   common general-purpose lossless compression tools to explore the
   compressibility of the formats.  The compression tools employed were:

   o  snzip [17].  A command line compression tool based on the Google
      Snappy [18] library.

   o  lz4 [19].  The command line compression tool from the reference C
      LZ4 implementation.

   o  gzip [20].  The ubiquitous GNU zip tool.

   o  zstd [21].  Compression using the Zstandard algorithm.

   o  xz [22].
      A popular compression tool noted for high compression.

   In all cases the compression tools were run using their default
   settings.

   Note that this draft does not mandate the use of compression, nor any
   particular compression scheme, but it anticipates that in practice
   output data will be subject to general-purpose compression, and so
   this should be taken into consideration.

   "test.pcap", a 662 MB capture of sample data from a root instance,
   was used for the comparison.  The following table shows the formatted
   size and size after compression (abbreviated to Comp. in the table
   headers), together with the task resident set size (RSS) and the user
   time taken by the compression.  File sizes are in MB, RSS in kB and
   user time in seconds.

   +-------------+-----------+-------+------------+-------+-----------+
   | Format      | File size | Comp. | Comp. size | RSS   | User time |
   +-------------+-----------+-------+------------+-------+-----------+
   | PCAP        | 661.87    | snzip | 212.48     | 2696  | 1.26      |
   |             |           | lz4   | 181.58     | 6336  | 1.35      |
   |             |           | gzip  | 153.46     | 1428  | 18.20     |
   |             |           | zstd  | 87.07      | 3544  | 4.27      |
   |             |           | xz    | 49.09      | 97416 | 160.79    |
   |             |           |       |            |       |           |
   | JSON simple | 4113.92   | snzip | 603.78     | 2656  | 5.72      |
   |             |           | lz4   | 386.42     | 5636  | 5.25      |
   |             |           | gzip  | 271.11     | 1492  | 73.00     |
   |             |           | zstd  | 133.43     | 3284  | 8.68      |
   |             |           | xz    | 51.98      | 97412 | 600.74    |
   |             |           |       |            |       |           |
   | Avro simple | 640.45    | snzip | 148.98     | 2656  | 0.90      |
   |             |           | lz4   | 111.92     | 5828  | 0.99      |
   |             |           | gzip  | 103.07     | 1540  | 11.52     |
   |             |           | zstd  | 49.08      | 3524  | 2.50      |
   |             |           | xz    | 22.87      | 97308 | 90.34     |
   |             |           |       |            |       |           |
   | CBOR simple | 764.82    | snzip | 164.57     | 2664  | 1.11      |
   |             |           | lz4   | 120.98     | 5892  | 1.13      |
   |             |           | gzip  | 110.61     | 1428  | 12.88     |
   |             |           | zstd  | 54.14      | 3224  | 2.77      |
   |             |           | xz    | 23.43      | 97276 | 111.48    |
   |             |           |       |            |       |           |
   | PBuf simple | 749.51    | snzip | 167.16     | 2660  | 1.08      |
   |             |           | lz4   | 123.09     | 5824  | 1.14      |
   |             |           | gzip  | 112.05     | 1424  | 12.75     |
   |             |           | zstd  | 53.39      | 3388  | 2.76      |
   |             |           | xz    | 23.99      | 97348 | 106.47    |
   |             |           |       |            |       |           |
   | JSON block  | 519.77    | snzip | 106.12     | 2812  | 0.93      |
   |             |           | lz4   | 104.34     | 6080  | 0.97      |
   |             |           | gzip  | 57.97      | 1604  | 12.70     |
   |             |           | zstd  | 61.51      | 3396  | 3.45      |
   |             |           | xz    | 27.67      | 97524 | 169.10    |
   |             |           |       |            |       |           |
   | Avro block  | 60.45     | snzip | 48.38      | 2688  | 0.20      |
   |             |           | lz4   | 48.78      | 8540  | 0.22      |
   |             |           | gzip  | 39.62      | 1576  | 2.92      |
   |             |           | zstd  | 29.63      | 3612  | 1.25      |
   |             |           | xz    | 18.28      | 97564 | 25.81     |
   |             |           |       |            |       |           |
   | CBOR block  | 75.25     | snzip | 53.27      | 2684  | 0.24      |
   |             |           | lz4   | 51.88      | 8008  | 0.28      |
   |             |           | gzip  | 41.17      | 1548  | 4.36      |
   |             |           | zstd  | 30.61      | 3476  | 1.48      |
   |             |           | xz    | 18.15      | 97556 | 38.78     |
   |             |           |       |            |       |           |
   | PBuf block  | 67.98     | snzip | 51.10      | 2636  | 0.24      |
   |             |           | lz4   | 52.39      | 8304  | 0.24      |
   |             |           | gzip  | 40.19      | 1520  | 3.63      |
   |             |           | zstd  | 31.61      | 3576  | 1.40      |
   |             |           | xz    | 17.94      | 97440 | 33.99     |
   +-------------+-----------+-------+------------+-------+-----------+

   The above results are discussed in the following sections.

C.1.  Comparison with full PCAP files

   An important first consideration is whether moving away from PCAP
   offers significant benefits.

   The simple binary formats are typically larger than PCAP, even though
   they omit some information such as Ethernet MAC addresses.  However,
   they require less CPU to compress than PCAP, and the resulting
   compressed files are smaller than compressed PCAP.

C.2.  Simple versus block coding

   The intention of the block coding is to perform data de-duplication
   on query/response records within the block.  The simple and block
   formats above store exactly the same information for each
   query/response record.
   This information is parsed from the DNS traffic in the input PCAP
   file, and in all cases each field has an identifier and the field
   data is typed.

   The data de-duplication in the block formats yields an order of
   magnitude reduction in format file size compared with the simple
   formats.  As would be expected, the compression tools are able to
   find and exploit a lot of this duplication, but as the
   de-duplication process uses knowledge of DNS traffic, it is able to
   retain a size advantage.  This advantage reduces as stronger
   compression is applied, as again would be expected, but even with the
   strongest compression applied the block formatted data remains around
   75% of the size of the simple format and its compression requires
   roughly a third of the CPU time.

C.3.  Binary versus text formats

   Text data formats offer many advantages over binary formats,
   particularly in the areas of ad-hoc data inspection and extraction.
   It was therefore felt worthwhile to carry out a direct comparison,
   implementing JSON versions of the simple and block formats.

   Concentrating on JSON block format, the format files produced are a
   significant fraction of an order of magnitude larger than the binary
   formats.  The impact on file size after compression is as might be
   expected from that starting point; the stronger compression produces
   files that are 150% of the size of the similarly compressed binary
   format, and require over 4x more CPU to compress.

C.4.  Performance

   Concentrating again on the block formats, all three produce format
   files that are close to an order of magnitude smaller than the
   original "test.pcap" file.  CBOR produces the largest files and Avro
   the smallest, 20% smaller than CBOR.

   However, once compression is taken into account, the size difference
   narrows.  At medium compression (with gzip), the size difference is
   4%.  Using strong compression (with xz) the difference reduces to 2%,
   with Avro the largest and Protocol Buffers the smallest, although
   CBOR and Protocol Buffers require slightly more compression CPU.

   The measurements presented above do not include data on the CPU
   required to generate the format files.  Measurements indicate that
   writing Avro requires 10% more CPU than CBOR or Protocol Buffers.  It
   appears, therefore, that Avro's advantage in compression CPU usage is
   probably offset by a larger CPU requirement in writing Avro.

C.5.  Conclusions

   The above assessments lead us to the choice of a binary format file
   using blocking.

   As noted previously, this draft anticipates that output data will be
   subject to compression.  There is no compelling case for one
   particular binary serialisation format in terms of either final file
   size or machine resources consumed, so the choice must be largely
   based on other factors.  CBOR was therefore chosen as the binary
   serialisation format for the reasons listed in Section 6.

C.6.  Block size choice

   Given the choice of a CBOR format using blocking, the question arises
   of what an appropriate default value for the maximum number of
   query/response pairs in a block should be.  This has two components:
   what is the impact on performance of using different block sizes in
   the format file, and what is the impact on the size of the format
   file before and after compression.

   The following table addresses the performance question, showing the
   impact on the performance of a C++ program converting "test.pcap" to
   C-DNS.  File size is in MB, resident set size (RSS) in kB.

   +------------+-----------+--------+-----------+
   | Block size | File size | RSS    | User time |
   +------------+-----------+--------+-----------+
   | 1000       | 133.46    | 612.27 | 15.25     |
   | 5000       | 89.85     | 676.82 | 14.99     |
   | 10000      | 76.87     | 752.40 | 14.53     |
   | 20000      | 67.86     | 750.75 | 14.49     |
   | 40000      | 61.88     | 736.30 | 14.29     |
   | 80000      | 58.08     | 694.16 | 14.28     |
   | 160000     | 55.94     | 733.84 | 14.44     |
   | 320000     | 54.41     | 799.20 | 13.97     |
   +------------+-----------+--------+-----------+

   Increasing block size, therefore, tends to increase maximum RSS a
   little, with no significant effect (if anything a small reduction) on
   CPU consumption.

   The following figure plots the effect of increasing block size on
   output file size for different compressions.

   Figure showing effect of block size on file size (PNG) [23]

   Figure showing effect of block size on file size (SVG) [24]

   From the above, there is obviously scope for tuning the default block
   size to the compression being employed, traffic characteristics,
   frequency of output file rollover, etc.  With strong compression,
   block sizes over 10,000 query/response pairs would seem to offer
   limited improvements.

Authors' Addresses

   John Dickinson
   Sinodun IT
   Magdalen Centre
   Oxford Science Park
   Oxford  OX4 4GA

   Email: jad@sinodun.com

   Jim Hague
   Sinodun IT
   Magdalen Centre
   Oxford Science Park
   Oxford  OX4 4GA

   Email: jim@sinodun.com

   Sara Dickinson
   Sinodun IT
   Magdalen Centre
   Oxford Science Park
   Oxford  OX4 4GA

   Email: sara@sinodun.com

   Terry Manderson
   ICANN
   12025 Waterfront Drive
   Suite 300
   Los Angeles  CA 90094-2536

   Email: terry.manderson@icann.org

   John Bond
   ICANN
   12025 Waterfront Drive
   Suite 300
   Los Angeles  CA 90094-2536

   Email: john.bond@icann.org