idnits 2.17.1

draft-ietf-dnsop-dns-capture-format-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (January 3, 2018) is 2304 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 1542
  -- Looks like a reference, but probably isn't: '2' on line 1545
  -- Looks like a reference, but probably isn't: '3' on line 1548
  -- Looks like a reference, but probably isn't: '4' on line 1551
  -- Looks like a reference, but probably isn't: '5' on line 1554
  -- Looks like a reference, but probably isn't: '6' on line 1557
  -- Looks like a reference, but probably isn't: '7' on line 1560
  -- Looks like a reference, but probably isn't: '8' on line 1562
  -- Looks like a reference, but probably isn't: '9' on line 1564
  -- Looks like a reference, but probably isn't: '10' on line 1566
  -- Looks like a reference, but probably isn't: '11' on line 1569
  -- Looks like a reference, but probably isn't: '12' on line 1969
  -- Looks like a reference, but probably isn't: '13' on line 1975
  -- Looks like a reference, but probably isn't: '14' on line 2018
  -- Looks like a reference, but probably isn't: '15' on line 2029
  -- Looks like a reference, but probably isn't: '16' on line 2042
  -- Looks like a reference, but probably isn't: '17' on line 2068
  -- Looks like a reference, but probably isn't: '18' on line 2069
  -- Looks like a reference, but probably isn't: '19' on line 2071
  -- Looks like a reference, but probably isn't: '20' on line 2074
  -- Looks like a reference, but probably isn't: '21' on line 2076
  -- Looks like a reference, but probably isn't: '22' on line 2078
  -- Looks like a reference, but probably isn't: '23' on line 2263
  -- Looks like a reference, but probably isn't: '24' on line 2265

  ** Obsolete normative reference: RFC 7049 (Obsoleted by RFC 8949)

  == Outdated reference: A later version (-16) exists of
     draft-hoffman-dns-in-json-13

  -- Obsolete informational reference (is this intentional?): RFC 7159
     (Obsoleted by RFC 8259)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 26 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    dnsop                                                       J. Dickinson
3    Internet-Draft                                                  J. Hague
4    Intended status: Standards Track                            S. Dickinson
5    Expires: July 7, 2018                                         Sinodun IT
6                                                                T. Manderson
7                                                                     J. Bond
8                                                                       ICANN
9                                                             January 3, 2018

11                      C-DNS: A DNS Packet Capture Format
12                    draft-ietf-dnsop-dns-capture-format-04

14   Abstract

16      This document describes a data representation for collections of DNS
17      messages. The format is designed for efficient storage and
18      transmission of large packet captures of DNS traffic; it attempts to
19      minimize the size of such packet capture files but retain the full
20      DNS message contents along with the most useful transport metadata.
21      It is intended to assist with the development of DNS traffic
22      monitoring applications.

24   Status of This Memo

26      This Internet-Draft is submitted in full conformance with the
27      provisions of BCP 78 and BCP 79.

29      Internet-Drafts are working documents of the Internet Engineering
30      Task Force (IETF). Note that other groups may also distribute
31      working documents as Internet-Drafts. The list of current Internet-
32      Drafts is at http://datatracker.ietf.org/drafts/current/.

34      Internet-Drafts are draft documents valid for a maximum of six months
35      and may be updated, replaced, or obsoleted by other documents at any
36      time. It is inappropriate to use Internet-Drafts as reference
37      material or to cite them other than as "work in progress."

39      This Internet-Draft will expire on July 7, 2018.

41   Copyright Notice

43      Copyright (c) 2018 IETF Trust and the persons identified as the
44      document authors. All rights reserved.

46      This document is subject to BCP 78 and the IETF Trust's Legal
47      Provisions Relating to IETF Documents
48      (http://trustee.ietf.org/license-info) in effect on the date of
49      publication of this document. Please review these documents
50      carefully, as they describe your rights and restrictions with respect
51      to this document. Code Components extracted from this document must
52      include Simplified BSD License text as described in Section 4.e of
53      the Trust Legal Provisions and are provided without warranty as
54      described in the Simplified BSD License.

56   Table of Contents

58      1. Introduction
59      2. Terminology
60      3. Data Collection Use Cases
61      4. Design Considerations
62      5. Conceptual Overview
63      6. Choice of CBOR
64      7. The C-DNS format
65        7.1. CDDL definition
66        7.2. Format overview
67        7.3. File header contents
68        7.4. File preamble contents
69        7.5. Configuration contents
70        7.6. Block contents
71        7.7. Block preamble map
72        7.8. Block statistics
73        7.9. Block table map
74        7.10. IP address table
75        7.11. Class/Type table
76        7.12. Name/RDATA table
77        7.13. Query Signature table
78        7.14. Question table
79        7.15. Resource Record (RR) table
80        7.16. Question list table
81        7.17. Resource Record list table
82        7.18. Query/Response data
83        7.19. Address Event counts
84        7.20. Malformed packet records
85      8. Malformed Packets
86      9. C-DNS to PCAP
87        9.1. Name Compression
88      10. Data Collection
89        10.1. Matching algorithm
90        10.2. Message identifiers
91          10.2.1. Primary ID (required)
92          10.2.2. Secondary ID (optional)
93        10.3. Algorithm Parameters
94        10.4. Algorithm Requirements
95        10.5. Algorithm Limitations
96        10.6. Workspace
97        10.7. Output
98        10.8. Post Processing
99      11. Implementation Status
100       11.1. DNS-STATS Compactor
101     12. IANA Considerations
102     13. Security Considerations
103     14. Acknowledgements
104     15. Changelog
105     16. References
106       16.1. Normative References
107       16.2. Informative References
108       16.3. URIs
109     Appendix A. CDDL
110     Appendix B. DNS Name compression example
111       B.1. NSD compression algorithm
112       B.2. Knot Authoritative compression algorithm
113       B.3. Observed differences
114     Appendix C. Comparison of Binary Formats
115       C.1. Comparison with full PCAP files
116       C.2. Simple versus block coding
117       C.3. Binary versus text formats
118       C.4. Performance
119       C.5. Conclusions
120       C.6. Block size choice
121     Authors' Addresses

123  1. Introduction

125     There has long been a need to collect DNS queries and responses on
126     authoritative and recursive name servers for monitoring and analysis.
127     This data is used in a number of ways, including traffic monitoring,
128     analyzing network attacks and "day in the life" (DITL) [ditl]
129     analysis.

131     A wide variety of tools already exist that facilitate the collection
132     of DNS traffic data, such as DSC [dsc], packetq [packetq], dnscap
133     [dnscap] and dnstap [dnstap]. However, there is no standard exchange
134     format for large DNS packet captures. The PCAP [pcap] or PCAP-NG
135     [pcapng] formats are typically used in practice for packet captures,
136     but these file formats can contain a great deal of additional
137     information that is not directly pertinent to DNS traffic analysis
138     and thus unnecessarily increases the capture file size.

140     There has also been work on using text-based formats to describe DNS
141     packets, such as [I-D.daley-dnsxml] and [I-D.hoffman-dns-in-json], but
142     these are largely aimed at producing convenient representations of
143     single messages.

145     Many DNS operators may receive hundreds of thousands of queries per
146     second on a single name server instance, so a mechanism to minimize
147     the storage size (and therefore upload overhead) of the data
148     collected is highly desirable.

150     The format described in this document, C-DNS (Compacted-DNS),
151     focusses on the problem of capturing and storing large packet capture
152     files of DNS traffic with the following goals in mind:

154     o  Minimize the file size for storage and transmission

156     o  Minimize the overhead of producing the packet capture file and
157        the cost of any further (general-purpose) compression of the file

159     This document contains:

161     o  A discussion of some common use cases in which such DNS data
162        is collected (Section 3)

164     o  A discussion of the major design considerations in developing an
165        efficient data representation for collections of DNS messages
166        (Section 4)

168     o  A conceptual overview of the C-DNS format (Section 5)

170     o  A description of why CBOR [RFC7049] was chosen for this format
171        (Section 6)

173     o  The definition of the C-DNS format for the collection of DNS
174        messages (Section 7)

176     o  Notes on converting C-DNS data to PCAP format (Section 9)

178     o  Some high-level implementation considerations for applications
179        designed to produce C-DNS (Section 10)

181  2. Terminology

183     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
184     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
185     document are to be interpreted as described in [RFC2119].

187     "Packet" refers to individual IPv4 or IPv6 packets. Typically, these
188     are UDP packets, but they may be constructed from a TCP packet.
189     "Message", unless otherwise qualified, refers to a DNS payload
190     extracted from a UDP or TCP data stream.

192     The parts of DNS messages are named as they are in [RFC1035].
193     Specifically, the DNS message has five sections: Header, Question,
194     Answer, Authority, and Additional.

196     Pairs of DNS messages are called a Query and a Response.

198  3. Data Collection Use Cases

200     In an ideal world, it would be optimal to collect full packet
201     captures of all packets going in or out of a name server. However,
202     there are several design choices or other limitations that are common
203     to many DNS installations and operators.

205     o  DNS servers are hosted in a variety of situations

207        *  Self-hosted servers

209        *  Third party hosting (including multiple third parties)

211        *  Third party hardware (including multiple third parties)

213     o  Data is collected under different conditions

215        *  On well-provisioned servers running in a steady state

217        *  On heavily loaded servers

219        *  On virtualized servers

221        *  On servers that are under DoS attack

223        *  On servers that are unwitting intermediaries in DoS attacks

225     o  Traffic can be collected via a variety of mechanisms

227        *  On the same hardware as the name server itself

229        *  Using a network tap on an adjacent host to listen to DNS
230           traffic

232        *  Using port mirroring to listen from another host

234     o  The capabilities of data collection (and upload) networks vary

236        *  Out-of-band networks with the same capacity as the in-band
237           network

239        *  Out-of-band networks with less capacity than the in-band
240           network

242        *  Everything being on the in-band network

244     Thus, there is a wide range of use cases from very limited data
245     collection environments (third party hardware, servers that are under
246     attack, packet capture on the name server itself and no out-of-band
247     network) to "limitless" environments (self-hosted, well-provisioned
248     servers, using a network tap or port mirroring with an out-of-band
249     network with the same capacity as the in-band network). In the
250     former, it is infeasible to reliably collect full packet captures,
251     especially if the server is under attack. In the latter case,
252     collection of full packet captures may be reasonable.

254     As a result of these restrictions, the C-DNS data format was designed
255     with the most limited use case in mind such that:

257     o  data collection will occur on the same hardware as the name server
258        itself

260     o  collected data will be stored on the same hardware as the name
261        server itself, at least temporarily

263     o  collected data being returned to some central analysis system will
264        use the same network interface as the DNS queries and responses

266     o  there can be multiple third party servers involved

268     Because of these considerations, a major factor in the design of the
269     format is minimal storage size of the capture files.

271     Another significant consideration for any application that records
272     DNS traffic is that the running of the name server software and the
273     transmission of DNS queries and responses are the most important jobs
274     of a name server; capturing data is not. Any data collection system
275     co-located with the name server needs to be intelligent enough to
276     carefully manage its CPU, disk, memory and network utilization. This
277     leads to designing a format that requires a relatively low overhead
278     to produce and minimizes the requirement for further potentially
279     costly compression.

281     However, it was also essential that interoperability with less
282     restricted infrastructure was maintained. In particular, it is
283     highly desirable that the collection format should facilitate the re-
284     creation of common formats (such as PCAP) that are as close to the
285     original as is realistic given the restrictions above.

287  4. Design Considerations

289     This section presents some of the major design considerations used in
290     the development of the C-DNS format.

292     1.  The basic unit of data is a combined DNS Query and the associated
293         Response (a "Q/R data item"). The same structure will be used
294         for unmatched Queries and Responses. Queries without Responses
295         will be captured omitting the response data. Responses without
296         queries will be captured omitting the Query data (but using the
297         Question section from the response, if present, as an identifying
298         QNAME).

300         *  Rationale: A Query and Response represent the basic level of
301            a client's interaction with the server. Also, combining the
302            Query and Response into one item often reduces storage
303            requirements due to commonality in the data of the two
304            messages.

306     2.  Each Q/R data item will comprise a default Q/R data description
307         and a set of optional sections. Inclusion of optional sections
308         shall be configurable.

310         *  Rationale: Different users will have different requirements
311            for data to be available for analysis. Users with minimal
312            requirements should not have to pay the cost of recording full
313            data; however, this will limit the ability to reconstruct
314            packet captures. For example, omitting the resource records
315            from a Response will reduce the file size, and in principle
316            responses can be synthesized if there is enough context.

318     3.  Multiple Q/R data items will be collected into blocks in the
319         format. Common data in a block will be abstracted and referenced
320         from individual Q/R data items by indexing. The maximum number
321         of Q/R data items in a block will be configurable.

323         *  Rationale: This blocking and indexing provides a significant
324            reduction in the volume of file data generated. Although this
325            introduces complexity, it provides compression of the data
326            that makes use of knowledge of the DNS message structure.

328         *  It is anticipated that the files produced can be subject to
329            further compression using general-purpose compression tools.
330            Measurements show that blocking significantly reduces the CPU
331            required to perform such strong compression. See
332            Appendix C.2.

334         *  [TODO: Further discussion of commonality between DNS messages
335            e.g. common query signatures, a finite set of valid responses
336            from authoritatives]

338     4.  Metadata about other packets received can optionally be included
339         in each block. For example, counts of malformed DNS packets and
340         non-DNS packets (e.g., ICMP, TCP resets) sent to the server may
341         be of interest.

343     5.  The wire format content of malformed DNS packets can optionally
344         be recorded.

346         *  Rationale: Any structured capture format that does not capture
347            the DNS payload byte for byte will be limited to some extent
348            in that it cannot represent "malformed" DNS packets (see
349            Section 8). Only those packets that can be transformed
350            reasonably into the structured format can be represented by
351            the format. However, this can result in rather misleading
352            statistics. For example, a malformed query which cannot be
353            represented in the C-DNS format will lead to the (well formed)
354            DNS responses with error code FORMERR appearing as
355            'unmatched'. Therefore it can greatly aid downstream analysis
356            to have the wire format of the malformed DNS packets available
357            directly in the C-DNS file.

359  5. Conceptual Overview

361     The following figures show purely schematic representations of the
362     C-DNS format to convey the high-level structure of the C-DNS format.
363     Section 7 provides a detailed discussion of the CBOR representation
364     and individual elements.

366     Figure showing the C-DNS format (PNG) [1]

368     Figure showing the C-DNS format (SVG) [2]

370     Figure showing the Q/R data item and Block tables format (PNG) [3]

372     Figure showing the Q/R data item and Block tables format (SVG) [4]

374  6. Choice of CBOR

376     This document presents a detailed format description using CBOR, the
377     Concise Binary Object Representation defined in [RFC7049].

379     The choice of CBOR was made taking a number of factors into account.

381     o  CBOR is a binary representation, and thus is economical in storage
382        space.

384     o  Other binary representations were investigated, and whilst all had
385        attractive features, none had a significant advantage over CBOR.
386        See Appendix C for some discussion of this.

388     o  CBOR is an IETF standard and familiar to IETF participants. It is
389        based on the now-common ideas of lists and objects, and thus
390        requires very little familiarization for those in the wider
391        industry.

393     o  CBOR is a simple format, and can easily be implemented from
394        scratch if necessary. More complex formats require library
395        support which may present problems on unusual platforms.

397     o  CBOR can also be easily converted to text formats such as JSON
398        ([RFC7159]) for debugging and other human inspection requirements.

400     o  CBOR data schemas can be described using CDDL
401        [I-D.greevenbosch-appsawg-cbor-cddl].

403  7. The C-DNS format

405  7.1. CDDL definition

407     The CDDL definition for the C-DNS format is given in Appendix A.

409  7.2. Format overview

411     A C-DNS file begins with a file header containing a file type
412     identifier and a preamble. The preamble contains information on the
413     collection settings.

415     The file header is followed by a series of data blocks.

417     A block consists of a block header, containing various tables of
418     common data, and some statistics for the traffic received over the
419     block. The block header is then followed by a list of the Q/R data
420     items detailing the queries and responses received during processing
421     of the block input. The list of Q/R data items is in turn followed
422     by a list of per-client counts of particular IP events that occurred
423     during collection of the block data.

425     The exact nature of the DNS data will affect what block size is the
426     best fit; however, sample data for a root server indicated that block
427     sizes up to 10,000 Q/R data items give good results. See
428     Appendix C.6 for more details.

430     If no field type is specified, then the field is unsigned.

432     In all quantities that contain bit flags, bit 0 indicates the least
433     significant bit.
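
The bit-numbering convention stated above can be illustrated with a minimal sketch in Python. The helper names bit_is_set and set_bit_positions are hypothetical and are not defined by this document; the sketch assumes only that a bit-flag quantity has already been decoded into a non-negative integer.

```python
from typing import List


def bit_is_set(flags: int, bit: int) -> bool:
    """Return True if `bit` is set in `flags`, counting bit 0 as the
    least significant bit (hypothetical helper, not part of C-DNS)."""
    if flags < 0 or bit < 0:
        raise ValueError("flags and bit must be non-negative")
    return (flags >> bit) & 1 == 1


def set_bit_positions(flags: int) -> List[int]:
    """List the positions of all set bits in `flags`, lowest first."""
    if flags < 0:
        raise ValueError("flags must be non-negative")
    positions = []
    bit = 0
    while flags:
        if flags & 1:  # inspect the current least significant bit
            positions.append(bit)
        flags >>= 1
        bit += 1
    return positions
```

Under this convention a flags value of 5 (binary 101) has bits 0 and 2 set and bit 1 clear.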
        An item described as an index is the index of the
434     Q/R data item in the referenced table. Indexes are 1-based. An
435     index value of 0 is reserved to mean "not present".

437     All map keys are unsigned integers with values specified in the CDDL
438     (string keys would significantly bloat the file size).

440  7.3. File header contents

442     The file header contains the following:

444     +---------------+---------------+-----------------------------------+
445     | Field         | Type          | Description                       |
446     +---------------+---------------+-----------------------------------+
447     | file-type-id  | Text string   | String "C-DNS" identifying the    |
448     |               |               | file type.                        |
449     |               |               |                                   |
450     | file-preamble | Map of items  | Collection information for the    |
451     |               |               | whole file.                       |
452     |               |               |                                   |
453     | file-blocks   | Array of      | The data blocks.                  |
454     |               | Blocks        |                                   |
455     +---------------+---------------+-----------------------------------+

457  7.4. File preamble contents

459     The file preamble contains the following:

461     +----------------------+----------+---------------------------------+
462     | Field                | Type     | Description                     |
463     +----------------------+----------+---------------------------------+
464     | major-format-version | Unsigned | Unsigned integer '1'. The major |
465     |                      |          | version of format used in file. |
466     |                      |          |                                 |
467     | minor-format-version | Unsigned | Unsigned integer '0'. The minor |
468     |                      |          | version of format used in file. |
469     |                      |          |                                 |
470     | private-version      | Unsigned | Version indicator available for |
471     |                      |          | private use by applications.    |
472     |                      |          | Optional.                       |
473     |                      |          |                                 |
474     | configuration        | Map of   | The collection configuration.   |
475     |                      | items    | Optional.                       |
476     |                      |          |                                 |
477     | generator-id         | Text     | String identifying the          |
478     |                      | string   | collection program. Optional.   |
479     |                      |          |                                 |
480     | host-id              | Text     | String identifying the          |
481     |                      | string   | collecting host. Empty if       |
482     |                      |          | converting an existing packet   |
483     |                      |          | capture file. Optional.         |
484     +----------------------+----------+---------------------------------+

486  7.5. Configuration contents

488     The collection configuration contains the following items. All are
489     optional.

491     +--------------------+----------+-----------------------------------+
492     | Field              | Type     | Description                       |
493     +--------------------+----------+-----------------------------------+
494     | query-timeout      | Unsigned | To be matched with a query, a     |
495     |                    |          | response must arrive within this  |
496     |                    |          | number of seconds.                |
497     |                    |          |                                   |
498     | skew-timeout       | Unsigned | The network stack may report a    |
499     |                    |          | response before the corresponding |
500     |                    |          | query. A response is not          |
501     |                    |          | considered to be missing a query  |
502     |                    |          | until after this many             |
503     |                    |          | microseconds.                     |
504     |                    |          |                                   |
505     | snaplen            | Unsigned | Collect up to this many bytes per |
506     |                    |          | packet.                           |
507     |                    |          |                                   |
508     | promisc            | Unsigned | 1 if promiscuous mode was enabled |
509     |                    |          | on the interface, 0 otherwise.    |
510     |                    |          |                                   |
511     | interfaces         | Array of | Identifiers of the interfaces     |
512     |                    | text     | used for collection.              |
513     |                    | strings  |                                   |
514     |                    |          |                                   |
515     | server-addresses   | Array of | Server collection IP addresses.   |
516     |                    | byte     | Hint for downstream analysers;    |
517     |                    | strings  | does not affect collection.       |
518     |                    |          |                                   |
519     | vlan-ids           | Array of | Identifiers of VLANs selected for |
520     |                    | unsigned | collection.                       |
521     |                    |          |                                   |
522     | filter             | Text     | 'tcpdump' [pcap] style filter for |
523     |                    | string   | input.                            |
524     |                    |          |                                   |
525     | query-options      | Unsigned | Bit flags indicating sections in  |
526     |                    |          | Query messages to be collected.   |
527     |                    |          | Bit 0. Collect second and         |
528     |                    |          | subsequent Questions in the       |
529     |                    |          | Question section.                 |
530     |                    |          | Bit 1. Collect Answer sections.   |
531     |                    |          | Bit 2. Collect Authority          |
532     |                    |          | sections.                         |
533     |                    |          | Bit 3. Collect Additional         |
534     |                    |          | sections.                         |
535     |                    |          |                                   |
536     | response-options   | Unsigned | Bit flags indicating sections in  |
537     |                    |          | Response messages to be           |
538     |                    |          | collected.                        |
539     |                    |          | Bit 0. Collect second and         |
540     |                    |          | subsequent Questions in the       |
541     |                    |          | Question section.                 |
542     |                    |          | Bit 1. Collect Answer sections.   |
543     |                    |          | Bit 2. Collect Authority          |
544     |                    |          | sections.                         |
545     |                    |          | Bit 3. Collect Additional         |
546     |                    |          | sections.                         |
547     |                    |          |                                   |
548     | accept-rr-types    | Array of | A set of RR type names [rrtypes]. |
549     |                    | text     | If not empty, only the nominated  |
550     |                    | strings  | RR types are collected.           |
551     |                    |          |                                   |
552     | ignore-rr-types    | Array of | A set of RR type names [rrtypes]. |
553     |                    | text     | If not empty, all RR types are    |
554     |                    | strings  | collected except those listed. If |
555     |                    |          | present, this item must be empty  |
556     |                    |          | if a non-empty list of Accept RR  |
557     |                    |          | types is present.                 |
558     |                    |          |                                   |
559     | max-block-qr-items | Unsigned | Maximum number of Q/R data items  |
560     |                    |          | in a block.                       |
561     |                    |          |                                   |
562     | collect-malformed  | Unsigned | 1 if malformed packet contents    |
563     |                    |          | are collected, 0 otherwise.       |
564     +--------------------+----------+-----------------------------------+

566  7.6. Block contents

568     Each block contains the following:

570     +-----------------------+--------------+----------------------------+
571     | Field                 | Type         | Description                |
572     +-----------------------+--------------+----------------------------+
573     | preamble              | Map of items | Overall information for    |
574     |                       |              | the block.                 |
575     |                       |              |                            |
576     | statistics            | Map of       | Statistics about the       |
577     |                       | statistics   | block. Optional.           |
578     |                       |              |                            |
579     | tables                | Map of       | The tables containing data |
580     |                       | tables       | referenced by individual   |
581     |                       |              | Q/R data items.            |
582     |                       |              |                            |
583     | queries               | Array of Q/R | Details of individual Q/R  |
584     |                       | data items   | data items.                |
585     |                       |              |                            |
586     | address-event-counts  | Array of     | Per-client counts of ICMP  |
587     |                       | Address      | messages and TCP resets.   |
588     |                       | Event counts | Optional.                  |
589     |                       |              |                            |
590     | malformed-packet-data | Array of     | Wire contents of malformed |
591     |                       | malformed    | packets. Optional.         |
592     |                       | packets      |                            |
593     +-----------------------+--------------+----------------------------+

595  7.7. Block preamble map

597     The block preamble map contains overall information for the block.

599     +---------------+----------+----------------------------------------+
600     | Field         | Type     | Description                            |
601     +---------------+----------+----------------------------------------+
602     | earliest-time | Array of | A timestamp for the earliest record in |
603     |               | unsigned | the block. The timestamp is specified  |
604     |               |          | as a CBOR array with two or three      |
605     |               |          | elements. The first two elements are   |
606     |               |          | as in POSIX struct timeval. The first  |
607     |               |          | element is an unsigned integer time_t  |
608     |               |          | and the second is an unsigned integer  |
609     |               |          | number of microseconds. The third, if  |
610     |               |          | present, is an unsigned integer number |
611     |               |          | of picoseconds. The microsecond and    |
612     |               |          | picosecond items always have a value   |
613     |               |          | between 0 and 999,999.                 |
614     +---------------+----------+----------------------------------------+

616  7.8. Block statistics

618     The block statistics section contains some basic statistical
619     information about the block. All are optional.

621     +---------------------+----------+----------------------------------+
622     | Field               | Type     | Description                      |
623     +---------------------+----------+----------------------------------+
624     | total-packets       | Unsigned | Total number of packets          |
625     |                     |          | processed from the input traffic |
626     |                     |          | stream during collection of the  |
627     |                     |          | block data.                      |
628     | total-pairs         | Unsigned | Total number of Q/R data items   |
629     |                     |          | in the block.                    |
630     | unmatched-queries   | Unsigned | Number of unmatched queries in   |
631     |                     |          | the block.                       |
632     | unmatched-responses | Unsigned | Number of unmatched responses in |
633     |                     |          | the block.                       |
634     | malformed-packets   | Unsigned | Number of malformed packets      |
635     |                     |          | found in input for the block.    |
636     +---------------------+----------+----------------------------------+

638     Implementations may choose to add additional implementation-specific
639     fields to the statistics.

641  7.9. Block table map

643     The block table map contains the block tables. Each element, or
644     table, is an array. The following tables detail the contents of each
645     block table.

647     The Present column in the following tables indicates the
648     circumstances when an optional field will be present. A Q/R data
649     item may be:

651     o  A Query plus a Response.

653     o  A Query without a Response.

655     o  A Response without a Query.

657     Also:

659     o  A Query and/or a Response may contain an OPT section.

661     o  A Question may or may not be present. If the Query is available,
662        the Question section of the Query is used. If no Query is
663        available, the Question section of the Response is used. Unless
664        otherwise noted, a Question refers to the first Question in the
665        Question section.

667     So, for example, a field listed with a Present value of QUERY is
668     present whenever the Q/R data item contains a Query. If the pair
669     contains a Response only, the field will not be present.

671  7.10. IP address table

673     The table "ip-address" holds all client and server IP addresses in
674     the block. Each item in the table is a single IP address.

676     +------------+--------+---------------------------------------------+
677     | Field      | Type   | Description                                 |
678     +------------+--------+---------------------------------------------+
679     | ip-address | Byte   | The IP address, in network byte order. The  |
680     |            | string | string is 4 bytes long for an IPv4 address, |
681     |            |        | 16 bytes long for an IPv6 address.          |
682     +------------+--------+---------------------------------------------+

684  7.11. Class/Type table

686     The table "classtype" holds pairs of RR CLASS and TYPE values. Each
687     item in the table is a CBOR map.

689     +-------+----------+--------------+
690     | Field | Type     | Description  |
691     +-------+----------+--------------+
692     | type  | Unsigned | TYPE value.  |
693     |       |          |              |
694     | class | Unsigned | CLASS value. |
695     +-------+----------+--------------+

697  7.12. Name/RDATA table

699     The table "name-rdata" holds the contents of all NAME or RDATA items
700     in the block. Each item in the table is the content of a single NAME
701     or RDATA.

703     Note that NAMEs, and labels within RDATA contents, are full domain
704     names or labels; no DNS style name compression is used on the
705     individual names/labels within the format.

707     +------------+-------------+----------------------------------------+
708     | Field      | Type        | Description                            |
709     +------------+-------------+----------------------------------------+
710     | name-rdata | Byte string | The NAME or RDATA contents             |
711     |            |             | (uncompressed).                        |
712     +------------+-------------+----------------------------------------+

714  7.13. Query Signature table

716     The table "query-sig" holds elements of the Q/R data item that are
717     often common between multiple individual Q/R data items. Each item
718     in the table is a CBOR map. Each item in the map has an unsigned
719     value and an unsigned integer key.

721     The following abbreviations are used in the Present (P) column:

723     o  Q = QUERY

725     o  A = Always

727     o  QT = QUESTION

729     o  QO = QUERY, OPT

731     o  QR = QUERY & RESPONSE

733     o  R = RESPONSE

735     +-----------------------+----+--------------------------------------+
736     | Field                 | P  | Description                          |
737     +-----------------------+----+--------------------------------------+
738     | server-address-index  | A  | The index in the IP address table of |
739     |                       |    | the server IP address.               |
740     |                       |    |                                      |
741     | server-port           | A  | The server port.                     |
742     |                       |    |                                      |
743     | transport-flags       | A  | Bit flags describing the transport   |
744     |                       |    | used to service the query. Bit 0 is  |
745     |                       |    | the least significant bit.           |
746     |                       |    | Bit 0. Transport type. 0 = UDP, 1 =  |
747     |                       |    | TCP.                                 |
748     |                       |    | Bit 1. IP type. 0 = IPv4, 1 = IPv6.  |
749     |                       |    | Bit 2. Trailing bytes in query       |
750     |                       |    | payload. The DNS query message in    |
751     |                       |    | the UDP payload was followed by some |
752     |                       |    | additional bytes, which were         |
753     |                       |    | discarded.                           |
754     |                       |    |                                      |
755     | qr-sig-flags          | A  | Bit flags indicating information     |
756     |                       |    | present in this Q/R data item. Bit 0 |
757     |                       |    | is the least significant bit.        |
758     |                       |    | Bit 0. 1 if a Query is present.      |
759     |                       |    | Bit 1. 1 if a Response is present.   |
760     |                       |    | Bit 2. 1 if one or more Question is  |
761     |                       |    | present.                             |
762     |                       |    | Bit 3. 1 if a Query is present and   |
763     |                       |    | it has an OPT Resource Record.       |
764     |                       |    | Bit 4. 1 if a Response is present    |
765     |                       |    | and it has an OPT Resource Record.   |
766     |                       |    | Bit 5. 1 if a Response is present    |
767     |                       |    | but has no Question.                 |
768     |                       |    |                                      |
769     | query-opcode          | Q  | Query OPCODE. Optional.              |
770     |                       |    |                                      |
771     | qr-dns-flags          | A  | Bit flags with values from the Query |
772     |                       |    | and Response DNS flags. Bit 0 is the |
773     |                       |    | least significant bit. Flag values   |
774     |                       |    | are 0 if the Query or Response is    |
775     |                       |    | not present.                         |
776     |                       |    | Bit 0. Query Checking Disabled (CD). |
777     |                       |    | Bit 1. Query Authenticated Data      |
778     |                       |    | (AD).                                |
779     |                       |    | Bit 2. Query reserved (Z).           |
780     |                       |    | Bit 3. Query Recursion Available     |
781     |                       |    | (RA).                                |
782     |                       |    | Bit 4. Query Recursion Desired (RD). |
783     |                       |    | Bit 5. Query TrunCation (TC).        |
784     |                       |    | Bit 6. Query Authoritative Answer    |
785     |                       |    | (AA).                                |
786     |                       |    | Bit 7. Query DNSSEC answer OK (DO).  |
787     |                       |    | Bit 8. Response Checking Disabled    |
788     |                       |    | (CD).                                |
789     |                       |    | Bit 9. Response Authenticated Data   |
790     |                       |    | (AD).                                |
791     |                       |    | Bit 10. Response reserved (Z).       |
792     |                       |    | Bit 11.
Response Recursion Available | 793 | | | (RA). | 794 | | | Bit 12. Response Recursion Desired | 795 | | | (RD). | 796 | | | Bit 13. Response TrunCation (TC). | 797 | | | Bit 14. Response Authoritative | 798 | | | Answer (AA). | 799 | | | | 800 | query-rcode | Q | Query RCODE. If the Query contains | 801 | | | OPT, this value incorporates any | 802 | | | EXTENDED_RCODE_VALUE. Optional. | 803 | | | | 804 | query-classtype-index | QT | The index in the Class/Type table of | 805 | | | the CLASS and TYPE of the first | 806 | | | Question. Optional. | 807 | | | | 808 | query-qd-count | QT | The QDCOUNT in the Query, or | 809 | | | Response if no Query present. | 810 | | | Optional. | 811 | | | | 812 | query-an-count | Q | Query ANCOUNT. Optional. | 813 | | | | 814 | query-ar-count | Q | Query ARCOUNT. Optional. | 815 | | | | 816 | query-ns-count | Q | Query NSCOUNT. Optional. | 817 | | | | 818 | edns-version | QO | The Query EDNS version. Optional. | 819 | | | | 820 | udp-buf-size | QO | The Query EDNS sender's UDP payload | 821 | | | size. Optional. | 822 | | | | 823 | opt-rdata-index | QO | The index in the NAME/RDATA table of | 824 | | | the OPT RDATA. Optional. | 825 | | | | 826 | response-rcode | R | Response RCODE. If the Response | 827 | | | contains OPT, this value | 828 | | | incorporates any | 829 | | | EXTENDED_RCODE_VALUE. Optional. | 830 +-----------------------+----+--------------------------------------+ 832 7.14. Question table 834 The table "qrr" holds details on individual Questions in a Question 835 section. Each item in the table is a CBOR map containing a single 836 Question. Each item in the map has an unsigned value and an unsigned 837 integer key. This data is optionally collected. 839 +-----------------+-------------------------------------------------+ 840 | Field | Description | 841 +-----------------+-------------------------------------------------+ 842 | name-index | The index in the NAME/RDATA table of the QNAME. 
| 843 | | | 844 | classtype-index | The index in the Class/Type table of the CLASS | 845 | | and TYPE of the Question. | 846 +-----------------+-------------------------------------------------+ 848 7.15. Resource Record (RR) table 850 The table "rr" holds details on individual Resource Records in RR 851 sections. Each item in the table is a CBOR map containing a single 852 Resource Record. This data is optionally collected. 854 +-----------------+-------------------------------------------------+ 855 | Field | Description | 856 +-----------------+-------------------------------------------------+ 857 | name-index | The index in the NAME/RDATA table of the NAME. | 858 | | | 859 | classtype-index | The index in the Class/Type table of the CLASS | 860 | | and TYPE of the RR. | 861 | | | 862 | ttl | The RR Time to Live. | 863 | | | 864 | rdata-index | The index in the NAME/RDATA table of the RR | 865 | | RDATA. | 866 +-----------------+-------------------------------------------------+ 868 7.16. Question list table 870 The table "qlist" holds a list of second and subsequent individual 871 Questions in a Question section. Each item in the table is a CBOR 872 unsigned integer. This data is optionally collected. 874 +----------+--------------------------------------------------------+ 875 | Field | Description | 876 +----------+--------------------------------------------------------+ 877 | question | The index in the Question table of the individual | 878 | | Question. | 879 +----------+--------------------------------------------------------+ 881 7.17. Resource Record list table 883 The table "rrlist" holds a list of individual Resource Records in an 884 Answer, Authority or Additional section. Each item in the table is a 885 CBOR unsigned integer. This data is optionally collected.
887 +-------+-----------------------------------------------------------+ 888 | Field | Description | 889 +-------+-----------------------------------------------------------+ 890 | rr | The index in the Resource Record table of the individual | 891 | | Resource Record. | 892 +-------+-----------------------------------------------------------+ 894 7.18. Query/Response data 896 The block Q/R data is a CBOR array of individual Q/R data items. 897 Each item in the array is a CBOR map containing details on the 898 individual Q/R data item. 900 Note that there is no requirement that the elements of the Q/R array 901 are presented in strict chronological order. 903 The following abbreviations are used in the Present (P) column 905 o Q = QUERY 907 o A = Always 909 o QT = QUESTION 911 o QO = QUERY, OPT 913 o QR = QUERY & RESPONSE 915 o R = RESPONSE 917 Each item in the map has an unsigned value (with the exception of 918 those listed below) and an unsigned integer key. 920 o query-extended and response-extended which are of type Extended 921 Information. 923 o delay-useconds and delay-pseconds which are integers (The delay 924 can be negative if the network stack/capture library returns them 925 out of order.) 927 +-----------------------+----+--------------------------------------+ 928 | Field | P | Description | 929 +-----------------------+----+--------------------------------------+ 930 | time-useconds | A | Q/R timestamp as an offset in | 931 | | | microseconds from the Block preamble | 932 | | | Timestamp. The timestamp is the | 933 | | | timestamp of the Query, or the | 934 | | | Response if there is no Query. | 935 | | | | 936 | time-pseconds | A | Picosecond component of the | 937 | | | timestamp. Optional. | 938 | | | | 939 | client-address-index | A | The index in the IP address table of | 940 | | | the client IP address. | 941 | | | | 942 | client-port | A | The client port. | 943 | | | | 944 | transaction-id | A | DNS transaction identifier. 
| 945 | | | | 946 | query-signature-index | A | The index of the Query Signature | 947 | | | table record for this data item. | 948 | | | | 949 | client-hoplimit | Q | The IPv4 TTL or IPv6 Hoplimit from | 950 | | | the Query packet. Optional. | 951 | | | | 952 | delay-useconds | QR | The time difference between Query | 953 | | | and Response, in microseconds. Only | 954 | | | present if there is a query and a | 955 | | | response. | 956 | | | | 957 | delay-pseconds | QR | Picosecond component of the time | 958 | | | difference between Query and | 959 | | | Response. If delay-useconds is | 960 | | | non-zero then delay-pseconds (if | 961 | | | present) MUST be of the same sign as | 962 | | | delay-useconds, or be 0. Optional. | 963 | | | | 964 | query-name-index | QT | The index in the NAME/RDATA table of | 965 | | | the QNAME for the first Question. | 966 | | | Optional. | 967 | | | | 968 | query-size | R | DNS query message size (see below). | 969 | | | Optional. | 970 | | | | 971 | response-size | R | DNS response message size (see | 972 | | | below). Optional. | 973 | | | | 974 | query-extended | Q | Extended Query information. This | 975 | | | item is only present if collection | 976 | | | of extra Query information is | 977 | | | configured. Optional. | 978 | | | | 979 | response-extended | R | Extended Response information. This | 980 | | | item is only present if collection | 981 | | | of extra Response information is | 982 | | | configured. Optional. | 983 +-----------------------+----+--------------------------------------+ 985 An implementation must always collect basic Q/R information. It may 986 be configured to collect details on Question, Answer, Authority and 987 Additional sections of the Query, the Response or both. Note that 988 only the second and subsequent Questions of any Question section are 989 collected (the details of the first are in the basic information), 990 and that OPT Records are not collected in the Additional section.
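As a non-normative illustration of how the timestamp and delay fields of a Q/R data item combine with the block preamble earliest-time, the following Python sketch computes absolute times in picoseconds. It assumes the CBOR maps have already been decoded into dictionaries keyed by the field names used in this document (a real C-DNS file uses unsigned integer keys), and the function names are hypothetical.

```python
# Sketch only: field names are used as dictionary keys for readability;
# the function names are illustrative and not part of the format.

PS_PER_US = 1_000_000          # picoseconds per microsecond
PS_PER_S = 1_000_000 * PS_PER_US  # picoseconds per second


def earliest_time_to_ps(earliest_time):
    """Convert [seconds, microseconds, (picoseconds)] to picoseconds."""
    secs, usecs, *rest = earliest_time
    psecs = rest[0] if rest else 0  # third element is optional
    return secs * PS_PER_S + usecs * PS_PER_US + psecs


def qr_times_ps(earliest_time, item):
    """Return (item_time, response_time) in picoseconds since the epoch.

    item_time is the Query time, or the Response time if there is no
    Query. response_time is None unless delay-useconds is present,
    which is only the case when both Query and Response exist.
    """
    base = earliest_time_to_ps(earliest_time)
    item_time = (base
                 + item["time-useconds"] * PS_PER_US
                 + item.get("time-pseconds", 0))
    if "delay-useconds" not in item:
        return item_time, None
    # The delay is signed: it can be negative if the capture library
    # delivered the Response before the Query.
    delay = (item["delay-useconds"] * PS_PER_US
             + item.get("delay-pseconds", 0))
    return item_time, item_time + delay
```

Working in integer picoseconds avoids floating point rounding, and the signed delay is handled by ordinary integer addition because delay-pseconds carries the same sign as delay-useconds.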
992 The query-size and response-size fields hold the DNS message size. 993 For UDP this is the size of the UDP payload that contained the DNS 994 message and will therefore include any trailing bytes if present. 995 Trailing bytes with queries are routinely observed in traffic to 996 authoritative servers and this value allows a calculation of how many 997 trailing bytes were present. For TCP it is the size of the DNS 998 message as specified in the two-byte message length header. 1000 The Extended information is a CBOR map as follows. Each item in the 1001 map is present only if collection of the relevant details is 1002 configured. Each item in the map has an unsigned value and an 1003 unsigned integer key. 1005 +------------------+------------------------------------------------+ 1006 | Field | Description | 1007 +------------------+------------------------------------------------+ 1008 | question-index | The index in the Question list table of the | 1009 | | entry listing any second and subsequent | 1010 | | Questions in the Question section for the | 1011 | | Query or Response. | 1012 | | | 1013 | answer-index | The index in the RR list table of the entry | 1014 | | listing the Answer Resource Record sections | 1015 | | for the Query or Response. | 1016 | | | 1017 | authority-index | The index in the RR list table of the entry | 1018 | | listing the Authority Resource Record sections | 1019 | | for the Query or Response. | 1020 | | | 1021 | additional-index | The index in the RR list table of the entry | 1022 | | listing the Additional Resource Record | 1023 | | sections for the Query or Response. | 1024 +------------------+------------------------------------------------+ 1026 7.19. Address Event counts 1028 This table holds counts of various IP-related events relating to 1029 traffic with individual client addresses.
1031 +------------------+----------+-------------------------------------+ 1032 | Field | Type | Description | 1033 +------------------+----------+-------------------------------------+ 1034 | ae-type | Unsigned | The type of event. The following | 1035 | | | event types are currently defined: | 1036 | | | 0. TCP reset. | 1037 | | | 1. ICMP time exceeded. | 1038 | | | 2. ICMP destination unreachable. | 1039 | | | 3. ICMPv6 time exceeded. | 1040 | | | 4. ICMPv6 destination unreachable. | 1041 | | | 5. ICMPv6 packet too big. | 1042 | | | | 1043 | ae-code | Unsigned | A code relating to the event. | 1044 | | | Optional. | 1045 | | | | 1046 | ae-address-index | Unsigned | The index in the IP address table | 1047 | | | of the client address. | 1048 | | | | 1049 | ae-count | Unsigned | The number of occurrences of this | 1050 | | | event during the block collection | 1051 | | | period. | 1052 +------------------+----------+-------------------------------------+ 1054 7.20. Malformed packet records 1056 This optional table records the original wire format content of 1057 malformed packets (see Section 8). 1059 +----------------+----------+---------------------------------------+ 1060 | Field | Type | Description | 1061 +----------------+----------+---------------------------------------+ 1062 | time-useconds | Unsigned | Packet timestamp as an offset in | 1063 | | | microseconds from the Block preamble | 1064 | | | Timestamp. | 1065 | | | | 1066 | time-pseconds | Unsigned | Picosecond component of the | 1067 | | | timestamp. Optional. | 1068 | | | | 1069 | packet-content | Byte | The packet content in wire format. | 1070 | | string | | 1071 +----------------+----------+---------------------------------------+ 1073 8. Malformed Packets 1075 In the context of generating a C-DNS file it is assumed that only 1076 those packets which can be parsed to produce a well-formed DNS 1077 message are stored in the C-DNS format.
This means, as a minimum: 1079 o The packet has a well-formed 12-byte DNS header 1081 o The section counts are consistent with the section contents 1083 o All of the resource records can be parsed 1085 In principle, packets that do not meet these criteria could be 1086 classified into two categories: 1088 o Partially malformed: those packets which can be decoded 1089 sufficiently to extract 1091 * a DNS header (and therefore a DNS transaction ID) 1093 * a QDCOUNT 1095 * the first Question in the Question section if QDCOUNT is 1096 greater than 0 1098 but suffer other issues while parsing. This is the minimum 1099 information required to attempt Query/Response matching as 1100 described in Section 10.1. 1102 o Completely malformed: those packets that cannot be decoded to this 1103 extent. 1105 An open question is whether there is value in attempting to process 1106 partially malformed packets in an analogous manner to well-formed 1107 packets in terms of attempting to match them with the corresponding 1108 query or response. This could be done by creating 'placeholder' 1109 records during Query/Response matching with just the information 1110 extracted as above. If the packet were then matched, the resulting 1111 C-DNS Q/R data item would include a flag to indicate a malformed 1112 record (in addition to capturing the wire format of the packet). 1114 An advantage of this would be that it would result in more meaningful 1115 statistics about matched packets because, for example, some partially 1116 malformed queries could be matched to responses. However, it would 1117 only apply to those queries where the first Question is well formed. 1118 It could also simplify the downstream analysis of C-DNS files and the 1119 reconstruction of packet streams from C-DNS. 1121 A disadvantage is that this adds complexity to the Query/Response 1122 matching and data representation, could potentially lead to false 1123 matches and some additional statistics would be required (e.g.
counts 1124 for matched-partially-malformed, unmatched-partially-malformed, 1125 completely-malformed). 1127 9. C-DNS to PCAP 1129 It is possible to reconstruct PCAP files from the C-DNS format in a 1130 lossy fashion. Some of the issues with reconstructing both the DNS 1131 payload and the full packet stream are outlined here. 1133 The reconstruction depends on whether or not all the optional 1134 sections of both the query and response were captured in the C-DNS 1135 file. Clearly, if they were not all captured, the reconstruction 1136 will be imperfect. 1138 Even if all sections of the response were captured, one cannot 1139 reconstruct the DNS response payload exactly because 1140 some DNS names in the message on the wire may have been compressed. 1141 Section 9.1 discusses this in more detail. 1143 Some transport information is not captured in the C-DNS format. For 1144 example, the following aspects of the original packet stream cannot 1145 be reconstructed from the C-DNS format: 1147 o IP fragmentation 1149 o TCP stream information: 1151 * Multiple DNS messages may have been sent in a single TCP 1152 segment 1154 * A DNS payload may have been split across multiple TCP segments 1156 * Multiple DNS messages may have been sent on a single TCP session 1158 o Malformed DNS messages if the wire format is not recorded 1160 o Any non-DNS messages that were in the original packet stream, e.g. 1161 ICMP 1163 Simple assumptions can be made on the reconstruction: fragmented and 1164 DNS-over-TCP messages can be reconstructed into single packets and a 1165 single TCP session can be constructed for each TCP packet. 1167 Additionally, if malformed packets and non-DNS packets are captured 1168 separately, they can be merged with packet captures reconstructed 1169 from C-DNS to produce a more complete packet stream. 1171 9.1.
Name Compression 1173 All the names stored in the C-DNS format are full domain names; no 1174 DNS style name compression is used on the individual names within the 1175 format. Therefore when reconstructing a packet, name compression 1176 must be used in order to reproduce the on the wire representation of 1177 the packet. 1179 [RFC1035] name compression works by substituting trailing sections of 1180 a name with a reference back to the occurrence of those sections 1181 earlier in the message. Not all name server software uses the same 1182 algorithm when compressing domain names within the responses. Some 1183 attempt maximum recompression at the expense of runtime resources, 1184 others use heuristics to balance compression and speed and others use 1185 different rules for what is a valid compression target. 1187 This means that responses to the same question from different name 1188 server software which match in terms of DNS payload content (header, 1189 counts, RRs with name compression removed) do not necessarily match 1190 byte-for-byte on the wire. 1192 Therefore, it is not possible to ensure that the DNS response payload 1193 is reconstructed byte-for-byte from C-DNS data. However, it can at 1194 least, in principle, be reconstructed to have the correct payload 1195 length (since the original response length is captured) if there is 1196 enough knowledge of the commonly implemented name compression 1197 algorithms. For example, a simplistic approach would be to try each 1198 algorithm in turn to see if it reproduces the original length, 1199 stopping at the first match. This would not guarantee the correct 1200 algorithm has been used as it is possible to match the length whilst 1201 still not matching the on the wire bytes but, without further 1202 information added to the C-DNS data, this is the best that can be 1203 achieved. 1205 Appendix B presents an example of two different compression 1206 algorithms used by well-known name server software. 
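The length-matching approach described above can be sketched in Python as follows. This is a non-normative illustration under stated assumptions: the two strategies are toy examples (they are not the algorithms of any particular name server), only a sequence of names is encoded rather than a complete DNS message, labels longer than 63 bytes are not checked, and all function and strategy names are hypothetical.

```python
# Sketch only: two toy RFC 1035 name compression strategies and a
# helper that picks the first one reproducing a recorded length.


def encode_name(name, offset, seen, whole_name_only=False):
    """Encode one name that starts at message offset `offset`.

    `seen` maps a tuple of labels (a name suffix) to the offset at
    which that suffix was first written.
    """
    labels = tuple(lab for lab in name.rstrip(".").split(".") if lab)
    out = bytearray()
    for i in range(len(labels)):
        suffix = labels[i:]
        target = seen.get(suffix)
        if target is not None and (i == 0 or not whole_name_only):
            # Two-byte pointer: top two bits set, 14-bit offset.
            out += (0xC000 | target).to_bytes(2, "big")
            return bytes(out)
        pos = offset + len(out)
        if pos < 0x4000:  # only offsets that fit in 14 bits are usable
            seen.setdefault(suffix, pos)
        label = labels[i].encode("ascii")
        out.append(len(label))
        out += label
    out.append(0)  # root label ends a name that has no pointer
    return bytes(out)


def encode_names(names, start_offset=12, whole_name_only=False):
    """Encode names back to back, as if following a 12-byte header."""
    seen = {}
    out = bytearray()
    for name in names:
        out += encode_name(name, start_offset + len(out), seen,
                           whole_name_only)
    return bytes(out)


def first_matching_strategy(names, original_length, start_offset=12):
    """Return the first strategy whose output length matches, or None."""
    for label, whole in (("all-suffixes", False),
                         ("whole-name-only", True)):
        encoded = encode_names(names, start_offset, whole)
        if len(encoded) == original_length:
            return label
    return None
```

As the text notes, a length match does not prove that the bytes match the original wire representation; it only shows that the chosen strategy is consistent with the recorded message size.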
1208 10. Data Collection 1210 This section describes a non-normative proposed algorithm for the 1211 processing of a captured stream of DNS queries and responses and 1212 matching queries/responses where possible. 1214 For the purposes of this discussion, it is assumed that the input has 1215 been pre-processed such that: 1217 1. All IP fragmentation reassembly, TCP stream reassembly, and so 1218 on, has already been performed 1220 2. Each message is associated with transport metadata required to 1221 generate the Primary ID (see Section 10.2.1) 1223 3. Each message has a well-formed DNS header of 12 bytes and (if 1224 present) the first Question in the Question section can be parsed 1225 to generate the Secondary ID (see below). As noted earlier, this 1226 requirement can result in a malformed query being removed in the 1227 pre-processing stage while the correctly formed response with an 1228 RCODE of FORMERR remains present. 1230 DNS messages are processed in the order they are delivered to the 1231 application. It should be noted that packet capture libraries do not 1232 necessarily provide packets in strict chronological order. 1234 TODO: Discuss the corner cases resulting from this in more detail. 1236 10.1. Matching algorithm 1238 A schematic representation of the algorithm for matching Q/R data 1239 items is shown in the following diagram: 1241 Figure showing the Query/Response matching algorithm format (PNG) [5] 1243 Figure showing the Query/Response matching algorithm format (SVG) [6] 1245 Further details of the algorithm are given in the following sections. 1247 10.2. Message identifiers 1249 10.2.1. Primary ID (required) 1251 A Primary ID is constructed for each message. It is composed of the 1252 following data: 1254 1. Source IP Address 1256 2. Destination IP Address 1258 3. Source Port 1260 4. Destination Port 1261 5. Transport 1263 6. DNS Message ID 1265 10.2.2.
Secondary ID (optional) 1267 If present, the first Question in the Question section is used as a 1268 secondary ID for each message. Note that there may be well formed 1269 DNS queries that have a QDCOUNT of 0, and some responses may have a 1270 QDCOUNT of 0 (for example, responses with RCODE=FORMERR or NOTIMP). 1271 In this case the secondary ID is not used in matching. 1273 10.3. Algorithm Parameters 1275 1. Query timeout 1277 2. Skew timeout 1279 10.4. Algorithm Requirements 1281 The algorithm is designed to handle the following input data: 1283 1. Multiple queries with the same Primary ID (but different 1284 Secondary ID) arriving before any responses for these queries are 1285 seen. 1287 2. Multiple queries with the same Primary and Secondary ID arriving 1288 before any responses for these queries are seen. 1290 3. Queries for which no later response can be found within the 1291 specified timeout. 1293 4. Responses for which no previous query can be found within the 1294 specified timeout. 1296 10.5. Algorithm Limitations 1298 For cases 1 and 2 listed in the above requirements, it is not 1299 possible to unambiguously match queries with responses. This 1300 algorithm chooses to match to the earliest query with the correct 1301 Primary and Secondary ID. 1303 10.6. Workspace 1305 A FIFO structure is used to hold the Q/R data items during 1306 processing. 1308 10.7. Output 1310 The output is a list of Q/R data items. Both the Query and Response 1311 elements are optional in these items, therefore Q/R data items have 1312 one of three types of content: 1314 1. A matched pair of query and response messages 1316 2. A query message with no response 1318 3. A response message with no query 1320 The timestamp of a list item is that of the query for cases 1 and 2 1321 and that of the response for case 3. 1323 10.8. Post Processing 1325 When ending capture, all remaining entries in the Q/R data item FIFO 1326 should be treated as timed out queries. 1328 11. 
Implementation Status 1330 [Note to RFC Editor: please remove this section and reference to 1331 [RFC7942] prior to publication.] 1333 This section records the status of known implementations of the 1334 protocol defined by this specification at the time of posting of this 1335 Internet-Draft, and is based on a proposal described in [RFC7942]. 1336 The description of implementations in this section is intended to 1337 assist the IETF in its decision processes in progressing drafts to 1338 RFCs. Please note that the listing of any individual implementation 1339 here does not imply endorsement by the IETF. Furthermore, no effort 1340 has been spent to verify the information presented here that was 1341 supplied by IETF contributors. This is not intended as, and must not 1342 be construed to be, a catalog of available implementations or their 1343 features. Readers are advised to note that other implementations may 1344 exist. 1346 According to [RFC7942], "this will allow reviewers and working groups 1347 to assign due consideration to documents that have the benefit of 1348 running code, which may serve as evidence of valuable experimentation 1349 and feedback that have made the implemented protocols more mature. 1350 It is up to the individual working groups to use this information as 1351 they see fit". 1353 11.1. DNS-STATS Compactor 1355 ICANN/Sinodun IT have developed an open source implementation called 1356 DNS-STATS Compactor. The Compactor is a suite of tools which can 1357 capture DNS traffic (from either a network interface or a PCAP file) 1358 and store it in the Compacted-DNS (C-DNS) file format. PCAP files 1359 for the captured traffic can also be reconstructed. See Compactor 1360 [7]. 1362 This implementation: 1364 o is mature but has only been deployed for testing in a single 1365 environment so is not yet classified as production ready. 
1367 o covers the whole of the specification described in the -03 draft 1368 with the exception of support for malformed packets (Section 8) 1369 and picosecond time resolution. (Note: this implementation does 1370 allow malformed packets to be dumped to a PCAP file.) 1372 o is released under the Mozilla Public License Version 2.0. 1374 o has a users mailing list available, see dns-stats-users [8]. 1376 There is also some discussion of issues encountered during 1377 development available at Compressing Pcap Files [9] and Packet 1378 Capture [10]. 1380 This information was last updated on 29th of June 2017. 1382 12. IANA Considerations 1384 None 1386 13. Security Considerations 1388 Any control interface MUST perform authentication and encryption. 1390 Any data upload MUST be authenticated and encrypted. 1392 14. Acknowledgements 1394 The authors wish to thank CZ.NIC, in particular Tomas Gavenciak, for 1395 many useful discussions on binary formats, compression and packet 1396 matching. Thanks also to Jan Vcelak and Wouter Wijngaards for discussions on 1397 name compression and to Paul Hoffman for a detailed review of the 1398 document and the C-DNS CDDL. 1400 Thanks also to Robert Edmonds and Jerry Lundstroem for review. 1402 Thanks also to Miek Gieben for mmark [11]. 1404 15.
Changelog 1406 draft-ietf-dnsop-dns-capture-format-04 1408 o Correct query-d0 to query-do in CDDL 1410 o Clarify that map keys are unsigned integers 1412 o Add Type to Class/type table 1414 o Clarify storage format in section 7.12 1416 draft-ietf-dnsop-dns-capture-format-03 1418 o Added an Implementation Status section 1420 draft-ietf-dnsop-dns-capture-format-02 1422 o Update qr_data_format.png to match CDDL 1424 o Editorial clarifications and improvements 1426 draft-ietf-dnsop-dns-capture-format-01 1428 o Many editorial improvements by Paul Hoffman 1430 o Included discussion of malformed packet handling 1432 o Improved Appendix C on Comparison of Binary Formats 1434 o Now using C-DNS field names in the tables in section 8 1436 o A handful of new fields included (CDDL updated) 1438 o Timestamps now include optional picoseconds 1440 o Added details of block statistics 1442 draft-ietf-dnsop-dns-capture-format-00 1444 o Changed dnstap.io to dnstap.info 1446 o qr_data_format.png was cut off at the bottom 1448 o Update authors address 1449 o Improve wording in Abstract 1451 o Changed DNS-STAT to C-DNS in CDDL 1453 o Set the format version in the CDDL 1455 o Added a TODO: Add block statistics 1457 o Added a TODO: Add extend to support pico/nano. Also do this for 1458 Time offset and Response delay 1460 o Added a TODO: Need to develop optional representation of malformed 1461 packets within C-DNS and what this means for packet matching. 1462 This may influence which fields are optional in the rest of the 1463 representation. 1465 o Added section on design goals to Introduction 1467 o Added a TODO: Can Class be optimised? Should a class of IN be 1468 inferred if not present? 1470 draft-dickinson-dnsop-dns-capture-format-00 1472 o Initial commit 1474 16. References 1476 16.1. Normative References 1478 [RFC1035] Mockapetris, P., "Domain names - implementation and 1479 specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, 1480 November 1987, . 
1482    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1483               Requirement Levels", BCP 14, RFC 2119,
1484               DOI 10.17487/RFC2119, March 1997, .

1487    [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
1488               Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
1489               October 2013, .

1491 16.2. Informative References

1493    [ditl]     DNS-OARC, "DITL", 2016, .

1496    [dnscap]   DNS-OARC, "DNSCAP", 2016, .

1499    [dnstap]   dnstap.info, "dnstap", 2016, .

1501    [dsc]      Wessels, D. and J. Lundstrom, "DSC", 2016,
1502               .

1504    [I-D.daley-dnsxml]
1505               Daley, J., Morris, S., and J. Dickinson, "dnsxml - A
1506               standard XML representation of DNS data", draft-daley-
1507               dnsxml-00 (work in progress), July 2013.

1509    [I-D.greevenbosch-appsawg-cbor-cddl]
1510               Birkholz, H., Vigano, C., and C. Bormann, "Concise data
1511               definition language (CDDL): a notational convention to
1512               express CBOR data structures", draft-greevenbosch-appsawg-
1513               cbor-cddl-11 (work in progress), July 2017.

1515    [I-D.hoffman-dns-in-json]
1516               Hoffman, P., "Representing DNS Messages in JSON", draft-
1517               hoffman-dns-in-json-13 (work in progress), October 2017.

1519    [packetq]  .SE - The Internet Infrastructure Foundation, "PacketQ",
1520               2014, .

1522    [pcap]     tcpdump.org, "PCAP", 2016, .

1524    [pcapng]   Tuexen, M., Risso, F., Bongertz, J., Combs, G., and G.
1525               Harris, "pcap-ng", 2016, .

1528    [RFC7159]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
1529               Interchange Format", RFC 7159, DOI 10.17487/RFC7159, March
1530               2014, .

1532    [RFC7942]  Sheffer, Y. and A. Farrel, "Improving Awareness of Running
1533               Code: The Implementation Status Section", BCP 205,
1534               RFC 7942, DOI 10.17487/RFC7942, July 2016,
1535               .

1537    [rrtypes]  IANA, "RR types", 2016, .

1540 16.3. URIs

1542    [1] https://github.com/dns-stats/draft-dns-capture-
1543        format/blob/master/draft-04/cdns_format.png

1545    [2] https://github.com/dns-stats/draft-dns-capture-
1546        format/blob/master/draft-04/cdns_format.svg

1548    [3] https://github.com/dns-stats/draft-dns-capture-
1549        format/blob/master/draft-04/qr_data_format.png

1551    [4] https://github.com/dns-stats/draft-dns-capture-
1552        format/blob/master/draft-04/qr_data_format.svg

1554    [5] https://github.com/dns-stats/draft-dns-capture-
1555        format/blob/master/draft-04/packet_matching.png

1557    [6] https://github.com/dns-stats/draft-dns-capture-
1558        format/blob/master/draft-04/packet_matching.svg

1560    [7] https://github.com/dns-stats/compactor/wiki

1562    [8] https://mm.dns-stats.org/mailman/listinfo/dns-stats-users

1564    [9] https://www.sinodun.com/2017/06/compressing-pcap-files/

1566    [10] https://www.sinodun.com/2017/06/more-on-debian-jessieubuntu-
1567         trusty-packet-capture-woes/

1569    [11] https://github.com/miekg/mmark

1571    [12] https://www.nlnetlabs.nl/projects/nsd/

1573    [13] https://www.knot-dns.cz/

1575    [14] https://avro.apache.org/

1577    [15] https://developers.google.com/protocol-buffers/

1579    [16] http://cbor.io

1581    [17] https://github.com/kubo/snzip

1583    [18] http://google.github.io/snappy/

1585    [19] http://lz4.github.io/lz4/

1587    [20] http://www.gzip.org/

1589    [21] http://facebook.github.io/zstd/

1591    [22] http://tukaani.org/xz/

1593    [23] https://github.com/dns-stats/draft-dns-capture-
1594         format/blob/master/file-size-versus-block-size.png

1596    [24] https://github.com/dns-stats/draft-dns-capture-
1597         format/blob/master/file-size-versus-block-size.svg

1599 Appendix A. CDDL

1601    ; CDDL specification of the file format for C-DNS,
1602    ; which describes a collection of DNS messages and
1603    ; traffic meta-data.
1605    File = [
1606        file-type-id : tstr, ; = "C-DNS"
1607        file-preamble : FilePreamble,
1608        file-blocks : [* Block],
1609    ]

1611    FilePreamble = {
1612        major-format-version => uint, ; = 1
1613        minor-format-version => uint, ; = 0
1614        ? private-version => uint,
1615        ? configuration => Configuration,
1616        ? generator-id => tstr,
1617        ? host-id => tstr,
1618    }

1620    major-format-version = 0
1621    minor-format-version = 1
1622    private-version = 2
1623    configuration = 3
1624    generator-id = 4
1625    host-id = 5

1627    Configuration = {
1628        ? query-timeout => uint,
1629        ? skew-timeout => uint,
1630        ? snaplen => uint,
1631        ? promisc => uint,
1632        ? interfaces => [* tstr],
1633        ? server-addresses => [* IPAddress], ; Hint for later analysis
1634        ? vlan-ids => [* uint],
1635        ? filter => tstr,
1636        ? query-options => QRCollectionSections,
1637        ? response-options => QRCollectionSections,
1638        ? accept-rr-types => [* uint],
1639        ? ignore-rr-types => [* uint],
1640        ? max-block-qr-items => uint,
1641        ? collect-malformed => uint,
1642    }

1644    QRCollectionSectionValues = &(
1645        question  : 0, ; Second & subsequent questions
1646        answer    : 1,
1647        authority : 2,
1648        additional: 3,
1649    )
1650    QRCollectionSections = uint .bits QRCollectionSectionValues

1652    query-timeout = 0
1653    skew-timeout = 1
1654    snaplen = 2
1655    promisc = 3
1656    interfaces = 4
1657    vlan-ids = 5
1658    filter = 6
1659    query-options = 7
1660    response-options = 8
1661    accept-rr-types = 9
1662    ignore-rr-types = 10
1663    server-addresses = 11
1664    max-block-qr-items = 12
1665    collect-malformed = 13

1667    Block = {
1668        preamble => BlockPreamble,
1669        ? statistics => BlockStatistics,
1670        tables => BlockTables,
1671        queries => [* QueryResponse],
1672        ? address-event-counts => [* AddressEventCount],
1673        ?
malformed-packet-data => [* MalformedPacket],
1674    }

1676    preamble = 0
1677    statistics = 1
1678    tables = 2
1679    queries = 3
1680    address-event-counts = 4
1681    malformed-packet-data = 5

1683    BlockPreamble = {
1684        earliest-time => Timeval

1686    }

1688    earliest-time = 1

1690    Timeval = [
1691        seconds : uint,
1692        microseconds : uint,
1693        ? picoseconds : uint,
1694    ]

1696    BlockStatistics = {
1697        ? total-packets => uint,
1698        ? total-pairs => uint,
1699        ? unmatched-queries => uint,
1700        ? unmatched-responses => uint,
1701        ? malformed-packets => uint,
1702    }

1704    total-packets = 0
1705    total-pairs = 1
1706    unmatched-queries = 2
1707    unmatched-responses = 3
1708    malformed-packets = 4

1710    BlockTables = {
1711        ip-address => [* IPAddress],
1712        classtype => [* ClassType],
1713        name-rdata => [* bstr], ; Holds both Name RDATA and RDATA
1714        query-sig => [* QuerySignature]
1715        ? qlist => [* QuestionList],
1716        ? qrr => [* Question],
1717        ? rrlist => [* RRList],
1718        ? rr => [* RR],
1719    }

1721    ip-address = 0
1722    classtype = 1
1723    name-rdata = 2
1724    query-sig = 3
1725    qlist = 4
1726    qrr = 5
1727    rrlist = 6
1728    rr = 7

1730    QueryResponse = {
1731        time-useconds => uint, ; Time offset from start of block
1732        ? time-pseconds => uint, ; in microseconds and picoseconds
1733        client-address-index => uint,
1734        client-port => uint,
1735        transaction-id => uint,
1736        query-signature-index => uint,
1737        ? client-hoplimit => uint,
1738        ? delay-useconds => int,
1739        ? delay-pseconds => int, ; Has same sign as delay-useconds
1740        ? query-name-index => uint,
1741        ? query-size => uint, ; DNS size of query
1742        ? response-size => uint, ; DNS size of response
1743        ? query-extended => QueryResponseExtended,
1744        ?
response-extended => QueryResponseExtended,
1745    }

1747    time-useconds = 0
1748    time-pseconds = 1
1749    client-address-index = 2
1750    client-port = 3
1751    transaction-id = 4
1752    query-signature-index = 5
1753    client-hoplimit = 6
1754    delay-useconds = 7
1755    delay-pseconds = 8
1756    query-name-index = 9
1757    query-size = 10
1758    response-size = 11
1759    query-extended = 12
1760    response-extended = 13

1762    ClassType = {
1763        type => uint,
1764        class => uint,
1765    }

1767    type = 0
1768    class = 1

1770    DNSFlagValues = &(
1771        query-cd   : 0,
1772        query-ad   : 1,
1773        query-z    : 2,
1774        query-ra   : 3,
1775        query-rd   : 4,
1776        query-tc   : 5,
1777        query-aa   : 6,
1778        query-do   : 7,
1779        response-cd: 8,
1780        response-ad: 9,
1781        response-z : 10,
1782        response-ra: 11,
1783        response-rd: 12,
1784        response-tc: 13,
1785        response-aa: 14,
1786    )
1787    DNSFlags = uint .bits DNSFlagValues

1789    QueryResponseFlagValues = &(
1790        has-query               : 0,
1791        has-response            : 1,
1792        query-has-question      : 2,
1793        query-has-opt           : 3,
1794        response-has-opt        : 4,
1795        response-has-no-question: 5,
1796    )
1797    QueryResponseFlags = uint .bits QueryResponseFlagValues

1799    TransportFlagValues = &(
1800        tcp               : 0,
1801        ipv6              : 1,
1802        query-trailingdata: 2,
1803    )
1804    TransportFlags = uint .bits TransportFlagValues

1806    QuerySignature = {
1807        server-address-index => uint,
1808        server-port => uint,
1809        transport-flags => TransportFlags,
1810        qr-sig-flags => QueryResponseFlags,
1811        ? query-opcode => uint,
1812        qr-dns-flags => DNSFlags,
1813        ? query-rcode => uint,
1814        ? query-classtype-index => uint,
1815        ? query-qd-count => uint,
1816        ? query-an-count => uint,
1817        ? query-ar-count => uint,
1818        ? query-ns-count => uint,
1819        ? edns-version => uint,
1820        ? udp-buf-size => uint,
1821        ? opt-rdata-index => uint,
1822        ?
response-rcode => uint,
1823    }

1825    server-address-index = 0
1826    server-port = 1
1827    transport-flags = 2
1828    qr-sig-flags = 3
1829    query-opcode = 4
1830    qr-dns-flags = 5
1831    query-rcode = 6
1832    query-classtype-index = 7
1833    query-qd-count = 8
1834    query-an-count = 9
1835    query-ar-count = 10
1836    query-ns-count = 11
1837    edns-version = 12
1838    udp-buf-size = 13
1839    opt-rdata-index = 14
1840    response-rcode = 15

1842    QuestionList = [
1843        * uint, ; Index of Question
1844    ]

1846    Question = { ; Second and subsequent questions
1847        name-index => uint, ; Index to a name in the name-rdata table
1848        classtype-index => uint,
1849    }

1851    name-index = 0
1852    classtype-index = 1

1854    RRList = [
1855        * uint, ; Index of RR
1856    ]

1858    RR = {
1859        name-index => uint, ; Index to a name in the name-rdata table
1860        classtype-index => uint,
1861        ttl => uint,
1862        rdata-index => uint, ; Index to RDATA in the name-rdata table
1863    }

1865    ttl = 2
1866    rdata-index = 3

1868    QueryResponseExtended = {
1869        ? question-index => uint, ; Index of QuestionList
1870        ? answer-index => uint, ; Index of RRList
1871        ? authority-index => uint,
1872        ? additional-index => uint,
1873    }

1875    question-index = 0
1876    answer-index = 1
1877    authority-index = 2
1878    additional-index = 3

1880    AddressEventCount = {
1881        ae-type => &AddressEventType,
1882        ? ae-code => uint,
1883        ae-address-index => uint,
1884        ae-count => uint,
1885    }

1887    ae-type = 0
1888    ae-code = 1
1889    ae-address-index = 2
1890    ae-count = 3

1892    AddressEventType = (
1893        tcp-reset              : 0,
1894        icmp-time-exceeded     : 1,
1895        icmp-dest-unreachable  : 2,
1896        icmpv6-time-exceeded   : 3,
1897        icmpv6-dest-unreachable: 4,
1898        icmpv6-packet-too-big  : 5,
1899    )

1901    MalformedPacket = {
1902        time-useconds => uint, ; Time offset from start of block
1903        ?
time-pseconds => uint, ; in microseconds and picoseconds
1904        packet-content => bstr, ; Raw packet contents
1905    }

1907    time-useconds = 0
1908    time-pseconds = 1
1909    packet-content = 2

1911    IPv4Address = bstr .size 4
1912    IPv6Address = bstr .size 16
1913    IPAddress = IPv4Address / IPv6Address

1915 Appendix B. DNS Name compression example

1917    The basic algorithm, which follows the guidance in [RFC1035], is
1918    simply to collect each name, and the offset in the packet at which it
1919    starts, during packet construction. As each name is added, it is
1920    offered to each of the collected names in order of collection,
1921    starting from the first name. If labels at the end of the name can
1922    be replaced with a reference back to part (or all) of the earlier
1923    name, and if the uncompressed part of the name is shorter than any
1924    compression already found, the earlier name is noted as the
1925    compression target for the name.

1927    The following tables illustrate the process. In an example packet,
1928    the first name is example.com.

1930    +---+-------------+--------------+--------------------+
1931    | N | Name        | Uncompressed | Compression Target |
1932    +---+-------------+--------------+--------------------+
1933    | 1 | example.com |              |                    |
1934    +---+-------------+--------------+--------------------+

1936    The next name added is bar.com. This is matched against example.com.
1937    The com part of this can be used as a compression target, with the
1938    remaining uncompressed part of the name being bar.

1940    +---+-------------+--------------+--------------------+
1941    | N | Name        | Uncompressed | Compression Target |
1942    +---+-------------+--------------+--------------------+
1943    | 1 | example.com |              |                    |
1944    | 2 | bar.com     | bar          | 1 + offset to com  |
1945    +---+-------------+--------------+--------------------+

1947    The third name added is www.bar.com.
This is first matched against
1948    example.com, and as before this is recorded as a compression target,
1949    with the remaining uncompressed part of the name being www.bar. It
1950    is then matched against the second name, which again can be a
1951    compression target. Because the remaining uncompressed part of the
1952    name is www, this is an improved compression, and so it is adopted.

1954    +---+-------------+--------------+--------------------+
1955    | N | Name        | Uncompressed | Compression Target |
1956    +---+-------------+--------------+--------------------+
1957    | 1 | example.com |              |                    |
1958    | 2 | bar.com     | bar          | 1 + offset to com  |
1959    | 3 | www.bar.com | www          | 2                  |
1960    +---+-------------+--------------+--------------------+

1962    As an optimization, if a name is already perfectly compressed (in
1963    other words, the uncompressed part of the name is empty), then no
1964    further names will be considered for compression.

1966 B.1. NSD compression algorithm

1968    Using the above basic algorithm, the packet lengths of responses
1969    generated by NSD [12] can be matched almost exactly. At the time of
1970    writing, a tiny number (less than 0.01%) of the reconstructed packets
1971    had incorrect lengths.

1973 B.2. Knot Authoritative compression algorithm

1975    The Knot Authoritative [13] name server uses different compression
1976    behavior, which is the result of internal optimization designed to
1977    balance runtime speed with compression size gains. In brief, and
1978    omitting complications, Knot Authoritative will only consider the
1979    QNAME and names in the immediately preceding RR section in an RRSET
1980    as compression targets.

1982    The heuristics described below can be implemented to mimic this
1983    behavior. While not perfect, they produce a match that is nearly,
1984    but not quite, as good as that achieved with NSD. The heuristics are:

1986    1.  A match is only perfect if the name is completely compressed AND
1987        the TYPE of the section in which the name occurs matches the TYPE
1988        of the name used as the compression target.

1990    2.  If the name occurs in RDATA:

1992        *  If the compression target name is in a query, then only the
1993           first RR in an RRSET can use that name as a compression
1994           target.

1996        *  The compression target name MUST be in RDATA.

1998        *  The name section TYPE must match the compression target name
1999           section TYPE.

2001        *  The compression target name MUST be in the immediately
2002           preceding RR in the RRSET.

2004    Using this algorithm, less than 0.1% of the reconstructed packets had
2005    incorrect lengths.

2007 B.3. Observed differences

2009    In sample traffic collected on a root name server, around 2-4% of
2010    responses generated by Knot had different packet lengths to those
2011    produced by NSD.

2013 Appendix C. Comparison of Binary Formats

2015    Several binary serialisation formats were considered, and for
2016    completeness were also compared to JSON.

2018    o  Apache Avro [14]. Data is stored according to a pre-defined
2019       schema. The schema itself is always included in the data file.
2021       Data can therefore be stored untagged, for a smaller serialisation
2022       size, and be written and read by an Avro library.

2024       *  At the time of writing, Avro libraries are available for C,
2025          C++, C#, Java, Python, Ruby and PHP. Optionally, tools are
2026          available for C++, Java and C# to generate code for encoding
2027          and decoding.

2029    o  Google Protocol Buffers [15]. Data is stored according to a
2030       pre-defined schema. The schema is used by a generator to generate
2031       code for encoding and decoding the data. Data can therefore be
2032       stored untagged, for a smaller serialisation size. The schema is
2033       not stored with the data, so, unlike Avro, the data cannot be read
2034       with a generic library.
2036       *  Code must be generated for a particular data schema to read
2037          and write data using that schema. At the time of writing, the
2038          Google code generator can generate code for encoding and
2039          decoding a schema for C++, Go, Java, Python, Ruby, C#,
2040          Objective-C, JavaScript and PHP.

2042    o  CBOR [16]. Defined in [RFC7049], this serialisation format is
2043       comparable to JSON but with a binary representation. It does not
2044       use a pre-defined schema, so data is always stored tagged.
2045       However, CBOR data schemas can be described using CDDL
2046       [I-D.greevenbosch-appsawg-cbor-cddl] and tools exist to verify
2047       that data files conform to the schema.

2049       *  CBOR is a simple format, and simple to implement. At the time
2050          of writing, the CBOR website lists implementations for 16
2051          languages.

2053    Avro and Protocol Buffers both allow storage of untagged data, but
2054    because they rely on the data schema for this, their implementation
2055    is considerably more complex than CBOR. Using Avro or Protocol
2056    Buffers in an unsupported environment would require notably greater
2057    development effort compared to CBOR.

2059    A test program was written which reads input from a PCAP file and
2060    writes output using one of two basic structures: either a simple
2061    structure, where each query/response pair is represented in a single
2062    record entry, or the C-DNS block structure.

2064    The resulting output files were then compressed using a variety of
2065    common general-purpose lossless compression tools to explore the
2066    compressibility of the formats. The compression tools employed were:

2068    o  snzip [17]. A command line compression tool based on the Google
2069       Snappy [18] library.

2071    o  lz4 [19]. The command line compression tool from the reference C
2072       LZ4 implementation.

2074    o  gzip [20]. The ubiquitous GNU zip tool.

2076    o  zstd [21]. Compression using the Zstandard algorithm.

2078    o  xz [22].
A popular compression tool noted for high compression.

2080    In all cases the compression tools were run using their default
2081    settings.

2083    Note that this draft does not mandate the use of compression, nor any
2084    particular compression scheme, but it anticipates that in practice
2085    output data will be subject to general-purpose compression, and so
2086    this should be taken into consideration.

2088    "test.pcap", a 662Mb capture of sample data from a root instance was
2089    used for the comparison. The following table shows the formatted
2090    size and size after compression (abbreviated to Comp. in the table
2091    headers), together with the task resident set size (RSS) and the user
2092    time taken by the compression. File sizes are in Mb, RSS in kb and
2093    user time in seconds.

2095    +-------------+-----------+-------+------------+-------+-----------+
2096    | Format      | File size | Comp. | Comp. size | RSS   | User time |
2097    +-------------+-----------+-------+------------+-------+-----------+
2098    | PCAP        |    661.87 | snzip |     212.48 |  2696 |      1.26 |
2099    |             |           | lz4   |     181.58 |  6336 |      1.35 |
2100    |             |           | gzip  |     153.46 |  1428 |     18.20 |
2101    |             |           | zstd  |      87.07 |  3544 |      4.27 |
2102    |             |           | xz    |      49.09 | 97416 |    160.79 |
2103    |             |           |       |            |       |           |
2104    | JSON simple |   4113.92 | snzip |     603.78 |  2656 |      5.72 |
2105    |             |           | lz4   |     386.42 |  5636 |      5.25 |
2106    |             |           | gzip  |     271.11 |  1492 |     73.00 |
2107    |             |           | zstd  |     133.43 |  3284 |      8.68 |
2108    |             |           | xz    |      51.98 | 97412 |    600.74 |
2109    |             |           |       |            |       |           |
2110    | Avro simple |    640.45 | snzip |     148.98 |  2656 |      0.90 |
2111    |             |           | lz4   |     111.92 |  5828 |      0.99 |
2112    |             |           | gzip  |     103.07 |  1540 |     11.52 |
2113    |             |           | zstd  |      49.08 |  3524 |      2.50 |
2114    |             |           | xz    |      22.87 | 97308 |     90.34 |
2115    |             |           |       |            |       |           |
2116    | CBOR simple |    764.82 | snzip |     164.57 |  2664 |      1.11 |
2117    |             |           | lz4   |     120.98 |  5892 |      1.13 |
2118    |             |           | gzip  |     110.61 |  1428 |     12.88 |
2119    |             |           | zstd  |      54.14 |  3224 |      2.77 |
2120    |             |           | xz    |      23.43 | 97276 |    111.48 |
2121    |             |           |       |            |       |           |
2122    | PBuf simple |    749.51 | snzip |     167.16 |  2660 |      1.08 |
2123    |             |           | lz4   |     123.09 |  5824 |      1.14 |
2124    |             |           | gzip  |     112.05 |  1424 |     12.75 |
2125    |             |           | zstd  |      53.39 |  3388 |      2.76 |
2126    |             |           | xz    |      23.99 | 97348 |    106.47 |
2127    |             |           |       |            |       |           |
2128    | JSON block  |    519.77 | snzip |     106.12 |  2812 |      0.93 |
2129    |             |           | lz4   |     104.34 |  6080 |      0.97 |
2130    |             |           | gzip  |      57.97 |  1604 |     12.70 |
2131    |             |           | zstd  |      61.51 |  3396 |      3.45 |
2132    |             |           | xz    |      27.67 | 97524 |    169.10 |
2133    |             |           |       |            |       |           |
2134    | Avro block  |     60.45 | snzip |      48.38 |  2688 |      0.20 |
2135    |             |           | lz4   |      48.78 |  8540 |      0.22 |
2136    |             |           | gzip  |      39.62 |  1576 |      2.92 |
2137    |             |           | zstd  |      29.63 |  3612 |      1.25 |
2138    |             |           | xz    |      18.28 | 97564 |     25.81 |
2139    |             |           |       |            |       |           |
2140    | CBOR block  |     75.25 | snzip |      53.27 |  2684 |      0.24 |
2141    |             |           | lz4   |      51.88 |  8008 |      0.28 |
2142    |             |           | gzip  |      41.17 |  1548 |      4.36 |
2143    |             |           | zstd  |      30.61 |  3476 |      1.48 |
2144    |             |           | xz    |      18.15 | 97556 |     38.78 |
2145    |             |           |       |            |       |           |
2146    | PBuf block  |     67.98 | snzip |      51.10 |  2636 |      0.24 |
2147    |             |           | lz4   |      52.39 |  8304 |      0.24 |
2148    |             |           | gzip  |      40.19 |  1520 |      3.63 |
2149    |             |           | zstd  |      31.61 |  3576 |      1.40 |
2150    |             |           | xz    |      17.94 | 97440 |     33.99 |
2151    +-------------+-----------+-------+------------+-------+-----------+

2153    The above results are discussed in the following sections.

2155 C.1. Comparison with full PCAP files

2157    An important first consideration is whether moving away from PCAP
2158    offers significant benefits.

2160    The simple binary formats are typically larger than PCAP, even though
2161    they omit some information such as Ethernet MAC addresses. But not
2162    only do they require less CPU to compress than PCAP, the resulting
2163    compressed files are smaller than compressed PCAP.

2165 C.2. Simple versus block coding

2167    The intention of the block coding is to perform data de-duplication
2168    on query/response records within the block. The simple and block
2169    formats above store exactly the same information for each query/
2170    response record.
This information is parsed from the DNS traffic in
2171    the input PCAP file, and in all cases each field has an identifier
2172    and the field data is typed.

2174    The data de-duplication in the block formats shows an order of
2175    magnitude reduction in format file size compared with the simple
2176    formats. As would be expected, the compression tools are able to
2177    find and exploit a lot of this duplication, but as the
2178    de-duplication process uses knowledge of DNS traffic, it is able to
2179    retain a size advantage. This advantage reduces as stronger
2180    compression is applied, as again would be expected, but even with the
2181    strongest compression applied the block formatted data remains around
2182    75% of the size of the simple format and its compression requires
2183    roughly a third of the CPU time.

2185 C.3. Binary versus text formats

2187    Text data formats offer many advantages over binary formats,
2188    particularly in the areas of ad-hoc data inspection and extraction.
2189    It was therefore felt worthwhile to carry out a direct comparison,
2190    implementing JSON versions of the simple and block formats.

2192    Concentrating on JSON block format, the format files produced are a
2193    significant fraction of an order of magnitude larger than binary
2194    formats. The impact on file size after compression is as might be
2195    expected from that starting point; the stronger compression produces
2196    files that are 150% of the size of similarly compressed binary
2197    format, and require over 4x more CPU to compress.

2199 C.4. Performance

2201    Concentrating again on the block formats, all three produce format
2202    files that are close to an order of magnitude smaller than the
2203    original "test.pcap" file. CBOR produces the largest files and Avro
2204    the smallest, 20% smaller than CBOR.

2206    However, once compression is taken into account, the size difference
2207    narrows. At medium compression (with gzip), the size difference is
2208    4%.
Using strong compression (with xz) the difference reduces to 2%,
2209    with Avro the largest and Protocol Buffers the smallest, although
2210    CBOR and Protocol Buffers require slightly more compression CPU.

2212    The measurements presented above do not include data on the CPU
2213    required to generate the format files. Measurements indicate that
2214    writing Avro requires 10% more CPU than CBOR or Protocol Buffers. It
2215    appears, therefore, that Avro's advantage in compression CPU usage is
2216    probably offset by a larger CPU requirement in writing Avro.

2218 C.5. Conclusions

2220    The above assessments lead us to the choice of a binary format file
2221    using blocking.

2223    As noted previously, this draft anticipates that output data will be
2224    subject to compression. There is no compelling case for one
2225    particular binary serialisation format in terms of either final file
2226    size or machine resources consumed, so the choice must be largely
2227    based on other factors. CBOR was therefore chosen as the binary
2228    serialisation format for the reasons listed in Section 6.

2230 C.6. Block size choice

2232    Given the choice of a CBOR format using blocking, the question arises
2233    of what an appropriate default value for the maximum number of query/
2234    response pairs in a block should be. This has two components; what
2235    is the impact on performance of using different block sizes in the
2236    format file, and what is the impact on the size of the format file
2237    before and after compression.

2239    The following table addresses the performance question, showing the
2240    impact on the performance of a C++ program converting "test.pcap" to
2241    C-DNS. File size is in Mb, resident set size (RSS) in kb.
2243    +------------+-----------+--------+-----------+
2244    | Block size | File size | RSS    | User time |
2245    +------------+-----------+--------+-----------+
2246    |       1000 |    133.46 | 612.27 |     15.25 |
2247    |       5000 |     89.85 | 676.82 |     14.99 |
2248    |      10000 |     76.87 | 752.40 |     14.53 |
2249    |      20000 |     67.86 | 750.75 |     14.49 |
2250    |      40000 |     61.88 | 736.30 |     14.29 |
2251    |      80000 |     58.08 | 694.16 |     14.28 |
2252    |     160000 |     55.94 | 733.84 |     14.44 |
2253    |     320000 |     54.41 | 799.20 |     13.97 |
2254    +------------+-----------+--------+-----------+

2256    Increasing block size, therefore, tends to increase maximum RSS a
2257    little, with no significant effect (if anything a small reduction) on
2258    CPU consumption.

2260    The following figure plots the effect of increasing block size on
2261    output file size for different compressions.

2263    Figure showing effect of block size on file size (PNG) [23]

2265    Figure showing effect of block size on file size (SVG) [24]

2267    From the above, there is obviously scope for tuning the default block
2268    size to the compression being employed, traffic characteristics,
2269    frequency of output file rollover etc. Using a strong compression,
2270    block sizes over 10,000 query/response pairs would seem to offer
2271    limited improvements.

2273 Authors' Addresses

2275    John Dickinson
2276    Sinodun IT
2277    Magdalen Centre
2278    Oxford Science Park
2279    Oxford OX4 4GA

2281    Email: jad@sinodun.com

2283    Jim Hague
2284    Sinodun IT
2285    Magdalen Centre
2286    Oxford Science Park
2287    Oxford OX4 4GA

2289    Email: jim@sinodun.com

2291    Sara Dickinson
2292    Sinodun IT
2293    Magdalen Centre
2294    Oxford Science Park
2295    Oxford OX4 4GA

2297    Email: sara@sinodun.com

2298    Terry Manderson
2299    ICANN
2300    12025 Waterfront Drive
2301    Suite 300
2302    Los Angeles CA 90094-2536

2304    Email: terry.manderson@icann.org

2306    John Bond
2307    ICANN
2308    12025 Waterfront Drive
2309    Suite 300
2310    Los Angeles CA 90094-2536

2312    Email: john.bond@icann.org