dnsop                                                       J. Dickinson
Internet-Draft                                                  J. Hague
Intended status: Standards Track                            S. Dickinson
Expires: May 4, 2017                                          Sinodun IT
                                                            T. Manderson
                                                                 J. Bond
                                                                   ICANN
                                                        October 31, 2016

                   C-DNS: A DNS Packet Capture Format
              draft-dickinson-dnsop-dns-capture-format-00

Abstract

   This document describes a data representation for collections of DNS
   messages. The format is designed for efficient storage of large
   packet captures of DNS traffic; it attempts to minimize the size of
   such packet capture files but retain the full DNS message contents
   along with the most useful transport meta data. It is intended to
   assist with the development of DNS traffic monitoring applications
   and provide a more efficient data exchange format than alternatives
   such as PCAP files.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 4, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Terminology
   3.  Data Collection Use Cases
   4.  Design Considerations
   5.  C-DNS conceptual overview
   6.  Choice of CBOR
   7.  C-DNS CBOR format
     7.1.  CDDL definition
     7.2.  Format overview
     7.3.  File header contents
     7.4.  File preamble contents
     7.5.  Configuration contents
     7.6.  Block contents
     7.7.  Block preamble map
     7.8.  Block table map
     7.9.  IP address table
     7.10. Class/Type table
     7.11. Name/RDATA table
     7.12. Query Signature table
     7.13. Question table
     7.14. Resource Record (RR) table
     7.15. Question list table
     7.16. Resource Record list table
     7.17. Query/Response data
     7.18. Address Event counts
   8.  C-DNS to PCAP
     8.1.  Name Compression
   9.  Data Collection
     9.1.  Matching algorithm
     9.2.  Message identifiers
       9.2.1.  Primary ID (required)
       9.2.2.  Secondary ID (optional)
     9.3.  Algorithm Parameters
     9.4.  Algorithm Requirements
     9.5.  Algorithm Limitations
     9.6.  Workspace
     9.7.  Output
     9.8.  Post Processing
   10. IANA Considerations
   11. Security Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
     13.3.  URIs
   Appendix A.  CDDL
   Appendix B.  DNS Name compression example
     B.1.  NSD compression algorithm
     B.2.  Knot Authoritative compression algorithm
     B.3.  Observed differences
   Appendix C.  Comparison of Binary Formats
   Appendix D.  Sample data on the C-DNS format
     D.1.  Comparison to full PCAPS
     D.2.  Block size choices
     D.3.  Blocking vs more simple output
   Authors' Addresses

1.  Introduction

   There has long been a need to collect DNS queries and responses on
   authoritative and recursive name servers for monitoring and analysis.
   This data is used in a number of ways including traffic monitoring,
   analyzing network attacks and DITL [ditl].

   A wide variety of tools already exist to facilitate the collection of
   DNS traffic data, including DSC [dsc], packetq [packetq], dnscap
   [dnscap] and dnstap [dnstap]. However, there is no standard exchange
   format for large DNS packet captures, and PCAP [pcap] or PCAP-NG
   [pcapng] are typically used in practice.

   There has also been work on using other text-based formats to
   describe DNS packets [I-D.daley-dnsxml], [I-D.hoffman-dns-in-json],
   but these are largely aimed at producing convenient representations
   of single messages.

   Many DNS operators may receive hundreds of thousands of queries per
   second on a single name server instance, so a mechanism to minimize
   the storage size (and therefore upload overhead) of the data
   collected is highly desirable.

   This document focuses on the problem of capturing and storing large
   packet capture files of DNS traffic.

   This document contains:

   o  A discussion of some common use cases in which such DNS data is
      collected. See Section 3.

   o  A discussion of the major design considerations in developing an
      efficient data representation for collections of DNS messages.
      See Section 4.

   o  A definition of a CBOR [RFC7049] representation of a collection of
      DNS messages. This will be referred to as the C-DNS format
      (Compacted-DNS). See Section 7.

   o  Notes on converting C-DNS back to PCAP format. See Section 8.

   o  Some high level implementation considerations for applications
      designed to produce C-DNS, e.g. a query response matching
      algorithm. See Section 9.

2.  Requirements Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Data Collection Use Cases

   In an ideal world it would be optimal to collect full packet captures
   of all packets going in or out of a name server. However, there are
   several design choices or other limitations that are common to many
   DNS installations and operators.

   o  Servers are hosted in a variety of situations

      *  Operator self hosted servers

      *  Third party hosting (including multiple third parties)

      *  Third party hardware (including multiple third parties)

   o  Data is collected under different conditions

      *  On well provisioned servers running in a steady state

      *  On heavily loaded servers

      *  On virtualized servers

      *  On servers that are under attack

      *  On servers that are unwitting intermediaries in attacks

   o  Traffic can be collected via a variety of mechanisms

      *  On the same hardware as the name server itself

      *  Using a network tap to listen in from another server

      *  Using port mirroring to listen in from another server

   o  The capabilities of data collection (and upload) networks vary

      *  Out-of-band networks with the same capacity as the in-band
         network

      *  Out-of-band networks with less capacity than the in-band
         network

      *  Everything on the in-band network

   Clearly, there is a wide range of use cases from very limited data
   collection environments (third party hardware, servers that are under
   attack, packet capture on the name server itself and no out-of-band
   network) to 'limitless' environments (self hosted, well provisioned
   servers, using a network tap or port mirroring with an out-of-band
   network with the same capacity as the in-band network). In the
   former, it is infeasible to reliably collect full PCAPs, especially
   if the server is under attack. In the latter case, collection of full
   PCAPs may be reasonable.

   As a result of these restrictions the data format discussed below was
   designed with the most limited use case in mind, such that:

   o  Data collection will occur on the same hardware as the name server
      itself.

   o  Collected data will be stored on the same hardware as the name
      server itself, at least temporarily.

   o  Collected data being returned to some central analysis system will
      use the same network interface as the DNS queries and responses.

   o  There are multiple third party servers involved.

   and therefore minimal storage size of the capture files is a major
   factor.

   Another consideration for any application that records DNS traffic is
   that the running of the name server software and the transmission of
   DNS queries and responses is the most important job of a name server.
   Any data collection system co-located with the name server will need
   to be intelligent enough to carefully manage its CPU, disk, memory
   and network utilization. Hence this use case benefits from a format
   that has a relatively low overhead to produce and minimizes the
   requirement for further potentially costly compression.

   However, it was also essential that interoperability with less
   restricted infrastructure was maintained. In particular it is highly
   desirable that the resulting collection format should facilitate the
   re-creation of common formats (such as PCAPs) that are as close to
   the original as is realistic given the restrictions above.

4.  Design Considerations

   This section presents some of the major design considerations used in
   the development of the C-DNS format.

   o  The basic unit of data is a combined DNS Query and the associated
      Response (a 'Q/R data item'). The same structure will be used for
      unmatched queries and responses. Queries without responses will
      be captured omitting the Response data. Responses without queries
      will be captured omitting the Query data (but using the Query
      section from the Response, if present, as an identifying QNAME).

   Rationale: A Query and Response represents the basic level of a
   client's interaction with the server. Also, combining the Query and
   Response into one item lowers storage requirements due to commonality
   in the data in most cases.

   o  Each Q/R data item will comprise a default Q/R data description
      and a set of optional sections. Inclusion of optional sections
      shall be configurable.

   Rationale: Different users will have different requirements for data
   to be available for analysis. Users with minimal requirements should
   not have to pay the cost of recording full data; however, this will
   limit the ability to reconstruct PCAPs. For example, omitting the
   Resource Records from a Response will reduce the file size, and in
   principle responses can be synthesized if there is enough context.

   o  Multiple Q/R items will be collected into blocks in the format.
      Common data in a block will be abstracted and referenced from
      individual Q/R items by indexing. The maximum number of Q/R items
      in a block will be configurable.

   Rationale: This blocking and indexing provides a significant
   reduction in the volume of file data generated. Whilst introducing
   complexity, it provides compression of the data that makes use of
   knowledge of the DNS packet structure.

   [TODO: Further discussion on commonality between DNS packets e.g.

   o  common query signatures

   o  for the authoritative case there is a finite set of valid
      responses and much commonality in NXDOMAIN responses]

   It is anticipated that the files produced will be subject to further
   compression using general purpose compression tools. Measurements
   show that blocking significantly reduces the CPU required to perform
   such strong compression. See Appendix D.

   o  Meta-data about other packets received should also be included in
      each block. For example, counts of malformed DNS packets and non-
      DNS packets (e.g. ICMP, TCP resets) sent to the server are of
      interest.

   It should be noted that any structured capture format that does not
   capture the DNS payload byte for byte will likely be limited to some
   extent in that it cannot represent 'malformed' DNS packets. Only
   those packets that can be transformed reasonably into the structured
   format can be represented by it. So if a query is malformed, the
   (well formed) DNS response with error code FORMERR will appear as
   'unmatched'.

   [TODO: Need further discussion of well-formed vs malformed packets
   and how name servers view this definition.]

   Packets such as those described above can be separately recorded in a
   PCAP file for later analysis.

5.  C-DNS conceptual overview

   The following figures show purely schematic representations of the
   C-DNS format to convey its high-level structure. Section 7 provides
   a detailed discussion of the CBOR representation and individual
   elements.

   Figure showing the C-DNS format (PNG) [1]

   Figure showing the C-DNS format (SVG) [2]

   Figure showing the Q/R data item and Block tables format (PNG) [3]

   Figure showing the Q/R data item and Block tables format (SVG) [4]

6.  Choice of CBOR

   This document presents a detailed format description using CBOR, the
   Concise Binary Object Representation defined in [RFC7049].

   The choice of CBOR was made taking a number of factors into account.

   o  CBOR is a binary representation, and so economical in storage
      space.

   o  Other similar representations were investigated, and whilst all
      had attractive features, none had a significant advantage over
      CBOR. See Appendix C and Appendix D for some discussion of this.

   o  CBOR is an IETF Standard and familiar to IETF participants, and
      being based on the successful JSON text format, requires very
      little familiarization for those in the wider industry.

   o  CBOR can also be easily converted to JSON for debugging and other
      human inspection requirements.

   o  CBOR data schemas can be described using CDDL
      [I-D.greevenbosch-appsawg-cbor-cddl].

7.  C-DNS CBOR format

7.1.  CDDL definition

   The CDDL definition for the C-DNS format is given in Appendix A.

7.2.  Format overview

   A C-DNS file begins with a file header containing a file type
   identifier and preamble. The preamble contains information on the
   collection settings.

   This is followed by a series of data blocks.

   A block consists of a block header, containing various tables of
   common data, and some statistics for the traffic received over the
   block. The block header is then followed by a list of the Q/R pairs
   detailing the queries and responses received during the block. The
   list of Q/R pairs is in turn followed by a list of per-client counts
   of particular IP events that occurred during collection of the block
   data.

   The exact nature of the DNS data will affect what block size is the
   best fit; however, sample data for a root server indicated that block
   sizes in the low thousands give good results. See Appendix D.2 for
   more details.

   If no field type is specified then the field is unsigned.

   In the following:

   o  For all quantities that contain bit flags, bit 0 indicates the
      least significant bit.

   o  Items described as indexes are the index of the data item in the
      referenced table. Indexes are 1-based. An index value of 0 is
      reserved to mean not present.

7.3.  File header contents

   The file header contains the following:

   +-------------+---------------+-------------------------------------+
   | Field       | Type          | Description                         |
   +-------------+---------------+-------------------------------------+
   | File type   | Text string   | String identifying the file type    |
   | ID          |               |                                     |
   |             |               |                                     |
   | File        | Map of items  | Collection information for the      |
   | preamble    |               | whole file.                         |
   |             |               |                                     |
   | File Blocks | Array of      | The data blocks                     |
   |             | Blocks        |                                     |
   +-------------+---------------+-------------------------------------+

7.4.  File preamble contents

   The file preamble contains the following:

   +---------------+----------+----------------------------------------+
   | Field         | Type     | Description                            |
   +---------------+----------+----------------------------------------+
   | Format        | Unsigned | Indicates version of format used in    |
   | version       |          | file.                                  |
   |               |          |                                        |
   | Configuration | Map of   | The collection configuration.          |
   |               | items    | Optional.                              |
   |               |          |                                        |
   | Generator ID  | Text     | String identifying the collection      |
   |               | string   | program. Optional.                     |
   |               |          |                                        |
   | Host ID       | Text     | String identifying the collecting      |
   |               | string   | host. Blank if converting an existing  |
   |               |          | PCAP file. Optional.                   |
   +---------------+----------+----------------------------------------+

7.5.  Configuration contents

   The collection configuration contains the following items. All are
   optional.

   +-------------+----------+------------------------------------------+
   | Field       | Type     | Description                              |
   +-------------+----------+------------------------------------------+
   | Query       | Unsigned | To be matched with a query, a response   |
   | timeout     |          | must arrive within this number of        |
   |             |          | seconds.                                 |
   |             |          |                                          |
   | Skew        | Unsigned | The network stack may report a response  |
   | timeout     |          | before the corresponding query. A        |
   |             |          | response is not considered to be missing |
   |             |          | a query until after this many            |
   |             |          | microseconds.                            |
   |             |          |                                          |
   | Snap length | Unsigned | Collect up to this many bytes per        |
   |             |          | packet.                                  |
   |             |          |                                          |
   | Promiscuous | Unsigned | 1 if promiscuous mode was enabled on the |
   | mode        |          | interface, 0 otherwise.                  |
   |             |          |                                          |
   | Interfaces  | Array of | Identifiers of the interfaces used for   |
   |             | text     | collection.                              |
   |             | strings  |                                          |
   |             |          |                                          |
   | VLAN IDs    | Array of | Identifiers of VLANs selected for        |
   |             | unsigned | collection.                              |
   |             |          |                                          |
   | Filter      | Text     | "tcpdump" [pcap] style filter for input. |
   |             | string   |                                          |
   |             |          |                                          |
   | Query       | Unsigned | Bit flags indicating sections in Query   |
   | collection  |          | packets to be collected.                 |
   | options     |          |                                          |
   |             |          | Bit 0. Collect second and subsequent     |
   |             |          | question sections.                       |
   |             |          | Bit 1. Collect Answer sections.          |
   |             |          | Bit 2. Collect Authority sections.       |
   |             |          | Bit 3. Collect Additional sections.      |
   |             |          |                                          |
   | Response    | Unsigned | Bit flags indicating sections in         |
   | collection  |          | Response packets to be collected.        |
   | options     |          |                                          |
   |             |          | Bit 0. Collect second and subsequent     |
   |             |          | question sections.                       |
   |             |          | Bit 1. Collect Answer sections.          |
   |             |          | Bit 2. Collect Authority sections.       |
   |             |          | Bit 3. Collect Additional sections.      |
   |             |          |                                          |
   | Accept RR   | Array of | A set of RR type names [rrtypes]. If not |
   | types       | text     | empty, only the nominated RR types are   |
   |             | strings  | collected.                               |
   |             |          |                                          |
   | Ignore RR   | Array of | A set of RR type names [rrtypes]. If not |
   | types       | text     | empty, all RR types are collected except |
   |             | strings  | those listed. If present, this item must |
   |             |          | be empty if a non-empty list of Accept   |
   |             |          | RR types is present.                     |
   +-------------+----------+------------------------------------------+

7.6.  Block contents

   Each block contains the following:

   +-------------+------------------+----------------------------------+
   | Field       | Type             | Description                      |
   +-------------+------------------+----------------------------------+
   | Block       | Map of items     | Overall information for the      |
   | preamble    |                  | block.                           |
   |             |                  |                                  |
   | Block       | Map of           | Statistics about the block.      |
   | statistics  | statistics       |                                  |
   |             |                  |                                  |
   | Block       | Map of tables    | The tables containing data       |
   | tables      |                  | referenced by individual Q/R     |
   |             |                  | entries.                         |
   |             |                  |                                  |
   | Q/Rs        | Array of Q/Rs    | Details of individual Q/R pairs. |
   |             |                  |                                  |
   | Address     | Array of Address | Per client counts of ICMP        |
   | Event       | Event counts     | messages and TCP resets.         |
   | Counts      |                  |                                  |
   +-------------+------------------+----------------------------------+

7.7.  Block preamble map

   The block preamble map contains overall information for the block.

   +-----------+----------+--------------------------------------------+
   | Field     | Type     | Description                                |
   +-----------+----------+--------------------------------------------+
   | Timestamp | Array of | A timestamp for the earliest record in the |
   |           | unsigned | block. The timestamp is specified as a     |
   |           |          | CBOR array with two elements as in POSIX   |
   |           |          | struct timeval. The first element is an    |
   |           |          | unsigned integer time_t and the second is  |
   |           |          | an unsigned integer number of              |
   |           |          | microseconds. The latter is always a value |
   |           |          | between 0 and 999,999.                     |
   +-----------+----------+--------------------------------------------+

7.8.  Block table map

   The block table map contains the block tables. Each element, or
   table, is an array. The following tables detail the contents of each
   block table.

   The Present column in the following tables indicates the
   circumstances when an optional field will be present. A Q/R pair may
   be:

   o  A Query plus a Response.

   o  A Query without a Response.

   o  A Response without a Query.

   Also:

   o  A Query and/or a Response may contain an OPT section.

   o  A Question may or may not be present. If the Query is available,
      the Question section of the Query is used. If no Query is
      available, the Question section of the Response is used. Unless
      otherwise noted, a Question refers to the first Question in the
      Question section.

   So, for example, a field listed with a Present value of QUERY is
   present whenever the Q/R pair contains a Query.
   If the pair contains a Response only, the field will not be present.

7.9.  IP address table

   This table holds all client and server IP addresses in the block.
   Each item in the table is a single IP address.

   +---------+--------+------------------------------------------------+
   | Field   | Type   | Description                                    |
   +---------+--------+------------------------------------------------+
   | Address | Byte   | The IP address, in network byte order. The     |
   |         | string | string is 4 bytes long for an IPv4 address, 16 |
   |         |        | bytes long for an IPv6 address.                |
   +---------+--------+------------------------------------------------+

7.10.  Class/Type table

   This table holds pairs of RR CLASS and TYPE values. Each item in the
   table is a CBOR map.

   +-------+--------------+
   | Field | Description  |
   +-------+--------------+
   | Class | CLASS value. |
   |       |              |
   | Type  | TYPE value.  |
   +-------+--------------+

7.11.  Name/RDATA table

   This table holds the contents of all NAME or RDATA items in the
   block. Each item in the table is the content of a single NAME or
   RDATA.

   +-------+--------+--------------------------------------------------+
   | Field | Type   | Description                                      |
   +-------+--------+--------------------------------------------------+
   | Data  | Byte   | The NAME or RDATA contents. NAMEs, and labels    |
   |       | string | within RDATA contents, are in uncompressed label |
   |       |        | format.                                          |
   +-------+--------+--------------------------------------------------+

7.12.  Query Signature table

   This table holds elements of the Q/R data that are often common
   between different individual Q/R records. Each item in the table is
   a CBOR map. Each item in the map has an unsigned value and an
   unsigned key.
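
   The idea behind this table, storing each distinct combination of
   common Q/R fields once per block and referring to it by index, can
   be sketched as follows. This is an illustration only: the
   SignatureTable class, its intern method, and the three-field tuple
   used as a signature are hypothetical names chosen for the example
   and are not defined by this document. The 1-based indexing follows
   Section 7.2.

```python
from typing import Dict, List, Tuple

# Hypothetical signature: (server address index, server port,
# transport flags). A real Query Signature holds more fields; three
# are enough to illustrate the de-duplication.
Signature = Tuple[int, int, int]


class SignatureTable:
    """Stores each distinct signature once, handing out 1-based indexes."""

    def __init__(self) -> None:
        self.items: List[Signature] = []
        self._index: Dict[Signature, int] = {}

    def intern(self, sig: Signature) -> int:
        """Return the 1-based index of sig, adding it if not yet present."""
        idx = self._index.get(sig)
        if idx is None:
            self.items.append(sig)
            idx = len(self.items)  # 1-based; 0 is reserved for "not present"
            self._index[sig] = idx
        return idx


table = SignatureTable()
# Three Q/R items; the first and third share the same signature.
refs = [
    table.intern((1, 53, 0b00)),  # UDP over IPv4
    table.intern((1, 53, 0b01)),  # TCP over IPv4
    table.intern((1, 53, 0b00)),  # UDP over IPv4 again
]
```

   In this sketch the three Q/R items need only two table entries, and
   each item carries a small index instead of a full copy of the shared
   fields.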

   The following abbreviations are used in the Present (P) column:

   o  Q = QUERY

   o  A = Always

   o  QT = QUESTION

   o  QO = QUERY, OPT

   o  QR = QUERY & RESPONSE

   o  R = RESPONSE

   +------------+----+-------------------------------------------------+
   | Field      | P  | Description                                     |
   +------------+----+-------------------------------------------------+
   | Server     | A  | The index in the IP address table of the server |
   | address    |    | IP address.                                     |
   |            |    |                                                 |
   | Server     | A  | The server port.                                |
   | port       |    |                                                 |
   |            |    |                                                 |
   | Transport  | A  | Bit flags describing the protocol used to       |
   | flags      |    | service the query. Bit 0 is the least           |
   |            |    | significant bit.                                |
   |            |    | Bit 0. Transport type. 0 = UDP, 1 = TCP.        |
   |            |    | Bit 1. IP type. 0 = IPv4, 1 = IPv6.             |
   |            |    |                                                 |
   | Q/R        | A  | Bit flags indicating information present in     |
   | signature  |    | this Q/R pair. Bit 0 is the least significant   |
   | flags      |    | bit.                                            |
   |            |    | Bit 0. 1 if a Query is present.                 |
   |            |    | Bit 1. 1 if a Response is present.              |
   |            |    | Bit 2. 1 if one or more Questions are present.  |
   |            |    | Bit 3. 1 if a Query is present and it has an    |
   |            |    | OPT Resource Record.                            |
   |            |    | Bit 4. 1 if a Response is present and it has an |
   |            |    | OPT Resource Record.                            |
   |            |    | Bit 5. 1 if a Response is present but has no    |
   |            |    | Question.                                       |
   |            |    |                                                 |
   | Query      | Q  | Query OPCODE.                                   |
   | OPCODE     |    |                                                 |
   |            |    |                                                 |
   | Q/R DNS    | A  | Bit flags with values from the Query and        |
   | flags      |    | Response DNS flags. Bit 0 is the least          |
   |            |    | significant bit. Flag values are 0 if the Query |
   |            |    | or Response is not present.                     |
   |            |    | Bit 0. Query Checking Disabled (CD) flag.       |
   |            |    | Bit 1. Query Authenticated Data (AD) flag.      |
   |            |    | Bit 2. Query reserved (Z) flag.                 |
   |            |    | Bit 3. Query Recursion Available (RA) flag.     |
   |            |    | Bit 4. Query Recursion Desired (RD) flag.       |
   |            |    | Bit 5. Query TrunCation (TC) flag.              |
   |            |    | Bit 6. Query Authoritative Answer (AA) flag.    |
   |            |    | Bit 7. Query DNSSEC answer OK (DO) flag.        |
   |            |    | Bit 8. Response Checking Disabled (CD) flag.    |
   |            |    | Bit 9. Response Authenticated Data (AD) flag.   |
   |            |    | Bit 10. Response reserved (Z) flag.             |
   |            |    | Bit 11. Response Recursion Available (RA) flag. |
   |            |    | Bit 12. Response Recursion Desired (RD) flag.   |
   |            |    | Bit 13. Response TrunCation (TC) flag.          |
   |            |    | Bit 14. Response Authoritative Answer (AA)      |
   |            |    | flag.                                           |
   |            |    |                                                 |
   | Query      | Q  | Query RCODE. If the Query contains OPT, this    |
   | RCODE      |    | value incorporates any EXTENDED_RCODE_VALUE.    |
   |            |    |                                                 |
   | Question   | QT | The index in the Class/Type table of the CLASS  |
   | Class/Type |    | and TYPE of the first Question.                 |
   |            |    |                                                 |
   | Question   | QT | The QDCOUNT in the Query, or Response if no     |
   | QDCOUNT    |    | Query present.                                  |
   |            |    |                                                 |
   | Query      | Q  | Query ANCOUNT.                                  |
   | ANCOUNT    |    |                                                 |
   |            |    |                                                 |
   | Query      | Q  | Query ARCOUNT.                                  |
   | ARCOUNT    |    |                                                 |
   |            |    |                                                 |
   | Query      | Q  | Query NSCOUNT.                                  |
   | NSCOUNT    |    |                                                 |
   |            |    |                                                 |
   | Query EDNS | QO | The Query EDNS version.                         |
   | version    |    |                                                 |
   |            |    |                                                 |
   | EDNS UDP   | QO | The Query EDNS sender's UDP payload size.       |
   | size       |    |                                                 |
   |            |    |                                                 |
   | Query OPT  | QO | The index in the NAME/RDATA table of the OPT    |
   | RDATA      |    | RDATA.                                          |
   |            |    |                                                 |
   | Response   | R  | Response RCODE. If the Response contains OPT,   |
   | RCODE      |    | this value incorporates any                     |
   |            |    | EXTENDED_RCODE_VALUE.                           |
   +------------+----+-------------------------------------------------+

7.13.  Question table

   This table holds details on individual Questions in a Question
   section. Each item in the table is a CBOR map containing a single
   Question. Each item in the map has an unsigned value and an unsigned
   key. This data is optionally collected.
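
   As a brief aside on the bit-flag fields of the Query Signature table
   (Section 7.12), the following sketch shows one way a reader might
   expand the "Transport flags" and "Q/R signature flags" values into
   named fields. The function names and dictionary keys are
   hypothetical and chosen for the example; only the bit positions are
   taken from Section 7.12.

```python
from typing import Dict

# Bit positions as listed for "Q/R signature flags" in Section 7.12.
# The names on the right are hypothetical labels for this example.
QR_SIGNATURE_BITS = {
    0: "has_query",
    1: "has_response",
    2: "has_question",
    3: "query_has_opt",
    4: "response_has_opt",
    5: "response_has_no_question",
}


def decode_qr_signature_flags(flags: int) -> Dict[str, bool]:
    """Expand the flags value into named booleans; bit 0 is the LSB."""
    return {
        name: bool((flags >> bit) & 1)
        for bit, name in QR_SIGNATURE_BITS.items()
    }


def decode_transport_flags(flags: int) -> Dict[str, str]:
    """Bit 0: 0 = UDP, 1 = TCP. Bit 1: 0 = IPv4, 1 = IPv6."""
    return {
        "transport": "TCP" if flags & 0b01 else "UDP",
        "ip": "IPv6" if flags & 0b10 else "IPv4",
    }


# A Query with a Question and an OPT RR but no Response:
# bits 0, 2 and 3 are set.
decoded = decode_qr_signature_flags(0b001101)
# UDP (bit 0 clear) over IPv6 (bit 1 set).
transport = decode_transport_flags(0b10)
```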

   +------------+------------------------------------------------------+
   | Field      | Description                                          |
   +------------+------------------------------------------------------+
   | QNAME      | The index in the NAME/RDATA table of the QNAME.      |
   |            |                                                      |
   | Class/Type | The index in the Class/Type table of the CLASS and   |
   |            | TYPE of the Question.                                |
   +------------+------------------------------------------------------+

7.14. Resource Record (RR) table

   This table holds details on individual Resource Records in RR
   sections. Each item in the table is a CBOR map containing a single
   Resource Record. This data is optionally collected.

   +------------+------------------------------------------------------+
   | Field      | Description                                          |
   +------------+------------------------------------------------------+
   | NAME       | The index in the NAME/RDATA table of the NAME.       |
   |            |                                                      |
   | Class/Type | The index in the Class/Type table of the CLASS and   |
   |            | TYPE of the RR.                                      |
   |            |                                                      |
   | TTL        | The RR Time to Live.                                 |
   |            |                                                      |
   | RDATA      | The index in the NAME/RDATA table of the RR RDATA.   |
   +------------+------------------------------------------------------+

7.15. Question list table

   This table holds a list of second and subsequent individual
   Questions in a Question section. Each item in the table is a CBOR
   unsigned. This data is optionally collected.

   +----------+--------------------------------------------------------+
   | Field    | Description                                            |
   +----------+--------------------------------------------------------+
   | Question | The index in the Question table of the individual      |
   |          | Question.                                              |
   +----------+--------------------------------------------------------+

7.16. Resource Record list table

   This table holds a list of individual Resource Records in an
   Answer, Authority or Additional section. Each item in the table is
   a CBOR unsigned. This data is optionally collected.

   +-------+-----------------------------------------------------------+
   | Field | Description                                               |
   +-------+-----------------------------------------------------------+
   | RR    | The index in the Resource Record table of the individual  |
   |       | Resource Record.                                          |
   +-------+-----------------------------------------------------------+

7.17. Query/Response data

   The block Q/R data is a CBOR array of individual Q/R items. Each
   item in the array is a CBOR map containing details on the
   individual Q/R pair.

   Note that there is no requirement that the elements of the Q/R
   array are presented in strict chronological order.

   The following abbreviations are used in the Present (P) column:

   o  Q = QUERY

   o  A = Always

   o  QT = QUESTION

   o  QO = QUERY, OPT

   o  QR = QUERY & RESPONSE

   o  R = RESPONSE

   Each item in the map has an unsigned value (with the exception of
   those listed below) and an unsigned key.

   o  Query extended information and Response extended information,
      which are of type Extended Information.

   o  Response delay, which is a signed integer. (It can be negative
      if the network stack or capture library delivers packets out of
      order.)

   +-------------+----+------------------------------------------------+
   | Field       | P  | Description                                    |
   +-------------+----+------------------------------------------------+
   | Time offset | A  | Q/R timestamp as an offset in microseconds     |
   |             |    | from the Block pre-amble Timestamp. The        |
   |             |    | timestamp is the timestamp of the Query, or    |
   |             |    | the Response if there is no Query.             |
   |             |    |                                                |
   | Client      | A  | The index in the IP address table of the       |
   | address     |    | client IP address.                             |
   |             |    |                                                |
   | Client port | A  | The client port.                               |
   |             |    |                                                |
   | Transaction | A  | DNS transaction identifier.                    |
   | ID          |    |                                                |
   |             |    |                                                |
   | Query       | A  | The index in the Query Signature table of the  |
   | signature   |    | entry holding further information on the Q/R.  |
   |             |    |                                                |
   | Client      | Q  | The IPv4 TTL or IPv6 Hop Limit from the Query  |
   | hoplimit    |    | packet.                                        |
   |             |    |                                                |
   | Response    | QR | The time difference between Query and          |
   | delay       |    | Response, in microseconds.                     |
   |             |    |                                                |
   | Question    | QT | The index in the NAME/RDATA table of the QNAME |
   | NAME        |    | for the first Question.                        |
   |             |    |                                                |
   | Response    | R  | The size of the DNS message (not the packet    |
   | size        |    | containing the message, just the DNS message)  |
   |             |    | that forms the Response.                       |
   |             |    |                                                |
   | Query       | Q  | Extended Query information. This item is only  |
   | extended    |    | present if collection of extra Query           |
   | information |    | information is configured.                     |
   |             |    |                                                |
   | Response    | R  | Extended Response information. This item is    |
   | extended    |    | only present if collection of extra Response   |
   | information |    | information is configured.                     |
   +-------------+----+------------------------------------------------+

   The collector always collects basic Q/R information. It may be
   configured to collect details on Question, Answer, Authority and
   Additional sections of the Query, the Response or both. Note that
   only the second and subsequent Questions of any Question section
   are collected (the details of the first are in the basic
   information), and that OPT Records are not collected in the
   Additional section.

   The Extended information is a CBOR map as follows. Each item in
   the map is present only if collection of the relevant details is
   configured. Each item in the map has an unsigned value and an
   unsigned key.

   +------------+------------------------------------------------------+
   | Field      | Description                                          |
   +------------+------------------------------------------------------+
   | Question   | The index in the Question list table of the entry    |
   |            | listing the second and subsequent Questions for      |
   |            | the Query or Response.                               |
   |            |                                                      |
   | Answer     | The index in the RR list table of the entry listing  |
   |            | the Answer section Resource Records for the Query or |
   |            | Response.                                            |
   |            |                                                      |
   | Authority  | The index in the RR list table of the entry listing  |
   |            | the Authority section Resource Records for the Query |
   |            | or Response.                                         |
   |            |                                                      |
   | Additional | The index in the RR list table of the entry listing  |
   |            | the Additional section Resource Records for the      |
   |            | Query or Response.                                   |
   +------------+------------------------------------------------------+

7.18. Address Event counts

   This table holds counts of various IP related events relating to
   traffic with individual client addresses.

   +----------+----------+---------------------------------------------+
   | Field    | Type     | Description                                 |
   +----------+----------+---------------------------------------------+
   | Event    | Unsigned | The type of event. The following event      |
   | type     |          | types are currently defined:                |
   |          |          | 0. TCP reset.                               |
   |          |          | 1. ICMP time exceeded.                      |
   |          |          | 2. ICMP destination unreachable.            |
   |          |          | 3. ICMPv6 time exceeded.                    |
   |          |          | 4. ICMPv6 destination unreachable.          |
   |          |          | 5. ICMPv6 packet too big.                   |
   |          |          |                                             |
   | Event    | Unsigned | A code relating to the event. Optional.     |
   | code     |          |                                             |
   |          |          |                                             |
   | Address  | Unsigned | The index in the IP address table of the    |
   | index    |          | client address.                             |
   |          |          |                                             |
   | Count    | Unsigned | The number of occurrences of this event     |
   |          |          | during the block collection period.         |
   +----------+----------+---------------------------------------------+

8. C-DNS to PCAP

   It is possible to reconstruct PCAP files from the C-DNS format.
   However, this is a lossy process, and some of the issues with
   reconstructing both the DNS payload and the full packet stream are
   outlined here.

   Firstly, the reconstruction depends on whether or not all the
   optional sections of both the query and response were captured in
   the C-DNS file. Clearly, if they were not all captured the
   reconstruction is imperfect.

   Secondly, even if all sections of the response were captured, name
   compression presents a challenge in reconstructing the DNS response
   payload byte for byte. Section 8.1 discusses this in more detail.

   Thirdly, not all transport information is captured in the C-DNS
   format. For example, the following aspects of the original packet
   stream cannot be reconstructed from the C-DNS format:

   o  IP Fragmentation

   o  TCP stream information:

      *  Multiple DNS messages may have been sent in a single TCP
         segment

      *  A DNS payload may have been split across multiple TCP
         segments

      *  Multiple DNS messages may have been sent on a single TCP
         session

   o  Malformed DNS messages and non-DNS packets

   Simple assumptions can be made in the reconstruction: fragmented
   and DNS-over-TCP messages can be reconstructed into 'single'
   packets, and a single TCP session can be constructed for each TCP
   packet.

   Additionally, if the malformed and non-DNS packets are captured
   separately into PCAPs, they can be merged with PCAPs reconstructed
   from C-DNS to produce a more complete packet stream.

8.1. Name Compression

   All the names stored in the C-DNS format are full domain names; no
   DNS style name compression is used on the individual names within
   the format. Therefore, when reconstructing a packet, name
   compression must be used in order to reproduce the on-the-wire
   representation of the packet.

   [RFC1035] name compression works by substituting trailing sections
   of a name with a reference back to the occurrence of those sections
   earlier in the packet. Not all name server software uses the same
   algorithm when compressing domain names within the responses. Some
   attempt maximum compression at the expense of runtime resources,
   others use heuristics to balance compression and speed, and others
   use different rules for what is a valid compression target.

   This means that responses to the same question from different name
   server software which match in terms of DNS payload content
   (header, counts, RRs with name compression removed) do not
   necessarily match byte for byte on the wire.

   From the C-DNS format it is not possible to ensure that the DNS
   response payload is reconstructed byte for byte. However, it can at
   least, in principle, be reconstructed to have the correct payload
   length (since the original response length is captured) if there is
   enough knowledge of the commonly implemented name compression
   algorithms. For example, a simplistic approach would be to try each
   algorithm in turn to see if it reproduces the original length,
   stopping at the first match. This would not guarantee that the
   correct algorithm has been used, as it is possible to match the
   length whilst still not matching the on-the-wire bytes, but without
   further information added to the CBOR this is the best that can be
   achieved.

   Appendix B presents an example of two differing compression
   algorithms used by well known name server software.

9. Data Collection

   This section describes a non-normative proposed algorithm for the
   processing of a captured stream of DNS queries and responses and
   matching queries/responses where possible.

   For the purposes of this discussion, it is assumed that the input
   has been pre-processed such that:

   1.  All IP fragmentation reassembly, TCP stream reassembly etc. has
       already been performed.

   2.  Each message is associated with the transport meta-data
       required to generate the Primary ID (see below).

   3.  Each message has a well-formed DNS header of 12 bytes and (if
       present) the first Question in the Question section can be
       parsed to generate the Secondary ID (see below).

       *  As noted earlier, this requirement can result in a malformed
          query being removed in the pre-processing stage, but the
          correctly formed response with RCODE of FORMERR being
          present.

   DNS messages are processed in the order they are delivered to the
   application.

   o  It should be noted that packet capture libraries do not
      necessarily provide packets in strict chronological order.

   [TODO: Discuss the corner cases resulting from this in more detail.]

9.1. Matching algorithm

   A schematic representation of the algorithm for matching Q/R pairs
   is shown in the following diagram:

   Figure showing the packet matching algorithm format (PNG) [5]

   Figure showing the packet matching algorithm format (SVG) [6]

   and further details of the algorithm are given in the following
   sections.

9.2. Message identifiers

9.2.1. Primary ID (required)

   A Primary ID can be constructed for each message which is composed
   of the following data:

   1.  Source IP Address

   2.  Destination IP Address

   3.  Source Port

   4.  Destination Port

   5.  Transport

   6.  DNS Message ID

9.2.2. Secondary ID (optional)

   If present, the first question in the Question section is used as a
   secondary ID for each message. Note that there may be well formed
   DNS queries that have a QDCOUNT of 0, and some responses may have a
   QDCOUNT of 0 (for example, RCODE=FORMERR or NOTIMP).

9.3. Algorithm Parameters

   1.  Configurable timeout

9.4. Algorithm Requirements

   The algorithm is designed to handle the following input data:

   1.  Multiple queries with the same Primary ID (but different
       Secondary ID) arriving before any responses for these queries
       are seen.

   2.  Multiple queries with the same Primary and Secondary ID
       arriving before any responses for these queries are seen.

   3.  Queries for which no later response can be found within the
       specified timeout.

   4.  Responses for which no previous query can be found within the
       specified timeout.

9.5. Algorithm Limitations

   For cases 1 and 2 listed in the above requirements, it is not
   possible to unambiguously match queries with responses. The
   solution employed in this algorithm is to match to the earliest
   query with the correct Primary and Secondary ID.

9.6. Workspace

   A FIFO structure is used to hold the Q/R items during processing.

9.7. Output

   The output is a list of Q/R data items. Both the Query and Response
   elements are optional in these items, therefore Q/R items have one
   of three types of content:

   1.  Paired Q/R messages

   2.  A query message (no response)

   3.  A response message (no query)

   The timestamp of a list item is that of the query for cases 1 and 2
   and that of the response for case 3.

9.8. Post Processing

   When ending capture, all remaining entries in the Q/R FIFO should
   be treated as timed out queries.

10. IANA Considerations

   None

11. Security Considerations

   Any control interface MUST perform authentication and encryption.

   Any data upload MUST be authenticated and encrypted.

12. Acknowledgements

   The authors wish to thank CZ.NIC, in particular Tomas Gavenciak,
   for many useful discussions on binary formats, compression and
   packet matching. Also Jan Vcelak and Wouter Wijngaards for
   discussions on name compression.

   Thanks also to Miek Gieben for mmark [7].

13. References

13.1. Normative References

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
              November 1987.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
              October 2013.

13.2. Informative References

   [ditl]     DNS-OARC, "DITL", 2016.

   [dnscap]   DNS-OARC, "DNSCAP", 2016.

   [dnstap]   dnstap.io, "dnstap", 2016.

   [dsc]      Wessels, D. and J. Lundstrom, "DSC", 2016.

   [I-D.daley-dnsxml]
              Daley, J., Morris, S., and J. Dickinson, "dnsxml - A
              standard XML representation of DNS data",
              draft-daley-dnsxml-00 (work in progress), July 2013.

   [I-D.greevenbosch-appsawg-cbor-cddl]
              Vigano, C. and H. Birkholz, "CBOR data definition
              language (CDDL): a notational convention to express CBOR
              data structures",
              draft-greevenbosch-appsawg-cbor-cddl-09 (work in
              progress), September 2016.

   [I-D.hoffman-dns-in-json]
              Hoffman, P., "Representing DNS Messages in JSON",
              draft-hoffman-dns-in-json-09 (work in progress),
              October 2016.

   [packetq]  .SE - The Internet Infrastructure Foundation, "PacketQ",
              2014.

   [pcap]     tcpdump.org, "PCAP", 2016.

   [pcapng]   Tuexen, M., Risso, F., Bongertz, J., Combs, G., and G.
              Harris, "pcap-ng", 2016.

   [rrtypes]  IANA, "RR types", 2016.

13.3. URIs

   [1] https://github.com/dns-stats/draft-dns-capture-format/blob/master/cdns_format.png

   [2] https://github.com/dns-stats/draft-dns-capture-format/blob/master/cdns_format.svg

   [3] https://github.com/dns-stats/draft-dns-capture-format/blob/master/qr_data_format.png

   [4] https://github.com/dns-stats/draft-dns-capture-format/blob/master/qr_data_format.svg

   [5] https://github.com/dns-stats/draft-dns-capture-format/blob/master/packet_matching.png

   [6] https://github.com/dns-stats/draft-dns-capture-format/blob/master/packet_matching.svg

   [7] https://github.com/miekg/mmark

   [8] https://www.nlnetlabs.nl/projects/nsd/

   [9] https://www.knot-dns.cz/

Appendix A. CDDL

   ; CDDL specification of the file format for C-DNS,
   ; which describes a collection of DNS Query/Response pairs.

   File = [
       file-type-id : tstr, ; "DNS-STAT"
       file-preamble : FilePreamble,
       file-blocks : [* Block],
   ]

   FilePreamble = {
       format-version => uint,
       ? configuration => Configuration,
       ? generator-id => tstr,
       ? host-id => tstr,
   }

   format-version = 0
   configuration = 1
   generator-id = 2
   host-id = 3

   Configuration = {
       ? query-timeout => uint,
       ? skew-timeout => uint,
       ? snaplen => uint,
       ? promisc => uint,
       ? interfaces => [* tstr],
       ? vlan-ids => [* uint],
       ? filter => tstr,
       ? query-options => uint, ; See below
       ? response-options => uint,
       ? accept-rr-types => [* tstr],
       ? ignore-rr-types => [* tstr],
   }

   ; query-options and response-options are bitmasks.
   ; A bit set adds in the specified sections.
   ;
   ; second & subsequent question sections = 1
   ; answer sections = 2
   ; authority sections = 4
   ; additional sections = 8

   query-timeout = 0
   skew-timeout = 1
   snaplen = 2
   promisc = 3
   interfaces = 4
   vlan-ids = 5
   filter = 6
   query-options = 7
   response-options = 8
   accept-rr-types = 9
   ignore-rr-types = 10

   Block = {
       preamble => BlockPreamble,
       ? statistics => BlockStatistics, ; Much of this could be derived
       tables => BlockTables,
       queries => [* QueryResponse],
       address-event-counts => [* AddressEventCount],
   }

   preamble = 0
   statistics = 1
   tables = 2
   queries = 3
   address-event-counts = 4

   BlockPreamble = {
       start-time => Timeval
   }

   start-time = 1

   Timeval = [
       seconds : uint,
       microseconds : uint,
   ]

   BlockStatistics = {
       ? total-packets => uint,
       ? total-pairs => uint,
       ? unmatched_queries => uint,
       ? unmatched_responses => uint,
       ? malformed-packets => uint,
       ? non-dns-packets => uint,
       ? out-of-order-packets => uint,
       ? missing-pairs => uint,
       ? missing-packets => uint,
       ? missing-non-dns => uint,
   }

   total-packets = 0
   total-pairs = 1
   unmatched_queries = 2
   unmatched_responses = 3
   malformed-packets = 4
   non-dns-packets = 5
   out-of-order-packets = 6
   missing-pairs = 7
   missing-packets = 8
   missing-non-dns = 9

   BlockTables = {
       ip-address => [* bstr],
       classtype => [* ClassType],
       name-rdata => [* bstr], ; Holds both Name RDATA and RDATA
       query_sig => [* QuerySignature],
       ? qlist => [* QuestionList],
       ? qrr => [* Question],
       ? rrlist => [* RRList],
       ? rr => [* RR],
   }

   ip-address = 0
   classtype = 1
   name-rdata = 2
   query_sig = 3
   qlist = 4
   qrr = 5
   rrlist = 6
   rr = 7

   QueryResponse = {
       time-useconds => int, ; Time offset from start of block
       client-address-index => uint,
       client-port => uint,
       transaction-id => uint,
       query-signature-index => uint,
       ? client-hoplimit => uint,
       ? delay-useconds => int, ; Times may be -ve at capture
       ? query-name-index => uint,
       ? response-size => uint, ; DNS size of response
       ? query-extended => QueryResponseExtended,
       ? response-extended => QueryResponseExtended,
   }

   time-useconds = 0
   client-address-index = 1
   client-port = 2
   transaction-id = 3
   query-signature-index = 4
   client-hoplimit = 5
   delay-useconds = 6
   query-name-index = 7
   response-size = 8
   query-extended = 9
   response-extended = 10

   ClassType = {
       type => uint,
       class => uint,
   }

   type = 0
   class = 1

   QuerySignature = {
       server-address-index => uint,
       server-port => uint,
       transport-flags => uint,
       qr-sig-flags => uint,
       ? query-opcode => uint,
       qr-dns-flags => uint,
       ? query-rcode => uint,
       ? query-classtype-index => uint,
       ? query-qd-count => uint,
       ? query-an-count => uint,
       ? query-ar-count => uint,
       ? query-ns-count => uint,
       ? edns-version => uint,
       ? udp-buf-size => uint,
       ? opt-rdata-index => uint,
       ? response-rcode => uint,
   }

   server-address-index = 0
   server-port = 1
   transport-flags = 2
   qr-sig-flags = 3
   query-opcode = 4
   qr-dns-flags = 5
   query-rcode = 6
   query-classtype-index = 7
   query-qd-count = 8
   query-an-count = 9
   query-ar-count = 10
   query-ns-count = 11
   edns-version = 12
   udp-buf-size = 13
   opt-rdata-index = 14
   response-rcode = 15

   QuestionList = [
       * uint, ; Index of Question
   ]

   Question = { ; Second and subsequent questions
       name-index => uint, ; Index to a name in the name-rdata table
       classtype-index => uint,
   }

   name-index = 0
   classtype-index = 1

   RRList = [
       * uint, ; Index of RR
   ]

   RR = {
       name-index => uint, ; Index to a name in the name-rdata table
       classtype-index => uint,
       ttl => uint,
       rdata-index => uint, ; Index to RDATA in the name-rdata table
   }

   ttl = 2
   rdata-index = 3

   QueryResponseExtended = {
       ? question-index => uint, ; Index of QuestionList
       ? answer-index => uint, ; Index of RRList
       ? authority-index => uint,
       ? additional-index => uint,
   }

   question-index = 0
   answer-index = 1
   authority-index = 2
   additional-index = 3

   AddressEventCount = {
       ae-type => &AddressEventType,
       ? ae-code => uint,
       ae-address-index => uint,
       ae-count => uint,
   }

   ae-type = 0
   ae-code = 1
   ae-address-index = 2
   ae-count = 3

   AddressEventType = (
       tcp-reset : 0,
       icmp-time-exceeded : 1,
       icmp-dest-unreachable : 2,
       icmpv6-time-exceeded : 3,
       icmpv6-dest-unreachable : 4,
       icmpv6-packet-too-big : 5,
   )

Appendix B. DNS Name compression example

   The basic algorithm which follows the guidance in [RFC1035] is
   simply to collect each name, and the offset in the packet at which
   it starts, during packet construction. As each name is added, it is
   offered to each of the collected names in order of collection,
   starting from the first name. If labels at the end of the name can
   be replaced with a reference back to part (or all) of the earlier
   name, and if the uncompressed part of the name is shorter than any
   compression already found, the earlier name is noted as the
   compression target for the name.

   The following tables illustrate the process. In an example packet,
   the first name is example.com.

   +---+-------------+--------------+--------------------+
   | N | Name        | Uncompressed | Compression Target |
   +---+-------------+--------------+--------------------+
   | 1 | example.com |              |                    |
   +---+-------------+--------------+--------------------+

   The next name added is bar.com. This is matched against
   example.com. The com part of this can be used as a compression
   target, with the remaining uncompressed part of the name being bar.

   +---+-------------+--------------+--------------------+
   | N | Name        | Uncompressed | Compression Target |
   +---+-------------+--------------+--------------------+
   | 1 | example.com |              |                    |
   | 2 | bar.com     | bar          | 1 + offset to com  |
   +---+-------------+--------------+--------------------+

   The third name added is www.bar.com. This is first matched against
   example.com, and as before this is recorded as a compression
   target, with the remaining uncompressed part of the name being
   www.bar. It is then matched against the second name, which again
   can be a compression target. Because the remaining uncompressed
   part of the name is www, this is an improved compression, and so it
   is adopted.
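
   The selection rule described above can be sketched in a few lines
   of Python. This is a minimal illustration under stated assumptions,
   not code taken from any name server: the function name
   choose_compression_targets and the layout of the returned tuples
   are hypothetical, names are handled as dot-separated text labels
   rather than wire-format bytes, labels are compared exactly, and
   "shorter" is measured in labels (which is equivalent to bytes here,
   because every candidate uncompressed part is a prefix of the same
   name). For the three names of this example it makes the same
   choices as the table that follows.

```python
def choose_compression_targets(names):
    """Pick a compression target for each name, in order of insertion.

    Returns one tuple per name:
      (uncompressed part,
       1-based number of the target name, or None if uncompressed,
       index of the first shared label within the target name).
    """
    results = []
    for i, name in enumerate(names):
        labels = name.split(".")
        best_uncompressed = labels   # nothing compressed yet
        best_target = None
        best_offset = 0
        # Offer the new name to each earlier name, first collected first.
        for j in range(i):
            if not best_uncompressed:
                break                # already perfectly compressed
            earlier = names[j].split(".")
            # Count the labels shared at the end of both names.
            shared = 0
            while (shared < len(labels) and shared < len(earlier)
                   and labels[-1 - shared] == earlier[-1 - shared]):
                shared += 1
            if shared == 0:
                continue             # no common suffix, not a target
            uncompressed = labels[:len(labels) - shared]
            # Adopt the earlier name only if it leaves less uncompressed.
            if len(uncompressed) < len(best_uncompressed):
                best_uncompressed = uncompressed
                best_target = j + 1
                best_offset = len(earlier) - shared
        results.append((".".join(best_uncompressed), best_target, best_offset))
    return results


for row in choose_compression_targets(["example.com", "bar.com", "www.bar.com"]):
    print(row)
```

   For bar.com the sketch reports target 1 with label offset 1 (the
   com label of example.com); for www.bar.com it reports target 2 with
   offset 0, leaving only www uncompressed. A real implementation
   would record byte offsets into the packet instead of label indexes.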

   +---+-------------+--------------+--------------------+
   | N | Name        | Uncompressed | Compression Target |
   +---+-------------+--------------+--------------------+
   | 1 | example.com |              |                    |
   | 2 | bar.com     | bar          | 1 + offset to com  |
   | 3 | www.bar.com | www          | 2                  |
   +---+-------------+--------------+--------------------+

   As an optimization, if a name is already perfectly compressed - in
   other words, the uncompressed part of the name is empty - no
   further names will be considered for compression.

B.1. NSD compression algorithm

   Using the above basic algorithm the packet lengths of responses
   generated by NSD [8] can be matched almost exactly. At the time of
   writing, a tiny number (<0.01%) of the reconstructed packets had
   incorrect lengths.

B.2. Knot Authoritative compression algorithm

   The Knot Authoritative [9] name server uses different compression
   behavior, which is the result of internal optimization designed to
   balance runtime speed with compression size gains. In brief, and
   omitting complications, Knot Authoritative will only consider the
   QNAME and names in the immediately preceding RR section in an RRSET
   as compression targets.

   The heuristics below can be implemented to mimic this behavior.
   They are not perfect, but the reconstructed packet lengths match
   nearly, though not quite, as well as with NSD. The heuristics are:

   1.  A match is only perfect if the name is completely compressed
       AND the TYPE of the section in which the name occurs matches
       the TYPE of the name used as the compression target.

   2.  If the name occurs in RDATA:

       a.  If the compression target name is in a query, then only the
           first RR in an RRSET can use that name as a compression
           target.

       b.  The compression target name MUST be in RDATA.

       c.  The name section TYPE must match the compression target
           name section TYPE.

       d.  The compression target name MUST be in the immediately
           preceding RR in the RRSET.

   Using this algorithm less than 0.1% of the reconstructed packets
   had incorrect lengths.

B.3. Observed differences

   In sample traffic collected on a root name server around 2-4% of
   responses generated by Knot had different packet lengths to those
   produced by NSD.

Appendix C. Comparison of Binary Formats

   Several binary representations were considered, in particular CBOR,
   Apache Avro and Protocol Buffers.

   Protocol Buffers and Avro both require a data schema, and validate
   data being stored against that schema.

   [TODO: Finish pros and cons of CBOR vs Avro vs Protocol buffers -
   tools, schema, adoption, etc.]

   The differences in file size were mostly minimal; see Appendix D.3.

Appendix D. Sample data on the C-DNS format

   This section presents some example figures for the output size of
   capture files when using different block sizes, data
   representations and binary formats. The data is sample data for a
   root instance.

   [TODO: This section needs more work.]

D.1. Comparison to full PCAPs

   As can be seen in more detail below, for this sample data the
   compressed C-DNS files are around 30% of the size of the full
   compressed PCAPs. Experiments also showed that compressing the
   C-DNS format required very roughly an order of magnitude less CPU
   resource than compressing full PCAPs, using one core of a 3.5GHz i7
   processor.

D.2. Block size choices

   [TODO: Discuss trade-off of file block size vs memory consumption.]

   [TODO: Add graph that demonstrates block size of 5000 is optimal
   for the sample data used.]

D.3. Blocking vs more simple output

   Some experiments were conducted producing output in a very simple
   format involving a single record per Q/R data item (akin to a .csv
   representation). The aim here was to examine whether the blocking
   mechanism (using a block size of 5000) was worth the complexity,
   particularly after compression of the output file using several
   general purpose compression tools. The original PCAP file was
   325.79M and compressed using xz to 24.3M.

   +-------------+-------------+--------+--------+-------+
   | Format      | Output size | lz4    | gzip   | xz    |
   +-------------+-------------+--------+--------+-------+
   | cbor-simple | 44.23M      | 16.06M | 11.50M | 7.51M |
   | cbor-block  | 22.44M      | 15.14M | 10.70M | 7.23M |
   +-------------+-------------+--------+--------+-------+

   It might be expected that blocking is exploiting commonality that a
   general purpose compression engine could also exploit, and the
   figures do indeed bear this out. The more powerful (and
   resource-consuming) the compression, the closer the compressed
   simple file size gets to the compressed blocked file size. With no
   compression, the blocked output is about half the size of the
   simple output, but as greater degrees of compression are applied
   the gap shrinks. However, the blocked output remains about 6-7%
   smaller than the simple output with lz4 and gzip, and about 4%
   smaller even with xz, the strongest compressor tested. This, and
   the higher gains at lower compression, might be significant,
   depending on the target environment.

   [TODO: Add data on reduction in CPU overhead of compressing blocked
   output vs simple output.]
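
   The relative savings from blocking can be recomputed directly from
   the cbor-simple and cbor-block rows of the table above. The short
   Python sketch below does only that arithmetic; the names SIZES and
   saving_percent are illustrative, and the sizes (in M) are copied
   from the table.

```python
# Sizes in M, copied from the cbor-simple and cbor-block rows above,
# as (simple, block) pairs keyed by the compressor applied.
SIZES = {
    "none": (44.23, 22.44),
    "lz4": (16.06, 15.14),
    "gzip": (11.50, 10.70),
    "xz": (7.51, 7.23),
}


def saving_percent(simple, block):
    """Percentage by which the blocked file is smaller than the simple one."""
    return 100.0 * (simple - block) / simple


for compressor, (simple, block) in SIZES.items():
    print(f"{compressor:>4}: {saving_percent(simple, block):.1f}% smaller")
```

   With these figures the saving is roughly 49% with no compression,
   6% with lz4, 7% with gzip and 4% with xz.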

   This was repeated using some other binary representations:

   +-----------------+-------------+--------+--------+-------+
   | Format          | Output size | lz4    | gzip   | xz    |
   +-----------------+-------------+--------+--------+-------+
   | json-simple     | 189.85M     | 25.59M | 16.03M | 9.74M |
   | avro-simple     | 43.31M      | 16.07M | 11.92M | 7.99M |
   | avro-block      | 17.44M      | 12.94M | 10.08M | 7.18M |
   | protobuf-simple | 46.02M      | 15.79M | 11.59M | 7.94M |
   | protobuf-block  | 22.08M      | 15.43M | 10.91M | 7.40M |
   +-----------------+-------------+--------+--------+-------+

   There is not a lot to choose between the three contenders with
   simple output. Avro produces the smallest output, CBOR the next and
   Protocol Buffers the largest, but the difference is under 10%.
   However, with blocking, while CBOR and Protocol Buffers are again
   within a few percentage points of each other (though Protocol
   Buffers now has a slight advantage), Avro produces files in the
   region of 20% smaller, and holds a diminishing advantage through
   increased compression.

Authors' Addresses

   John Dickinson
   Sinodun IT
   Magdalen Centre
   Oxford Science Park
   Oxford OX4 4GA

   Email: jad@sinodun.com

   Jim Hague
   Sinodun IT
   Magdalen Centre
   Oxford Science Park
   Oxford OX4 4GA

   Email: jim@sinodun.com

   Sara Dickinson
   Sinodun IT
   Magdalen Centre
   Oxford Science Park
   Oxford OX4 4GA

   Email: sara@sinodun.com

   Terry Manderson
   ICANN

   Email: terry.manderson@icann.org

   John Bond
   ICANN

   Email: john.bond@icann.org