DNSOP Working Group                                             G. Moura
Internet-Draft                                         SIDN Labs/TU Delft
Intended status: Informational                                W. Hardaker
Expires: December 13, 2019                                   J. Heidemann
                                      USC/Information Sciences Institute
                                                                M. Davids
                                                                SIDN Labs
                                                            June 11, 2019


        Considerations for Large Authoritative DNS Server Operators
           draft-moura-dnsop-authoritative-recommendations-04

Abstract

   This document summarizes recent research work exploring DNS configurations and offers specific, tangible considerations to operators for configuring authoritative servers.

   This document is not an Internet Standards Track specification; it is published for informational purposes.

Ed note

   This draft will be renamed to draft-moura-dnsop-large-authoritative-considerations in case it is adopted by the WG, to reflect the new title.

   Text inside square brackets ([RF:ABC]) refers to:

   o  individual comments we have received about the draft, and

   o  issues listed on our GitHub repository.

   Both types will be removed before publication.

   This draft is being hosted on GitHub, where the most recent version of the document and open issues can be found.  The authors gratefully accept pull requests.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF).  Note that other groups may also distribute working documents as Internet-Drafts.  The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.  It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 13, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the document authors.
   All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document.  Please review these documents carefully, as they describe your rights and restrictions with respect to this document.  Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Background
   3.  C1: Use equally strong IP anycast in every authoritative server (NS) for better load distribution
   4.  C2: Routing Can Matter More Than Locations
   5.  C3: Collecting Detailed Anycast Catchment Maps Ahead of Actual Deployment Can Improve Engineering Designs
   6.  C4: When under stress, employ two strategies
   7.  C5: Consider longer time-to-live values whenever possible
   8.  Security considerations
   9.  Privacy Considerations
   10. IANA considerations
   11. Acknowledgements
   12. References
       12.1.  Normative References
       12.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document summarizes recent research work exploring DNS configurations and offers specific, tangible considerations to DNS authoritative server operators (DNS operators hereafter) [RF:JAb2], [RF:MSJ1], [RF:DW2].  The considerations (C1-C5) presented in this document are backed by previous research work that drew its conclusions from wide-scale Internet measurements.  This document describes the key engineering options and points readers to the pertinent papers for details, as well as [RF:Issue15] to other research works related to each consideration presented here.

   [RF:JAb1, Issue#2, SJa-02].  These considerations are designed for operators of "large" authoritative servers.  In this context, "large" authoritative servers refers to those with a significant global user population, such as TLDs, run by either a single operator or multiple operators.  These considerations may not be appropriate for smaller domains, such as those used by an organization with users in one city or region, where goals such as uniform low latency are less strict.

   These considerations are likely to be useful in a wider context, such as for any stateless/short-duration, anycasted service.  However, because the conclusions of the underlying studies do not verify this, the wording in this document discusses DNS authoritative services only ([RF:Issue13]).

2.  Background

   The Domain Name System (DNS) has two main types of DNS servers: authoritative servers and recursive resolvers.  Figure 1 shows their relationship.
   An authoritative server (ATn in Figure 1) knows the content of a DNS zone from local knowledge, and thus can answer queries about that zone without needing to query other servers [RFC2181].  A recursive resolver (Re_n) is a program that extracts information from name servers in response to client requests [RFC1034].  A client (stub in Figure 1) refers to a stub resolver [RFC1034], which is typically located within the client software.

    +-----+   +-----+   +-----+   +-----+
    | AT1 |   | AT2 |   | AT3 |   | AT4 |
    +-----+   +-----+   +-----+   +-----+
       ^         ^         ^         ^
       |         |         |         |
       |      +-----+      |         |
       +------|Re_1 |------+         |
       |      +-----+                |
       |         ^                   |
       |         |                   |
       |      +-----+   +-----+      |
       +------|Re_2 |   |Re_3 |------+
              +-----+   +-----+
                 ^         ^
                 |         |
                 |  +------+
                 +--| stub |
                    +------+

        Figure 1: Relationship between recursive resolvers (Re_n) and
                      authoritative name servers (ATn)

   DNS queries and responses contribute to users' perceived latency and affect user experience [Sigla2014], and the DNS system has been subject to repeated Denial of Service (DoS) attacks (for example, in November 2015 [Moura16b]) in order to degrade user experience.

   To reduce latency and improve resiliency against DoS attacks, DNS uses several types of server replication.  Replication at the authoritative server level can be achieved with (i) the deployment of multiple servers for the same zone [RFC1035] (AT1-AT4 in Figure 1), (ii) the use of IP anycast [RFC1546][RFC4786][RFC7094], which allows the same IP address to be announced from multiple locations (each of them referred to as an anycast instance [RFC8499]), and (iii) the use of load balancers to support multiple servers inside a single (potentially anycasted) instance.  As a consequence, there are many possible ways an authoritative DNS provider can engineer its production authoritative server network, with multiple viable choices and no single optimal design.

   In the next sections we cover specific considerations (C1-C5) for large authoritative DNS server operators.

3.  C1: Use equally strong IP anycast in every authoritative server (NS) for better load distribution

   Authoritative DNS server operators announce their authoritative servers in the form of NS records [RFC1034].  Different authoritative servers for a given zone should return the same content; typically they stay synchronized using DNS zone transfers (AXFR [RFC5936] and IXFR [RFC1995]), so that they return the same authoritative zone data to their clients.

   DNS relies heavily upon replication to support high reliability and capacity and to reduce latency [Moura16b].  DNS has two complementary mechanisms to replicate the service.  First, the protocol itself supports nameserver replication of DNS service for a DNS zone through the use of multiple nameservers that each operate on different IP addresses, listed by a zone's NS records.  Second, each of these network addresses can run from multiple physical locations through the use of IP anycast [RFC1546][RFC4786][RFC7094], by announcing the same IP address from each instance and allowing Internet routing (BGP [RFC4271]) to associate clients with their topologically nearest anycast instance.  Outside the DNS protocol, replication can be achieved by deploying load balancers at each physical location.
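   As a rough illustration of the replication a resolver sees, the sketch below (a minimal example written for this document, not taken from any of the referenced studies) uses the Python "dnspython" library to list a zone's NS records and to time one direct query to each nameserver address; the zone name is only a hypothetical placeholder.

      # Minimal sketch: enumerate a zone's NS records and time one direct
      # UDP query to each nameserver address, roughly as a recursive
      # resolver might observe.  Assumes the "dnspython" (2.x) library;
      # the zone name below is a hypothetical placeholder.
      import time
      import dns.message
      import dns.query
      import dns.resolver

      ZONE = "example.nl."  # hypothetical zone

      def nameserver_addresses(zone):
          """Return {nameserver name: [IPv4 addresses]} from the zone's NS RRset."""
          result = {}
          for ns in dns.resolver.resolve(zone, "NS"):
              name = ns.target.to_text()
              result[name] = [a.to_text() for a in dns.resolver.resolve(name, "A")]
          return result

      def rtt_ms(zone, address):
          """Round-trip time (in ms) of one SOA query sent directly to one address."""
          query = dns.message.make_query(zone, "SOA")
          start = time.monotonic()
          dns.query.udp(query, address, timeout=2)
          return (time.monotonic() - start) * 1000.0

      for ns_name, addresses in nameserver_addresses(ZONE).items():
          for address in addresses:
              print(ns_name, address, round(rtt_ms(ZONE, address), 1), "ms")

   Per-nameserver latency observations of this kind are the sort of measurement behind this consideration: as discussed below, resolvers query every listed nameserver at least occasionally, so a consistently slower NS still affects users.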
   Nameserver replication is recommended for all zones (multiple NS records), and IP anycast is used by most large zones such as the DNS Root, most top-level domains [Moura16b], and large commercial enterprises, governments, and other organizations.

   Most DNS operators strive to reduce latency for users of their service.  However, because they control only their authoritative servers, and not the recursive resolvers communicating with those servers, it is difficult to ensure that recursives will be served by the closest authoritative server.  Server selection is up to the recursive resolver's software implementation, and different software vendors and releases employ different criteria to choose the authoritative servers with which to communicate.

   Knowing how recursives choose authoritative servers is a key step in better engineering the deployment of authoritative servers.  [Mueller17b] evaluates this with a measurement study in which they deployed seven unicast authoritative name servers in different global locations and queried these authoritative servers from more than 9k RIPE Atlas probes and their respective recursive resolvers.

   In the wild, [Mueller17b] found that recursives query all available authoritative servers, regardless of the observed latency.  However, the distribution of queries tends to be skewed towards authoritatives with lower latency: the lower the latency between a recursive resolver and an authoritative server, the more often the recursive will send queries to that authoritative.  These results were obtained by aggregating results from all vantage points and are not specific to any vendor or version.

   The hypothesis is that this behavior is a consequence of two main criteria employed by resolvers when choosing authoritatives: performance (lower latency) and diversity of authoritatives, where a resolver checks all authoritative servers to determine which is closest and to provide alternatives if one is unavailable.

   For a DNS operator, this policy means that the latency of all authoritatives (NS records [RF:SJa-01]) matters, so all must be similarly capable, since all available authoritatives will be queried by most recursive resolvers.  Since unicast cannot deliver good latency worldwide (a unicast authoritative server in Europe will always have high latency to resolvers in California, for example, given the geographical distance), [Mueller17b] recommends that DNS operators deploy equally strong IP anycast for every authoritative server (i.e., on each NS record [RF:SJa-01]), in terms of number of instances and peering, and, consequently, phase out unicast, so they can deliver good latency values to global clients.  However, [Mueller17b] notes that DNS operators should also take architectural considerations into account when planning to deploy anycast [RFC1546].

   This consideration was deployed at the ".nl" TLD zone, which originally had seven authoritative servers (a mixed unicast/anycast setup).  In early 2018, .nl moved to a setup with four anycast authoritative name servers.  This is not to say that .nl was the first: other zones have been running anycast-only authoritatives for some time (e.g., .be since 2013).  The contribution of [Mueller17b] is to show that unicast cannot deliver good latency worldwide and that anycast therefore has to be deployed to do so.
4.  C2: Routing Can Matter More Than Locations

   A common metric when choosing an anycast DNS provider or setting up an anycast service is the number of anycast instances [RFC4786], i.e., the number of global locations from which the same address is announced with BGP.  Intuitively, one could think that more instances will lead to shorter response times.

   However, this is not necessarily true.  In fact, [Schmidt17a] found that routing can matter more than the total number of locations.  They analyzed the relationship between the number of anycast instances and the performance of a service (in terms of latency, RTT) and measured the overall performance of four DNS Root servers, namely C, F, K and L, from more than 7.9k RIPE Atlas probes.

   [Schmidt17a] found that C-Root, a smaller anycast deployment consisting of only 8 instances (they refer to an anycast instance as an anycast site), provided overall performance very similar to that of the much larger deployments of K and L, with 33 and 144 instances respectively.  The median RTT for C-, K- and L-Root was between 30 and 32 ms.

   Given that Atlas has better coverage in Europe than in other regions, the authors specifically analyzed results per region and per country (Figure 5 in [Schmidt17a]), and show that the Atlas bias towards Europe does not change the conclusion that the location of anycast instances dominates latency.  [RF:Issue12]

   The consideration from [Schmidt17a] for DNS operators when engineering anycast services is to consider factors other than just the number of instances (such as local routing connectivity) when designing for performance.  They showed that 12 instances can provide reasonable latency, provided they are globally distributed and have good local interconnectivity.  However, more instances can be useful for other reasons, such as when handling DDoS attacks [Moura16b].

5.  C3: Collecting Detailed Anycast Catchment Maps Ahead of Actual Deployment Can Improve Engineering Designs

   An anycast DNS service may have several dozen or even more than one hundred instances (as L-Root does).  Anycast leverages Internet routing to distribute the incoming queries to a service's distributed anycast instances; in theory, BGP (the Internet's de facto routing protocol) forwards incoming queries to a nearby anycast instance (in terms of BGP distance).  However, queries are usually not evenly distributed across all anycast instances, as found in the case of L-Root [IcannHedge18].

   Adding new instances to an anycast service may change the load distribution across all instances, leading to suboptimal usage of the service or even stressing some instances while others remain underutilized.  This is a scenario that operators constantly face when expanding an anycast service.  Moreover, when setting up a new anycast service instance, operators cannot directly estimate the query distribution among the instances in advance of enabling the new instance.

   To estimate the query loads across instances of an expanding service or when setting up an entirely new service, operators need detailed anycast maps and catchment estimates (i.e., operators need to know which prefixes will be mapped to which anycast instance).  To do that, [Vries17b] developed a new technique enabling operators to carry out active measurements, using an open-source tool called Verfploeter (available at [VerfSrc]).
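   The core idea can be illustrated with the toy sketch below (using the Python "scapy" library; this is only meant to convey the mechanism and is not the actual tool, which is available at [VerfSrc]): probes are sent with the anycast service address as their source, so each reply is itself routed by BGP to whichever anycast instance serves the probed network, and every instance simply records which networks reach it.

      # Toy illustration of the Verfploeter idea (not the real tool; see
      # [VerfSrc]).  Requires raw-socket privileges; the addresses below
      # are documentation prefixes used as hypothetical examples.
      from scapy.all import IP, ICMP, send, sniff

      ANYCAST_ADDR = "192.0.2.1"   # hypothetical anycast service address

      def probe(targets):
          """Send one ICMP echo request per target, sourced from the
          anycast address (run from within the anycast deployment)."""
          for dst in targets:
              send(IP(src=ANYCAST_ADDR, dst=dst) / ICMP(), verbose=False)

      def collect():
          """Run at every anycast instance: log which probed networks
          answer here; BGP delivers each reply to the instance that
          serves that network, revealing its catchment."""
          def log_reply(pkt):
              print("catchment of this instance includes", pkt[IP].src)
          sniff(filter="icmp[icmptype] = icmp-echoreply and dst host " + ANYCAST_ADDR,
                prn=log_reply, store=False)

      # Example use: probe one representative address per /24 of interest.
      # probe(["198.51.100.1", "203.0.113.1"])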
   Verfploeter maps a large portion of the IPv4 address space, allowing DNS operators to predict both query distribution and client catchment before deploying new anycast instances.

   [Vries17b] shows how this technique was used to predict both the catchment and the query load distribution for the new anycast service of B-Root.  Using two anycast instances in Miami (MIA) and Los Angeles (LAX) from the operational B-Root server, they sent ICMP echo packets to an IP address in each IPv4 /24 on the Internet, using a source address within the anycast prefix.  Then, they recorded at which instance the ICMP echo replies arrived, as determined by the Internet's BGP routing.  This analysis resulted in an Internet-wide catchment map.  Weighting was then applied to the incoming traffic prefixes based on one day of B-Root traffic (2017-04-12, DITL datasets [Ditl17]).  The combination of the catchment mapping and the load per prefix produced an estimate predicting that 81.6% of the traffic would go to the LAX instance.  The actual value was 81.4% of traffic going to LAX, showing that the estimate was very close and that the Verfploeter technique is an excellent method of predicting traffic loads in advance of a new anycast instance deployment ([Vries17b] also uses the term anycast site to refer to an anycast instance).

   Besides that, Verfploeter can also be used to estimate how traffic shifts among instances when BGP manipulations are executed, such as AS Path prepending, which is frequently used by production networks during DDoS attacks.  A new catchment mapping was created for each prepending configuration: no prepending, and prepending with 1, 2 or 3 hops at each instance.  [Vries17b] then shows that this mapping can accurately estimate the load distribution for each configuration.

   An important operational takeaway from [Vries17b] is that DNS operators can make informed choices when engineering new anycast instances or when expanding existing ones by carrying out active measurements using Verfploeter in advance of operationally enabling the full anycast service.  Operators can spot sub-optimal routing situations early, with a fine granularity, and with significantly better coverage than when using traditional measurement platforms such as RIPE Atlas.

   To date, Verfploeter has been deployed on B-Root [Vries17b], on an operational testbed (Anycast testbed) [AnyTest], and on a large unnamed operator.

   The consideration is therefore that deploying a small Verfploeter-enabled test platform at a potential anycast instance in advance may reveal the realizable benefits of using that location as an anycast instance, potentially saving the significant financial and labor costs of deploying hardware to a new instance that turns out to be less effective than had been hoped.
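   The load-estimation step described above amounts to a simple weighted aggregation, sketched below with made-up numbers (the prefixes, instance names, and query counts are hypothetical, not the B-Root data): each probed /24 is assigned to the instance that received its reply, and the per-prefix query volumes from a traffic sample are then summed per instance.

      # Hypothetical illustration of the estimate in [Vries17b]: combine a
      # catchment map (prefix -> anycast instance, from the ping survey)
      # with per-prefix query counts (e.g., from a DITL-like traffic
      # sample) to predict the query share of each instance.  All values
      # below are made up.
      from collections import defaultdict

      catchment = {            # prefix -> instance that received its reply
          "198.51.100.0/24": "LAX",
          "203.0.113.0/24": "LAX",
          "192.0.2.0/24": "MIA",
      }
      queries_per_prefix = {   # prefix -> queries seen in the traffic sample
          "198.51.100.0/24": 53000,
          "203.0.113.0/24": 27000,
          "192.0.2.0/24": 20000,
      }

      predicted_load = defaultdict(int)
      for prefix, count in queries_per_prefix.items():
          predicted_load[catchment.get(prefix, "unmapped")] += count

      total = sum(predicted_load.values())
      for instance, count in sorted(predicted_load.items()):
          print(instance, round(100.0 * count / total, 1), "% of predicted load")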
6.  C4: When under stress, employ two strategies

   DDoS attacks are becoming bigger, cheaper, and more frequent [Moura16b].  The most powerful recorded DDoS attack against DNS servers to date reached 1.2 Tbps, by using IoT devices [Perlroth16].  Such attacks call for an answer to the following question: how should a DNS operator engineer its anycast authoritative DNS service to react to the stress of a DDoS attack?  This question is investigated in [Moura16b], in which empirical observations are grounded in a theoretical evaluation of the options.

   An authoritative DNS server deployed using anycast will have many server instances distributed over many networks.  Ultimately, the relationship between the DNS provider's network and a client's ISP will determine which anycast instance will answer queries for a given client, given that BGP is the protocol that maps clients to specific anycast instances by using routing information [RF:KDar02].  As a consequence, when an anycast authoritative server is under attack, the load that each anycast instance receives is likely to be unevenly distributed (a function of the source of the attacks); thus, some instances may be more overloaded than others, which is what was observed in the analysis of the Root DNS events of November 2015 [Moura16b].  Given that different instances may have different capacity (bandwidth, CPU, etc.), making a decision about how to react to stress becomes even more difficult.

   In practice, an anycast instance under stress, overloaded with incoming traffic, has two options:

   o  It can withdraw or prepend its route to some or to all of its neighbors, ([RF:Issue3]) perform other traffic-shifting tricks (such as reducing the propagation of its announcements using BGP communities [RFC1997], which shrinks portions of its catchment), or use FlowSpec [RFC5575] or other upstream communication mechanisms to deploy upstream filtering.  The goal of these techniques is to perform some combination of shifting both legitimate and attack traffic to other anycast instances (with hopefully greater capacity) or blocking the traffic entirely.

   o  Alternatively, it can become a degraded absorber, continuing to operate, but with overloaded ingress routers, dropping some incoming legitimate requests due to queue overflow.  However, continued operation will also absorb traffic from attackers in its catchment, protecting the other anycast instances.

   [Moura16b] saw both of these behaviors in practice in the Root DNS events, observed through instance reachability and round-trip times (RTTs).  These options represent different uses of an anycast deployment.  The withdrawal strategy causes anycast to respond as a waterbed, with stress displacing queries from one instance to others.  The absorption strategy behaves as a conventional mattress, compressing under load, with some queries getting delayed or dropped.

   Although described as strategies and policies, these outcomes are the result of several factors: the combination of operator and host ISP routing policies, routing implementations withdrawing under load, the nature of the attack, and the locations of the instances and the attackers.  Some policies are explicit, such as the choice of local-only anycast instances, or operators removing an instance for maintenance or modifying routing to manage load.  However, under stress, the choices of withdrawal and absorption can also be results that emerge from a mix of explicit choices and implementation details, such as BGP timeout values.

   [Moura16b] speculates that more careful, explicit, and automated management of policies may provide stronger defenses to overload, an area currently under study.  For DNS operators, this means that, besides traditional filtering, two other options are available (withdraw/prepend/communities, or isolate instances), and the best choice depends on the specifics of the attack.
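   As a back-of-the-envelope illustration of this trade-off, the toy model below (with hypothetical capacities and attack volumes; it is not a result from [Moura16b]) compares the fraction of legitimate queries answered when a stressed instance absorbs the attack versus when it withdraws and its entire catchment, attack included, shifts to the remaining instances.

      # Toy model (hypothetical numbers, not data from [Moura16b]):
      # compare the "degraded absorber" and "withdraw" strategies for one
      # attacked anycast instance.  Each instance answers at most its
      # capacity; legitimate queries are assumed to be dropped in
      # proportion to the overload.

      def answered_legit(legit, attack, capacity):
          """Legitimate queries/s answered at one instance under random drop."""
          offered = legit + attack
          if offered <= capacity:
              return legit
          return legit * capacity / offered

      # queries/s per instance: [legitimate, attack, capacity] -- all made up
      instances = {"A": [10000, 90000, 20000],  # A's catchment is attacked
                   "B": [10000, 0, 60000],
                   "C": [10000, 0, 60000]}

      # Strategy 1: instance A stays up and absorbs the attack.
      absorb = sum(answered_legit(l, a, c) for l, a, c in instances.values())

      # Strategy 2: instance A withdraws; its whole catchment (legitimate
      # and attack traffic alike) is assumed to split evenly over B and C.
      shifted_legit = instances["A"][0] / 2
      shifted_attack = instances["A"][1] / 2
      withdraw = sum(answered_legit(l + shifted_legit, a + shifted_attack, c)
                     for name, (l, a, c) in instances.items() if name != "A")

      total_legit = sum(l for l, a, c in instances.values())
      print("absorb:  ", round(100 * absorb / total_legit), "% of legitimate queries answered")
      print("withdraw:", round(100 * withdraw / total_legit), "% of legitimate queries answered")

      # With these numbers, withdrawing helps; with little spare capacity
      # at B and C, absorbing would be the better choice.  The point is
      # that the best strategy depends on the specifics of the attack.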
   Note that this consideration refers to the operation of one anycast service, i.e., one anycast NS record.  However, DNS zones with multiple anycast NS services may expect load to spill from one anycast server to another, as resolvers switch from authoritative to authoritative when attempting to resolve a name [Mueller17b].

7.  C5: Consider longer time-to-live values whenever possible

   [RF:Issue7]: this section has been completely rewritten.

   Caching is the cornerstone of good DNS performance and reliability.  A 15 ms response to a new DNS query is fast, but a 1 ms cache hit to a repeat query is far faster.  Caching also protects users from short outages and can mute even significant DDoS attacks [Moura18b].

   DNS record TTLs (time-to-live values) directly control cache durations [RFC1034][RFC1035] and, therefore, affect latency, resilience, and the role of DNS in CDN server selection.  Some early work modeled caches as a function of their TTLs [Jung03a], and recent work examined their interaction with DNS [Moura18b], but no research provides considerations about what TTL values are good.  With this goal, Moura et al. [Moura19a] carried out a measurement study investigating TTL choices and their impact on user experience.

   First, they identified several reasons why operators and zone owners may want to choose longer or shorter TTLs:

   o  Longer caching results in faster responses, given that cache hits are faster than cache misses in resolvers.  [Moura19a] shows that changing the TTL for the .uy TLD from 5 minutes to 1 day significantly reduced the RTT observed from 15k Atlas vantage points: the median was reduced from 28.7 ms to 8 ms, while the 75th percentile decreased from 183 ms to 21 ms.

   o  Longer caching results in lower DNS traffic: authoritative servers will experience less traffic if TTLs are extended, given that repeated queries will be answered by resolver caches.

   o  Longer caching results in lower cost if DNS is metered: some DNS-as-a-service providers' charges are metered, with a per-query cost (often added to a fixed monthly cost).

   o  Longer caching is more robust to DDoS attacks on DNS: DDoS attacks on a DNS service provider harmed several prominent websites [Perlroth16].  Recent work has shown that DNS caching can greatly reduce the effects of DDoS on DNS, provided that caches last longer than the attack [Moura18b].

   o  Shorter caching supports operational changes: an easy way to transition from an old server to a new one is to change the DNS records.  Since there is no method to remove cached DNS records, the TTL duration represents a necessary transition delay to fully shift to a new server, so low TTLs allow a more rapid transition.  However, when deployments are planned in advance (that is, longer than the TTL), TTLs can be lowered just before a major operational change and raised again once it is accomplished.

   o  Shorter caching can help with DNS-based load balancing: some DDoS-scrubbing services use DNS to redirect traffic during an attack.  Since DDoS attacks arrive unannounced, DNS-based traffic redirection requires the TTL to be kept quite low at all times to be ready to respond to a potential attack.

   As such, the choice of TTL depends in part on external factors, so no single recommendation is appropriate for all.  Organizations must weigh these trade-offs to find a good balance.
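   One way to reason about the latency side of these trade-offs is a simple cache model in the spirit of [Jung03a], sketched below (the query rate, RTT values, and cache-hit latency are hypothetical, and real resolver behavior such as prefetching or TTL capping is ignored): with independent query arrivals at rate lambda at a single resolver, a record with TTL T is answered from cache roughly lambda*T out of every lambda*T + 1 queries, since each cache miss is followed by about lambda*T hits before the entry expires again.

      # Rough model (in the spirit of [Jung03a]; all numbers hypothetical):
      # expected response latency seen by clients of one resolver, as a
      # function of the record's TTL.  hit_ratio = lam*ttl / (1 + lam*ttl)
      # for query arrivals at rate lam (queries/s).

      def expected_latency_ms(ttl_s, lam_qps, rtt_auth_ms, rtt_cache_ms=1.0):
          hit_ratio = (lam_qps * ttl_s) / (1.0 + lam_qps * ttl_s)
          return hit_ratio * rtt_cache_ms + (1.0 - hit_ratio) * rtt_auth_ms

      # Example: one query per minute on average, 100 ms to a distant
      # authoritative server, about 1 ms for a local cache hit.
      for ttl in (300, 3600, 86400):          # 5 minutes, 1 hour, 1 day
          print(ttl, "s TTL ->",
                round(expected_latency_ms(ttl, 1 / 60.0, 100.0), 1), "ms expected")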
   Still, some guidelines can be used when choosing TTLs:

   o  For general users, [Moura19a] recommends longer TTLs, of at least one hour, and ideally 4, 8, 12, or 24 hours.  Assuming planned maintenance can be scheduled at least a day in advance, long TTLs have little cost.

   o  For TLD operators: TLD operators that allow public registration of domains (such as most ccTLDs and .com, .net, .org) host, in their zone files, the NS records (and glue records if in-bailiwick) of their respective domains.  [Moura19a] shows that most resolvers will use the TTL values provided by the child delegations, but some will choose the TTL provided by the parents.  As such, similarly to general users, [Moura19a] recommends longer TTLs for the NS records of their delegations (at least one hour, preferably more).

   o  Users of DNS-based load balancing or DDoS-prevention services may require short TTLs: TTLs may be as short as 5 minutes, although 15 minutes may provide sufficient agility for many operators.  Shorter TTLs here help agility; they are an exception to the consideration for longer TTLs.

   o  A/AAAA and NS records: the TTLs of A/AAAA records should be shorter than or equal to the TTL of the NS records for in-bailiwick authoritative DNS servers, given that [Moura19a] found that, in such scenarios, once an NS record expires, its associated A/AAAA records will also be updated (glue is sent by the parents).  For out-of-bailiwick servers, A and NS records are usually cached independently, so different TTLs, if desired, will be effective.  In either case, short TTLs on A and AAAA records may be desired if DDoS-mitigation services are an option.

8.  Security considerations

   This document suggests the use of [I-D.ietf-dnsop-serve-stale].  It should be noted that the use of such methods may affect the data integrity of DNS information.  This document describes methods for mitigating the effects of denial-of-service threats against a DNS service.

   As this document discusses research, there are no further security considerations other than the ones mentioned in the normative references.

9.  Privacy Considerations

   This document does not add any practical new privacy issues.

10.  IANA considerations

   This document has no IANA actions.

11.  Acknowledgements

   This document is a summary of the main considerations of six research works referred to in this document.  As such, it was only possible thanks to the hard work of the authors of those research works.

   The authors of this document are also co-authors of these research works.  However, not all thirteen authors of these research papers are also authors of this document.  We would like to thank those not included in this document's author list for their work: Ricardo de O. Schmidt, Wouter B. de Vries, Moritz Mueller, Lan Wei, Cristian Hesselman, Jan Harm Kuipers, Pieter-Tjerk de Boer and Aiko Pras.

   We would also like to thank the various reviewers of different versions of this draft: Duane Wessels, Joe Abley, Toema Gavrichenkov, John Levine, Michael StJohns, Kristof Tuyteleers, Stefan Ubbink, Klaus Darilion and Samir Jafferali, as well as those who provided comments at the IETF DNSOP session (IETF 104).
   Besides those, we would like to thank those who have been individually thanked in each research work, RIPE NCC and DNS OARC for their tools and datasets used in this research, as well as the funding agencies sponsoring the individual research works.

12.  References

12.1.  Normative References

   [I-D.ietf-dnsop-serve-stale]
              Lawrence, D., Kumari, W., and P. Sood, "Serving Stale Data to Improve DNS Resiliency", draft-ietf-dnsop-serve-stale-05 (work in progress), April 2019.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, November 1987.

   [RFC1546]  Partridge, C., Mendez, T., and W. Milliken, "Host Anycasting Service", RFC 1546, DOI 10.17487/RFC1546, November 1993.

   [RFC1995]  Ohta, M., "Incremental Zone Transfer in DNS", RFC 1995, DOI 10.17487/RFC1995, August 1996.

   [RFC1997]  Chandra, R., Traina, P., and T. Li, "BGP Communities Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996.

   [RFC2181]  Elz, R. and R. Bush, "Clarifications to the DNS Specification", RFC 2181, DOI 10.17487/RFC2181, July 1997.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, January 2006.

   [RFC4786]  Abley, J. and K. Lindqvist, "Operation of Anycast Services", BCP 126, RFC 4786, DOI 10.17487/RFC4786, December 2006.

   [RFC5575]  Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J., and D. McPherson, "Dissemination of Flow Specification Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009.

   [RFC5936]  Lewis, E. and A. Hoenes, Ed., "DNS Zone Transfer Protocol (AXFR)", RFC 5936, DOI 10.17487/RFC5936, June 2010.

   [RFC7094]  McPherson, D., Oran, D., Thaler, D., and E. Osterweil, "Architectural Considerations of IP Anycast", RFC 7094, DOI 10.17487/RFC7094, January 2014.

   [RFC8499]  Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS Terminology", BCP 219, RFC 8499, DOI 10.17487/RFC8499, January 2019.

12.2.  Informative References

   [AnyTest]  Schmidt, R., "Anycast Testbed", December 2018.

   [Ditl17]   OARC, D., "2017 DITL data", October 2018.

   [IcannHedge18]
              ICANN, "DNS-STATS - Hedgehog 2.4.1", October 2018.

   [Jung03a]  Jung, J., Berger, A., and H. Balakrishnan, "Modeling TTL-based Internet caches", ACM 2003 IEEE INFOCOM, DOI 10.1109/INFCOM.2003.1208693, July 2003.

   [Moura16b] Moura, G., Schmidt, R., Heidemann, J., Mueller, M., Wei, L., and C. Hesselman, "Anycast vs. DDoS: Evaluating the November 2015 Root DNS Events", ACM 2016 Internet Measurement Conference, DOI 10.1145/2987443.2987446, October 2016.

   [Moura18b] Moura, G., Heidemann, J., Mueller, M., Schmidt, R., and M. Davids, "When the Dike Breaks: Dissecting DNS Defenses During DDoS", ACM 2018 Internet Measurement Conference, DOI 10.1145/3278532.3278534, October 2018.

   [Moura19a] Moura, G., Heidemann, J., Schmidt, R., and W. Hardaker, "TBA", June 2019.

   [Mueller17b]
              Mueller, M., Moura, G., Schmidt, R., and J. Heidemann, "Recursives in the Wild: Engineering Authoritative DNS Servers", ACM 2017 Internet Measurement Conference, DOI 10.1145/3131365.3131366, October 2017.
   [Perlroth16]
              Perlroth, N., "Hackers Used New Weapons to Disrupt Major Websites Across U.S.", October 2016.

   [Schmidt17a]
              Schmidt, R., Heidemann, J., and J. Kuipers, "Anycast Latency: How Many Sites Are Enough?", PAM Passive and Active Measurement Conference, March 2017.

   [Sigla2014]
              Singla, A., Chandrasekaran, B., Godfrey, P., and B. Maggs, "The Internet at the speed of light", ACM Workshop on Hot Topics in Networks, October 2014.

   [VerfSrc]  Vries, W., "Verfploeter source code", November 2018.

   [Vries17b] Vries, W., Schmidt, R., Hardaker, W., Heidemann, J., Boer, P., and A. Pras, "Verfploeter: Broad and Load-Aware Anycast Mapping", ACM 2017 Internet Measurement Conference, DOI 10.1145/3131365.3131371, October 2017.

Authors' Addresses

   Giovane C. M. Moura
   SIDN Labs/TU Delft
   Meander 501
   Arnhem  6825 MD
   The Netherlands

   Phone: +31 26 352 5500
   Email: giovane.moura@sidn.nl


   Wes Hardaker
   USC/Information Sciences Institute
   PO Box 382
   Davis  95617-0382
   U.S.A.

   Phone: +1 (530) 404-0099
   Email: ietf@hardakers.net


   John Heidemann
   USC/Information Sciences Institute
   4676 Admiralty Way
   Marina Del Rey  90292-6695
   U.S.A.

   Phone: +1 (310) 448-8708
   Email: johnh@isi.edu


   Marco Davids
   SIDN Labs
   Meander 501
   Arnhem  6825 MD
   The Netherlands

   Phone: +31 26 352 5500
   Email: marco.davids@sidn.nl