idnits 2.17.1

draft-ietf-dnsop-avoid-fragmentation-04.txt:
  -(377): Line appears to be too long, but this could be caused by
     non-ascii characters in UTF-8 encoding

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to
  https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 2 instances of lines with non-ascii characters in the
     document.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does
     not match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even
     if it appears to use RFC 2119 keywords -- however, there's a paragraph
     with a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (22 February 2021) is 1159 days in the past. Is
     this intentional?

  Checking references for intended status: Best Current Practice
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative
     references to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 8499 (Obsoleted by RFC 9499)

     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information
     about the items above.

--------------------------------------------------------------------------------

Network Working Group                                        K. Fujiwara
Internet-Draft                                                      JPRS
Intended status: Best Current Practice                          P. Vixie
Expires: 26 August 2021                                         Farsight
                                                        22 February 2021


                     Fragmentation Avoidance in DNS
                draft-ietf-dnsop-avoid-fragmentation-04

Abstract

   EDNS0 enables a DNS server to send large responses using UDP and is
   widely deployed. Path MTU discovery remains widely undeployed due to
   security issues, and IP fragmentation has exposed weaknesses in
   application protocols. Currently, DNS is known to be the largest
   user of IP fragmentation. It is possible to avoid IP fragmentation
   in DNS by limiting response size where possible, and signaling the
   need to upgrade from UDP to TCP transport where necessary. This
   document proposes to avoid IP fragmentation in DNS.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 26 August 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Proposal to avoid IP fragmentation in DNS
     3.1.  Recommendations for UDP responders
     3.2.  Recommendations for UDP requestors
     3.3.  Default Maximum DNS/UDP payload size
   4.  Incremental deployment
   5.  Request to zone operators and DNS server operators
   6.  Considerations
     6.1.  Protocol compliance
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgments
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  How to retrieve path MTU value to a destination from
           applications
   Appendix B.  Minimal-responses
   Authors' Addresses

1.  Introduction

   DNS has the EDNS0 [RFC6891] mechanism. It enables a DNS server to
   send large responses using UDP. EDNS0 is now widely deployed, and DNS
   (over UDP) is said to be the biggest user of IP fragmentation.

   However, "Fragmentation Considered Poisonous" [Herzberg2013] proposed
   effective off-path DNS cache poisoning attack vectors using IP
   fragmentation. "IP fragmentation attack on DNS" [Hlavacek2013] and
   "Domain Validation++ For MitM-Resilient PKI" [Brandt2018] showed
   that off-path attackers can interfere with path MTU discovery
   [RFC1191] to induce intentionally fragmented responses from
   authoritative servers. [RFC7739] stated the security implications of
   predictable fragment identification values.

   DNSSEC is a countermeasure against cache poisoning attacks that use
   IP fragmentation. However, DNS delegation responses are not signed
   with DNSSEC, and DNSSEC does not have a mechanism to get the correct
   response if an incorrect delegation is injected. This is a
   denial-of-service vulnerability that can yield failed name
   resolutions. If cache poisoning attacks can be avoided, DNSSEC
   validation failures will be avoided.

   Section 3.2 (Message Size Guidelines) of UDP Usage Guidelines
   [RFC8085] states that an application SHOULD NOT send UDP datagrams
   that result in IP packets that exceed the Maximum Transmission Unit
   (MTU) along the path to the destination.

   A DNS message receiver cannot trust fragmented UDP datagrams,
   primarily due to the small amount of entropy provided by UDP port
   numbers and DNS message identifiers, each of which is only 16 bits
   in size, and both of which are likely to be in the first fragment of
   a packet if fragmentation occurs. By comparison, the TCP protocol
   stack controls packet size and avoids IP fragmentation under ICMP
   NEEDFRAG attacks. In TCP, fragmentation should be avoided for
   performance reasons, whereas for UDP, fragmentation should be avoided
   for resiliency and authenticity reasons.

   [RFC8900] summarized that IP fragmentation introduces fragility to
   Internet communication. The transport of DNS messages over UDP
   should take account of the observations stated in that document.

   TCP avoids fragmentation using its Maximum Segment Size (MSS)
   parameter. Each transmitted segment is header-size aware: the size
   of the IP and TCP headers is known, as are the far end's MSS
   parameter and the interface or path MTU, so the segment size can be
   chosen to keep each IP datagram below a target size. This takes
   advantage of the elasticity of TCP's packetizing process as to how
   much queued data will fit into the next segment. In contrast, DNS
   over UDP has little datagram size elasticity and lacks insight into
   IP header and option size, and so must make more conservative
   estimates about available UDP payload space.

   This document proposes to set IP_DONTFRAG / IPV6_DONTFRAG in DNS/UDP
   messages in order to avoid IP fragmentation, and describes how to
   avoid packet losses due to IP_DONTFRAG / IPV6_DONTFRAG.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   "Requestor" refers to the side that sends a request. "Responder"
   refers to an authoritative, recursive resolver or other DNS component
   that responds to questions. (Quoted from EDNS0 [RFC6891])

   "Path MTU" is the minimum link MTU of all the links in a path between
   a source node and a destination node. (Quoted from [RFC8201])

   "Path MTU discovery" is defined by [RFC1191], [RFC8201] and
   [RFC8899].

   Many of the specialized terms used in this document are defined in
   DNS Terminology [RFC8499].

3.  Proposal to avoid IP fragmentation in DNS

   The methods to avoid IP fragmentation in DNS are described below:

3.1.  Recommendations for UDP responders

   *  UDP responders SHOULD send DNS responses with IP_DONTFRAG /
      IPV6_DONTFRAG [RFC3542] options.

   *  If the UDP responder detects an immediate error indicating that
      the UDP packet exceeds the path MTU size (EMSGSIZE), the UDP
      responder MAY recreate the response so that it fits in the path
      MTU size, or send a response with the TC bit set.

   *  UDP responders MAY probe to discover the real MTU value per
      destination.

   *  UDP responders SHOULD compose UDP responses that result in IP
      packets that do not exceed the path MTU to the requestor. If
      path MTU discovery fails or is impossible, UDP responders SHOULD
      compose UDP responses that result in IP packets that do not exceed
      the default maximum DNS/UDP payload size described in Section 3.3.

   The cause and effect of the TC bit is unchanged from EDNS0
   [RFC6891].

3.2.  Recommendations for UDP requestors

   *  UDP requestors SHOULD send DNS requests with IP_DONTFRAG /
      IPV6_DONTFRAG [RFC3542] options.

   *  UDP requestors MAY probe to discover the real MTU value per
      destination. They then calculate their maximum DNS/UDP payload
      size as the reported path MTU minus IPv4/IPv6 header size (20 or
      40) minus UDP header size (8). If path MTU discovery fails or is
      impossible, they use the default maximum DNS/UDP payload size
      described in Section 3.3.

   *  UDP requestors SHOULD set the requestor's payload size to the
      calculated or the default maximum DNS/UDP payload size.

   *  UDP requestors MAY drop fragmented DNS/UDP responses without IP
      reassembly to avoid cache poisoning attacks.

   *  DNS responses may be dropped because of IP fragmentation. Upon a
      timeout, UDP requestors may retry using TCP or UDP, per local
      policy.

3.3.  Default Maximum DNS/UDP payload size

   Default maximum DNS/UDP payload size for IPv6 is XXXX. (Choose 1232,
   1400, 1452 or other good values before/at WGLC)

   Default maximum DNS/UDP payload size for IPv4 is XXXX. (Choose 1232,
   1400, 1472 or other good values before/at WGLC)

   Operators of DNS servers SHOULD measure their path MTU to well-known
   locations on the Internet, such as [a-m].root-servers.net or
   [a-m].gtld-servers.net, when setting up the servers. The smallest
   measured path MTU value is taken as the server's path MTU to the
   Internet. The server's maximum DNS/UDP payload size for IPv4 is the
   reported path MTU minus IPv4 header size (20) minus UDP header size
   (8). The server's maximum DNS/UDP payload size for IPv6 is the
   reported path MTU minus IPv6 header size (40) minus UDP header size
   (8).

   The discussion below will be moved to an appendix, as background on
   the default maximum DNS/UDP payload size, once the discussion is
   over.

   There has been much discussion of the default path MTU size and the
   maximum DNS/UDP payload size.

   *  The minimum MTU for an IPv6 interface is 1280 octets (see
      Section 5 of [RFC8200]). It can therefore be used as the default
      path MTU value for IPv6.

   *  Most of the Internet and especially the inner core has an MTU of
      at least 1500 octets. An operator of a full resolver would be
      well advised to measure their path MTU to several authority name
      servers and to a random sample of their expected stub resolver
      client networks, to find the upper boundary on IP/UDP packet size
      in the average case. This limit should not be exceeded by most
      messages received or transmitted by a full resolver, or else
      fallback to TCP will occur too often. An operator of
      authoritative servers would also be well advised to measure their
      path MTU to several full-service resolvers. The Linux tool
      "tracepath" can be used to measure the path MTU to well-known
      authority name servers such as [a-m].root-servers.net or
      [a-m].gtld-servers.net. If the reported path MTU is, for example,
      no smaller than 1460, then the maximum DNS/UDP payload would be
      1432 for IPv4 (which is 1460 - IPv4 header(20) - UDP header(8))
      and 1412 for IPv6 (which is 1460 - IPv6 header(40) - UDP
      header(8)). To allow for possible IP options and distant tunnel
      overhead, a useful default for maximum DNS/UDP payload size would
      be 1400.

   *  [RFC4035] defines that "A security-aware name server MUST support
      the EDNS0 message size extension, MUST support a message size of
      at least 1220 octets". Therefore, the smallest value of the
      maximum DNS/UDP payload size is 1220.

   *  In order to avoid IP fragmentation, [DNSFlagDay2020] proposed that
      UDP requestors set the requestor's payload size to 1232, and that
      UDP responders compose UDP responses that fit in 1232 octets. The
      size 1232 is based on an MTU of 1280, which is required by the
      IPv6 specification [RFC8200], minus 48 octets for the IPv6 and UDP
      headers.

   *  [Huston2021] analyzed the result of [DNSFlagDay2020] and reported
      that, in the interior of the Internet between recursive resolvers
      and authoritative servers, the prevailing MTU is 1500 and there is
      no measurable signal of use of smaller MTUs. Based on these
      measurements, it proposed setting the EDNS0 buffer size to 1472
      octets for IPv4 and 1452 octets for IPv6.

4.  Incremental deployment

   The proposed method supports incremental deployment.

   When a full-service resolver implements the proposed method, its stub
   resolvers (clients) and the authority server network will no longer
   observe IP fragmentation or reassembly from that server, and will
   fall back to TCP when necessary.
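
   As an illustration only, the payload-size arithmetic of Section 3.3
   and the check that triggers the fallback to TCP mentioned above
   could be sketched in Python as follows. The function and constant
   names are hypothetical and are not defined by this document. The
   header sizes assume no IPv4 options and no IPv6 extension headers,
   and the TC bit position (0x0200 in the 16-bit flags field of the DNS
   header) is taken from the standard DNS header layout.

```python
import struct

IPV4_HEADER = 20  # octets, assuming no IP options
IPV6_HEADER = 40  # octets, assuming no extension headers
UDP_HEADER = 8    # octets
TC_FLAG = 0x0200  # TC bit within the 16-bit DNS header flags field


def max_dns_udp_payload(path_mtu: int, ipv6: bool) -> int:
    """Return path MTU minus IP header size minus UDP header size."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    payload = path_mtu - ip_header - UDP_HEADER
    if payload <= 0:
        raise ValueError("path MTU too small for IP and UDP headers")
    return payload


def is_truncated(message: bytes) -> bool:
    """Return True if the DNS message header has the TC bit set."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-octet header")
    # The flags field occupies octets 2 and 3 of the DNS header.
    (flags,) = struct.unpack_from("!H", message, 2)
    return bool(flags & TC_FLAG)


if __name__ == "__main__":
    # Example path MTU values taken from the discussion in Section 3.3.
    for mtu in (1280, 1460, 1500):
        print(mtu,
              max_dns_udp_payload(mtu, ipv6=False),
              max_dns_udp_payload(mtu, ipv6=True))
```

   A requestor following this sketch would advertise the computed value
   as its requestor's payload size and retry over TCP when
   is_truncated() returns True for a UDP response.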

   When an authoritative server implements the proposed method, its
   full-service resolvers (clients) will no longer observe IP
   fragmentation or reassembly from that server, and will fall back to
   TCP when necessary.

5.  Request to zone operators and DNS server operators

   Large DNS responses are the result of zone configuration. Zone
   operators SHOULD seek configurations resulting in small responses.
   For example,

   *  Use a smaller number of name servers (13 may be too large)

   *  Use a smaller number of A/AAAA RRs for a domain name

   *  Use the 'minimal-responses' configuration: Some implementations
      have a 'minimal responses' configuration that causes DNS servers
      to make response packets smaller, containing only mandatory and
      required data (Appendix B).

   *  Use a DNSSEC algorithm with a smaller signature / public key size.
      Notably, the signature size of ECDSA or EdDSA is smaller than
      that of RSA.

6.  Considerations

6.1.  Protocol compliance

   Prior research ([Fujiwara2018] and dns-operations mailing list
   discussions) found that some authoritative servers ignore the EDNS0
   requestor's UDP payload size and return large UDP responses.

   It is also well known that there are some authoritative servers that
   do not support TCP transport.

   Such non-compliant behavior cannot become implementation or
   configuration constraints for the rest of the DNS. If failure is the
   result, then that failure must be localized to the non-compliant
   servers.

7.  IANA Considerations

   This document has no IANA actions.

8.  Security Considerations

9.  Acknowledgments

   The authors would like to specifically thank Paul Wouters, Mukund
   Sivaraman, Tony Finch and Hugo Salgado for extensive review and
   comments.

10.  References

10.1.  Normative References

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4035]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "Protocol Modifications for the DNS Security
              Extensions", RFC 4035, DOI 10.17487/RFC4035, March 2005.

   [RFC5155]  Laurie, B., Sisson, G., Arends, R., and D. Blacka, "DNS
              Security (DNSSEC) Hashed Authenticated Denial of
              Existence", RFC 5155, DOI 10.17487/RFC5155, March 2008.

   [RFC6891]  Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms
              for DNS (EDNS(0))", STD 75, RFC 6891,
              DOI 10.17487/RFC6891, April 2013.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8499]  Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS
              Terminology", BCP 219, RFC 8499, DOI 10.17487/RFC8499,
              January 2019.

   [RFC8899]  Fairhurst, G., Jones, T., Tüxen, M., Rüngeler, I., and T.
              Völker, "Packetization Layer Path MTU Discovery for
              Datagram Transports", RFC 8899, DOI 10.17487/RFC8899,
              September 2020.

   [RFC8900]  Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
              and F. Gont, "IP Fragmentation Considered Fragile",
              BCP 230, RFC 8900, DOI 10.17487/RFC8900, September 2020.

10.2.  Informative References

   [Brandt2018]
              Brandt, M., Dai, T., Klein, A., Shulman, H., and M.
              Waidner, "Domain Validation++ For MitM-Resilient PKI",
              Proceedings of the 2018 ACM SIGSAC Conference on Computer
              and Communications Security, 2018.

   [DNSFlagDay2020]
              "DNS flag day 2020", n.d.

   [Fujiwara2018]
              Fujiwara, K., "Measures against cache poisoning attacks
              using IP fragmentation in DNS", OARC 30 Workshop, 2019.

   [Herzberg2013]
              Herzberg, A. and H. Shulman, "Fragmentation Considered
              Poisonous", IEEE Conference on Communications and Network
              Security, 2013.

   [Hlavacek2013]
              Hlavacek, T., "IP fragmentation attack on DNS", RIPE 67
              Meeting, 2013.

   [Huston2021]
              Huston, G. and J. Damas, "Measuring DNS Flag Day 2020",
              OARC 34 Workshop, February 2021.

   [RFC3542]  Stevens, W., Thomas, M., Nordmark, E., and T. Jinmei,
              "Advanced Sockets Application Program Interface (API) for
              IPv6", RFC 3542, DOI 10.17487/RFC3542, May 2003.

   [RFC7739]  Gont, F., "Security Implications of Predictable Fragment
              Identification Values", RFC 7739, DOI 10.17487/RFC7739,
              February 2016.

Appendix A.  How to retrieve path MTU value to a destination from
             applications

   Socket options: "IP_MTU (since Linux 2.2) Retrieve the current known
   path MTU of the current socket. Valid only when the socket has been
   connected. Returns an integer. Only valid as a getsockopt(2)."
   (Quoted from Debian GNU Linux manual: ip(7))

   "IPV6_MTU getsockopt(): Retrieve the current known path MTU of the
   current socket. Only valid when the socket has been connected.
   Returns an integer." (Quoted from Debian GNU Linux manual: ipv6(7))

Appendix B.  Minimal-responses

   Some implementations have a 'minimal responses' configuration that
   causes a DNS server to make response packets smaller, containing only
   mandatory and required data.

   Under the minimal-responses configuration, DNS servers compose
   response messages using only RRSets corresponding to queries. In
   case of delegation, DNS servers compose response packets with the
   delegation NS RRSet in the authority section and in-domain (in-zone
   and below-zone) glue in the additional data section. In case of a
   non-existent domain name or non-existent type, the start of authority
   (SOA RR) will be placed in the authority section.

   In addition, if the zone is DNSSEC signed and a query has the DNSSEC
   OK bit, signatures are added in the answer section, or the
   corresponding DS RRSet and signatures are added in the authority
   section. Details are defined in [RFC4035] and [RFC5155].

Authors' Addresses

   Kazunori Fujiwara
   Japan Registry Services Co., Ltd.
   Chiyoda First Bldg. East 13F, 3-8-1 Nishi-Kanda, Chiyoda-ku, Tokyo
   101-0065
   Japan

   Phone: +81 3 5215 8451
   Email: fujiwara@jprs.co.jp


   Paul Vixie
   Farsight Security Inc
   177 Bovet Road, Suite 180
   San Mateo, CA, 94402
   United States of America

   Phone: +1 650 393 3994
   Email: vixie@fsi.io