Network Working Group                                        K. Fujiwara
Internet-Draft                                                      JPRS
Intended status: Best Current Practice                          P. Vixie
Expires: October 15, 2020                                       Farsight
                                                          April 13, 2020

                     Fragmentation Avoidance in DNS
              draft-fujiwara-dnsop-avoid-fragmentation-03

Abstract

   Path MTU discovery remains widely undeployed due to security issues,
   and IP fragmentation has exposed weaknesses in application protocols.
   Currently, DNS is known to be the largest user of IP fragmentation.
   It is possible to avoid IP fragmentation in DNS by limiting response
   size where possible, and signaling the need to upgrade from UDP to
   TCP transport where necessary.  This document proposes to avoid IP
   fragmentation in DNS.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 15, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Proposal to avoid IP fragmentation in DNS
   4.  Maximum DNS/UDP payload size
   5.  Incremental deployment
   6.  Request to zone operator and DNS server operator
   7.  Considerations
     7.1.  Protocol compliance
     7.2.  DNS packet size
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  How to retrieve path MTU value to a destination
   Authors' Addresses

1.  Introduction

   DNS has the EDNS0 [RFC6891] mechanism, which enables a DNS server to
   send large responses over UDP.  EDNS0 is now widely deployed, and
   DNS (over UDP) is said to be the biggest user of IP fragmentation.

   However, "Fragmentation Considered Poisonous" [Herzberg2013]
   described effective off-path DNS cache poisoning attack vectors using
   IP fragmentation.  "IP fragmentation attack on DNS" [Hlavacek2013]
   and "Domain Validation++ For MitM-Resilient PKI" [Brandt2018] showed
   that off-path attackers can interfere with path MTU discovery
   [RFC1191] to induce intentionally fragmented responses from
   authoritative servers.  [RFC7739] describes the security
   implications of predictable fragment identification values.

   Moreover, Section 3.2 (Message Size Guidelines) of UDP Usage
   Guidelines [RFC8085] specifies that an application SHOULD NOT send
   UDP datagrams that result in IP packets that exceed the Maximum
   Transmission Unit (MTU) along the path to the destination.

   As a result, we cannot trust fragmented UDP datagrams, primarily due
   to the small amount of entropy provided by UDP port numbers and DNS
   message identifiers, each of which is only 16 bits in size.  By
   comparison, TCP is considered resistant against IP fragmentation
   attacks because TCP has a 32-bit sequence number and 32-bit
   acknowledgement number in each segment.  In TCP, fragmentation should
   be avoided for performance reasons, whereas for UDP, fragmentation
   should be avoided for resiliency and authenticity reasons.

   [I-D.ietf-intarea-frag-fragile] summarized that IP fragmentation
   introduces fragility to Internet communication.  The transport of DNS
   messages over UDP should take account of the observations stated in
   that document.

   This document proposes to avoid IP fragmentation in DNS/UDP.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   "Requestor" refers to the side that sends a request.  "Responder"
   refers to an authoritative, recursive resolver or other DNS component
   that responds to questions.
   (Quoted from EDNS0 [RFC6891])

   "path MTU" is the minimum link MTU of all the links in a path between
   a source node and a destination node.  (Quoted from [RFC8201])

   Many of the specialized terms used in this document are defined in
   DNS Terminology [RFC8499].

3.  Proposal to avoid IP fragmentation in DNS

   TCP avoids fragmentation using its Maximum Segment Size (MSS)
   parameter.  Each transmitted segment is header-size aware: the size
   of the IP and TCP headers is known, as well as the far end's MSS
   parameter and the interface or path MTU, so the segment size can be
   chosen to keep each IP datagram below a target size.  This takes
   advantage of the elasticity of TCP's packetizing process as to how
   much queued data will fit into the next segment.  In contrast, DNS
   has no message size elasticity and lacks insight into IP header and
   option size, and so must make more conservative estimates about
   available UDP payload space.

   The minimum MTU for an IPv4 interface is 68 octets, and all receivers
   must be able to receive and reassemble datagrams at least 576 octets
   in size (see Section 2.1, NOTE 1 of [I-D.ietf-intarea-frag-fragile]).
   The minimum MTU for an IPv6 interface is 1280 octets (see Section 5
   of [RFC8200]).  These are theoretical limits and no modern networks
   implement them.  In practice, the smallest MTU witnessed in the
   operational DNS community is 1500 octets, the Ethernet maximum
   payload size.  While many non-Ethernet networks exist such as Packet
   over SONET (PoS), Fiber Distributed Data Interface (FDDI), and
   Ethernet Jumbo Frame, there is no reliable way of discovering such
   links in an IP transmission path.
   Absent some kind of path MTU discovery result
   or a static configuration by the server or system operator, a
   conservative estimate must be chosen, even if it is less efficient
   than the path MTU would have been had it been measurable.

   The methods to avoid IP fragmentation in DNS are described below:

   o  UDP requestors and responders SHOULD send DNS responses with
      IP_DONTFRAG / IPV6_DONTFRAG [RFC3542] options, which will yield
      either a silent timeout, or a network (ICMP) error, if the path
      MTU is exceeded.  Upon a timeout, UDP requestors may retry using
      TCP or UDP, per local policy.

   o  The estimated maximum DNS/UDP payload size SHOULD be the actual or
      estimated path MTU minus the estimated header space.  When actual
      path MTU information is not available, use the default maximum
      DNS/UDP payload size described in the following section.

   o  The maximum buffer size offered by an EDNS0 requestor SHOULD be no
      larger than the estimated maximum DNS/UDP payload size.  If the
      response cannot be reasonably expected to fit into a buffer of
      that size, the requestor should use TCP instead of UDP.

   o  Responders SHOULD compose UDP responses that result in IP packets
      that do not exceed the path MTU to the requestor.  Thus, if the
      requestor offers a buffer size larger than the responder's
      estimated maximum DNS/UDP payload size, then the responder will
      behave as though the requestor had specified a buffer size equal
      to the responder's estimated maximum DNS/UDP payload size.

   o  Fragmented DNS/UDP messages may be dropped without IP reassembly.
      An ICMP error should be sent in this case, with rate limiting to
      prevent this logic from becoming a DDoS amplification vector.  If
      rate limiting is not possible, then no ICMP error should be sent.
      (This is a countermeasure against DNS spoofing attacks using IP
      fragmentation.)

   The cause and effect of the TC bit is unchanged from EDNS0 [RFC6891].

4.  Maximum DNS/UDP payload size

   o  Most of the Internet and especially the inner core has an MTU of
      at least 1500 octets.  An operator of a full resolver would be
      well advised to measure their path MTU to several authority name
      servers and to a random sample of their expected stub resolver
      client networks, to find the upper boundary on IP/UDP packet size
      in the average case.  This limit should not be exceeded by most
      answers received or transmitted by a full resolver, or else
      fallback to TCP will occur too often.  An operator of
      authoritative servers would also be well advised to measure their
      path MTU to several full-service resolvers.  The Linux tool
      "tracepath" can be used to measure the path MTU to well known
      authority name servers such as [a-m].root-servers.net or
      [a-m].gtld-servers.net.  If the reported path MTU is, for example,
      no smaller than 1460, then the maximum DNS/UDP payload would be
      1432 for IPv4 (which is 1460 - IPv4 header(20) - UDP header(8))
      and 1412 for IPv6 (which is 1460 - IPv6 header(40) - UDP
      header(8)).  To allow for possible IP options and faraway tunnel
      overhead, a useful default for maximum DNS/UDP payload size would
      be 1400.

   o  [RFC4035] defines that "A security-aware name server MUST support
      the EDNS0 message size extension, MUST support a message size of
      at least 1220 octets".  Hence, the smallest value for the maximum
      DNS/UDP payload size is 1220.

   o  DNS flag day 2020 proposed 1232 as an EDNS buffer size.
      [DNSFlagDay2020]

5.  Incremental deployment

   The proposed method supports incremental deployment.

   When a full-service resolver implements the proposed method, its stub
   resolvers (clients) and the authority server network will no longer
   observe IP fragmentation or reassembly from that server, and will
   fall back to TCP when necessary.
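
   The fallback to TCP mentioned above follows from the size rules in
   Sections 3 and 4.  The following Python fragment is a minimal,
   non-normative sketch of those rules.  The function and constant
   names are hypothetical and not part of this document, and the
   header sizes assume no IP options or IPv6 extension headers.

```python
IPV4_HEADER = 20            # octets, assuming no IP options
IPV6_HEADER = 40            # octets, assuming no extension headers
UDP_HEADER = 8              # octets
DEFAULT_MAX_PAYLOAD = 1400  # default suggested in Section 4


def max_udp_payload(path_mtu=None, ipv6=False):
    """Estimated maximum DNS/UDP payload: path MTU minus header space.

    Falls back to the Section 4 default when no path MTU is known.
    """
    if path_mtu is None:
        return DEFAULT_MAX_PAYLOAD
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return path_mtu - ip_header - UDP_HEADER


def responder_limit(offered_bufsize, responder_estimate):
    """Size the responder honours: never more than its own estimate."""
    return min(offered_bufsize, responder_estimate)


def needs_tcp(response_size, offered_bufsize, responder_estimate):
    """True when the response does not fit in UDP under the limit above.

    In that case the usual EDNS0 truncation (TC bit) applies and the
    requestor is expected to retry over TCP.
    """
    return response_size > responder_limit(offered_bufsize,
                                           responder_estimate)
```

   With a measured path MTU of 1460, max_udp_payload(1460) gives 1432
   and max_udp_payload(1460, ipv6=True) gives 1412, matching the
   figures in Section 4.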

   When an authoritative server implements the proposed method, its
   full-service resolvers (clients) will no longer observe IP
   fragmentation or reassembly from that server, and will fall back to
   TCP when necessary.

6.  Request to zone operator and DNS server operator

   Large DNS responses are the result of zone configuration.  Zone
   operators SHOULD seek configurations resulting in small responses.
   For example,

   o  Use a smaller number of name servers (13 may be too large)

   o  Use a smaller number of A/AAAA RRs for a domain name

   o  Use a DNSSEC algorithm with smaller signatures and public keys.
      Notably, the signature size of ECDSA or EdDSA is smaller than RSA.

   o  Use 'minimal-responses' configuration: Some implementations have
      a 'minimal-responses' configuration that makes DNS servers emit
      smaller response packets containing only mandatory and required
      data.

7.  Considerations

7.1.  Protocol compliance

   Prior research ([Fujiwara2018] and dns-operations mailing list
   discussions) shows that some authoritative servers ignore the EDNS0
   requestor's UDP payload size and return large UDP responses.

   It is also well known that there are some authoritative servers that
   do not support TCP transport.

   Such noncompliant behaviour cannot become implementation or
   configuration constraints for the rest of the DNS.  If failure is the
   result, then that failure must be localized to the noncompliant
   servers.

7.2.  DNS packet size

   Many stub resolvers do not set the DNSSEC OK bit.  In this case,
   responses from full-service resolvers may be small.

   With 'minimal-responses' configuration, DNS servers can be forced to
   emit small responses.

   Server   |DNSSEC| Answer   | Response data         |Response
   type     |  OK  | type     | Answer/Authority/Add. |size
   ---------+------+----------+-----------------------+------------------
   Resolver | No   | Exist    | RRSet//               |RRSet
   Resolver | No   | Not exist| /SOA/                 |SOA
   Resolver | Yes  | Exist    | RRSet+RRSIG//         |RRSet+RRSIG
   Resolver | Yes  | Not exist| /SOA+NSEC+RRSIG/      |SOA+NSEC*2+RRSIG*3
   Auth.    | No   | Referral | /NS/glue              |NS+glue
   Auth.    | No   | Exist    | RRSet//               |RRSet
   Auth.    | No   | Not exist| /SOA/                 |SOA
   Auth.    | Yes  | Referral | /DS+RRSIG+NS/glue     |NS+glue+DS+RRSIG
   Auth.    | Yes  | Referral | /NSEC+RRSIG+NS/glue   |NS+glue+NSEC+RRSIG
   Auth.    | Yes  | Exist    | RRSet+RRSIG//         |RRSet+RRSIG
   Auth.    | Yes  | Not exist| /SOA+NSEC*2+RRSIG/    |SOA+NSEC*2+RRSIG*3

   Non-existent answers with DNSSEC are the largest.

   Without 'minimal-responses' configuration, DNS servers may add an
   unnecessary NS RRset in the authority section and the name servers'
   A/AAAA RRsets in the additional section.

   However, with 'minimal-responses' configuration, zone operators can
   control the authoritative server's response size (selection of DNSKEY
   algorithm and size, and number of resource records).

8.  IANA Considerations

   This document has no IANA actions.

9.  Security Considerations

10.  References

10.1.  Normative References

   [I-D.ietf-intarea-frag-fragile]
              Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
              and F. Gont, "IP Fragmentation Considered Fragile",
              draft-ietf-intarea-frag-fragile-17 (work in progress),
              September 2019.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3542]  Stevens, W., Thomas, M., Nordmark, E., and T. Jinmei,
              "Advanced Sockets Application Program Interface (API) for
              IPv6", RFC 3542, DOI 10.17487/RFC3542, May 2003.

   [RFC4035]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "Protocol Modifications for the DNS Security
              Extensions", RFC 4035, DOI 10.17487/RFC4035, March 2005.

   [RFC6891]  Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms
              for DNS (EDNS(0))", STD 75, RFC 6891,
              DOI 10.17487/RFC6891, April 2013.

   [RFC7739]  Gont, F., "Security Implications of Predictable Fragment
              Identification Values", RFC 7739, DOI 10.17487/RFC7739,
              February 2016.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8499]  Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS
              Terminology", BCP 219, RFC 8499, DOI 10.17487/RFC8499,
              January 2019.

10.2.  Informative References

   [Brandt2018]
              Brandt, M., Dai, T., Klein, A., Shulman, H., and M.
              Waidner, "Domain Validation++ For MitM-Resilient PKI",
              Proceedings of the 2018 ACM SIGSAC Conference on Computer
              and Communications Security, 2018.

   [DNSFlagDay2020]
              "DNS flag day 2020", n.d.

   [Fujiwara2018]
              Fujiwara, K., "Measures against cache poisoning attacks
              using IP fragmentation in DNS", OARC 30 Workshop, 2019.

   [Herzberg2013]
              Herzberg, A. and H. Shulman, "Fragmentation Considered
              Poisonous", IEEE Conference on Communications and Network
              Security, 2013.

   [Hlavacek2013]
              Hlavacek, T., "IP fragmentation attack on DNS", RIPE 67
              Meeting, 2013.

Appendix A.  How to retrieve path MTU value to a destination

   Socket options: "IP_MTU (since Linux 2.2) Retrieve the current known
   path MTU of the current socket.  Valid only when the socket has been
   connected.  Returns an integer.  Only valid as a getsockopt(2)."
   (Quoted from Debian GNU Linux manual: ip(7))

   "IPV6_MTU getsockopt(): Retrieve the current known path MTU of the
   current socket.  Only valid when the socket has been connected.
   Returns an integer."  (Quoted from Debian GNU Linux manual: ipv6(7))

Authors' Addresses

   Kazunori Fujiwara
   Japan Registry Services Co., Ltd.
   Chiyoda First Bldg. East 13F, 3-8-1 Nishi-Kanda
   Chiyoda-ku, Tokyo  101-0065
   Japan

   Phone: +81 3 5215 8451
   Email: fujiwara@jprs.co.jp


   Paul Vixie
   Farsight Security Inc
   177 Bovet Road, Suite 180
   San Mateo, CA  94402

   Phone: +1 650 393 3994
   Email: paul@redbarn.org
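
   The socket options quoted in Appendix A could be exercised roughly
   as in the following Python fragment.  This is a non-normative,
   Linux-only sketch.  The function name query_path_mtu is
   hypothetical, and the numeric fallbacks 14 and 24 are the Linux
   values of IP_MTU and IPV6_MTU, assumed here because Python's socket
   module may not export those names.

```python
import socket
import sys

# Assumption: Linux option numbers, used when the socket module does
# not export the symbolic names.
IP_MTU = getattr(socket, "IP_MTU", 14)
IPV6_MTU = getattr(socket, "IPV6_MTU", 24)


def query_path_mtu(address, port=53):
    """Return the kernel's currently known path MTU to address.

    Returns None when the value cannot be obtained (non-Linux
    platform, unreachable destination, unsupported option).
    """
    if not sys.platform.startswith("linux"):
        return None
    ipv6 = ":" in address
    family = socket.AF_INET6 if ipv6 else socket.AF_INET
    level = socket.IPPROTO_IPV6 if ipv6 else socket.IPPROTO_IP
    option = IPV6_MTU if ipv6 else IP_MTU
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as sock:
            # connect() on a UDP socket sends no packet; it only fixes
            # the destination, which the option requires.
            sock.connect((address, port))
            return sock.getsockopt(level, option)
    except OSError:
        return None
```

   The value returned is only what the kernel currently knows for that
   destination, so it may be the outgoing interface MTU until path MTU
   information has been learned.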