idnits 2.17.1

draft-ietf-intarea-gre-mtu-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 392: '...s the vulnerability. Operators SHOULD...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (May 14, 2015) is 3242 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 1981 (Obsoleted by RFC 8201)

     Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Intarea Working Group                                          R. Bonica
3    Internet-Draft                                          Juniper Networks
4    Intended status: Informational                              C. Pignataro
5    Expires: November 15, 2015                                 Cisco Systems
6                                                                    J. Touch
7                                                                     USC/ISI
8                                                                May 14, 2015

10     A Widely-Deployed Solution To The Generic Routing Encapsulation (GRE)
11                            Fragmentation Problem
12                        draft-ietf-intarea-gre-mtu-05

14   Abstract

16      This memo describes how many vendors have solved the Generic Routing
17      Encapsulation (GRE) fragmentation problem.
        The solution described
18      herein is configurable.  It is widely deployed on the Internet in its
19      default configuration.

21   Status of This Memo

23      This Internet-Draft is submitted in full conformance with the
24      provisions of BCP 78 and BCP 79.

26      Internet-Drafts are working documents of the Internet Engineering
27      Task Force (IETF).  Note that other groups may also distribute
28      working documents as Internet-Drafts.  The list of current Internet-
29      Drafts is at http://datatracker.ietf.org/drafts/current/.

31      Internet-Drafts are draft documents valid for a maximum of six months
32      and may be updated, replaced, or obsoleted by other documents at any
33      time.  It is inappropriate to use Internet-Drafts as reference
34      material or to cite them other than as "work in progress."

36      This Internet-Draft will expire on November 15, 2015.

38   Copyright Notice

40      Copyright (c) 2015 IETF Trust and the persons identified as the
41      document authors.  All rights reserved.

43      This document is subject to BCP 78 and the IETF Trust's Legal
44      Provisions Relating to IETF Documents
45      (http://trustee.ietf.org/license-info) in effect on the date of
46      publication of this document.  Please review these documents
47      carefully, as they describe your rights and restrictions with respect
48      to this document.  Code Components extracted from this document must
49      include Simplified BSD License text as described in Section 4.e of
50      the Trust Legal Provisions and are provided without warranty as
51      described in the Simplified BSD License.

53   Table of Contents

55      1.  Introduction
56        1.1.  Terminology
57      2.  Solutions
58        2.1.  RFC 4459 Solutions
59        2.2.  A Widely-Deployed Solution
60      3.  Implementation Details
61        3.1.  General
62        3.2.  GRE MTU (GMTU) Estimation and Discovery
63        3.3.  GRE Ingress Node Procedures
64          3.3.1.  Procedures Affecting the GRE Payload
65          3.3.2.  Procedures Affecting The GRE Delivery Header
66        3.4.  GRE Egress Node Procedures
67      4.  IANA Considerations
68      5.  Security Considerations
69      6.  Acknowledgements
70      7.  References
71        7.1.  Normative References
72        7.2.  Informative References
73      Authors' Addresses

75   1.  Introduction

77      Generic Routing Encapsulation (GRE) [RFC2784] [RFC2890] can be used
78      to carry any network layer protocol over any network layer protocol.
79      GRE has been implemented by many vendors and is widely deployed in
80      the Internet.

82      The GRE specification does not describe fragmentation procedures.
83      Lacking guidance from the specification, vendors have developed
84      implementation-specific fragmentation solutions.  A GRE tunnel will
85      operate correctly only if its ingress and egress nodes support
86      compatible fragmentation solutions.  [RFC4459] describes several
87      fragmentation solutions and evaluates their relative merits.

89      This memo reviews the fragmentation solutions presented in [RFC4459].
90      It also describes how many vendors have solved the GRE fragmentation
91      problem.  The solution described herein is configurable, and has been
92      widely deployed in its default configuration.

94      This memo addresses point-to-point unicast GRE tunnels that carry
95      IPv4, IPv6 or MPLS payloads over IPv4 or IPv6.  All other tunnel
96      types are beyond the scope of this document.

98   1.1.
     Terminology

100     The following terms are specific to GRE and are taken from [RFC2784]:

102     o  GRE delivery header - an IPv4 or IPv6 header whose source address
103        represents the GRE ingress node and whose destination address
104        represents the GRE egress node.  The GRE delivery header
105        encapsulates a GRE header.

107     o  GRE header - the GRE protocol header.  The GRE header is
108        encapsulated in the GRE delivery header and encapsulates the GRE
109        payload.

111     o  GRE payload - a network layer packet that is encapsulated by the
112        GRE header.  The GRE payload can be IPv4, IPv6 or MPLS.
113        Procedures for encapsulating IPv4 in GRE are described in
114        [RFC2784] and [RFC2890].  Procedures for encapsulating IPv6 in GRE
115        are described in [I-D.pignataro-intarea-gre-ipv6].  Procedures for
116        encapsulating MPLS in GRE are described in [RFC4023].  While other
117        protocols may be delivered over GRE, they are beyond the scope of
118        this document.

120     o  GRE delivery packet - a packet containing a GRE delivery header, a
121        GRE header, and GRE payload.

123     o  GRE payload header - the IPv4, IPv6 or MPLS header of the GRE
124        payload.

126     o  GRE overhead - the combined size of the GRE delivery header and
127        the GRE header, measured in octets.

129     The following terms are specific to MTU discovery:

131     o  link MTU (LMTU) - the maximum transmission unit, i.e., maximum
132        packet size in octets, that can be conveyed over a link.  LMTU is
133        a unidirectional metric.  A bidirectional link may be
134        characterized by one LMTU in the forward direction and another
135        LMTU in the reverse direction.

137     o  path MTU (PMTU) - the minimum LMTU of all the links in a path
138        between a source node and a destination node.  If the source and
139        destination node are connected through an equal cost multipath
140        (ECMP), the PMTU is equal to the minimum LMTU of all links
141        contributing to the multipath.
143     o  GRE MTU (GMTU) - the maximum transmission unit, i.e., maximum
144        packet size in octets, that can be conveyed over a GRE tunnel
145        without fragmentation of any kind.  The GMTU is equal to the PMTU
146        associated with the path between the GRE ingress and the GRE
147        egress, minus the GRE overhead.

149     o  Path MTU Discovery (PMTUD) - a procedure for dynamically
150        discovering the PMTU between two nodes on the Internet.  PMTUD
151        procedures for IPv4 are defined in [RFC1191].  PMTUD procedures
152        for IPv6 are defined in [RFC1981].

154     The following terms are introduced by this memo:

156     o  fragmentable packet - a packet that can be fragmented by the GRE
157        ingress before being transported over a GRE tunnel.  That is, an
158        IPv4 packet with DF-bit equal to 0 and whose payload is larger
159        than 64 bytes.  IPv6 packets are not fragmentable.

161     o  ICMP Packet Too Big (PTB) message - an ICMPv4 [RFC0792]
162        Destination Unreachable message (Type = 3) with code equal to 4
163        (fragmentation needed and DF set) or an ICMPv6 [RFC4443] Packet
164        Too Big message (Type = 2).

166  2.  Solutions

168  2.1.  RFC 4459 Solutions

170     Section 3 of [RFC4459] identifies several tunnel fragmentation
171     solutions.  These solutions define procedures to be invoked when the
172     tunnel ingress router receives a packet so large that it cannot be
173     forwarded through the tunnel without fragmentation of any kind.  When
174     applied to GRE, these procedures are:

176     1.  Discard the incoming packet and send an ICMP PTB message to the
177         incoming packet's source.

179     2.  Fragment the incoming packet and encapsulate each fragment within
180         a complete GRE header and GRE delivery header.

182     3.  Encapsulate the incoming packet in a single GRE header and GRE
183         delivery header.  Perform source fragmentation on the resulting
184         GRE delivery packet.

186     As per RFC 4459, Strategy 2 is applicable only when the incoming
187     packet is fragmentable.
        Also as per RFC 4459, each strategy has its
188     relative merits and costs.

190  2.2.  A Widely-Deployed Solution

192     Many vendors have implemented a configurable GRE fragmentation
193     solution.  In its default configuration, the solution behaves as
194     follows:

196     o  When the GRE ingress node receives a fragmentable packet with
197        length greater than the GMTU, it fragments the incoming packet and
198        encapsulates each fragment within a complete GRE header and GRE
199        delivery header.  Fragmentation logic is as specified by the
200        payload protocol.

202     o  When the GRE ingress node receives a non-fragmentable packet with
203        length greater than the GMTU, it discards the packet and sends an
204        ICMP PTB message to the packet's source.

206     o  When the GRE egress node receives a GRE delivery packet fragment,
207        it silently discards the fragment, without attempting to
208        reassemble the GRE delivery packet to which the fragment belongs.

210     In non-default configurations, the GRE ingress node can execute any
211     of the procedures defined in RFC 4459.

213     The solution described above is widely deployed on the Internet in
214     its default configuration.  However, the default configuration is not
215     always appropriate for GRE tunnels that carry IPv6.

217     IPv6 requires that every link in the Internet have an MTU of 1280
218     octets or greater.  On any link that cannot convey a 1280-octet
219     packet in one piece, link-specific fragmentation and reassembly must
220     be provided at a layer below IPv6.

222     Therefore, the default configuration is appropriate for tunnels that
223     carry IPv6 only if the network is engineered so that the GMTU is
224     guaranteed to be 1280 octets or greater.  In all other scenarios, a
225     non-default configuration is required.

227     In the non-default configuration, when the GRE ingress router
228     receives a packet larger than the GMTU, it encapsulates the entire
229     packet in a single GRE header and GRE delivery header.
230     It then fragments the delivery packet and sends the resulting
231     fragments to the GRE egress, where they are reassembled.

233  3.  Implementation Details

235     This section describes how many vendors have implemented the solution
236     described in Section 2.2.

238  3.1.  General

240     The GRE ingress nodes satisfy all of the requirements stated in
241     [RFC2784].

243  3.2.  GRE MTU (GMTU) Estimation and Discovery

245     GRE ingress nodes support a configuration option that associates a
246     GMTU with a GRE tunnel.  By default, GMTU is equal to the MTU
247     associated with the next hop toward the GRE egress node minus the GRE
248     overhead.

250     Typically, GRE ingress nodes further refine their GMTU estimate by
251     executing PMTUD procedures.  However, if an implementation supports
252     PMTUD for GRE tunnels, it also includes a configuration option that
253     disables PMTUD.  This configuration option is required to mitigate
254     certain denial of service attacks (see Section 5).

256     The ingress node's GMTU estimate will not always reflect the actual
257     GMTU.  It is only an estimate.  When a tunnel's GMTU changes, the
258     tunnel ingress node will not discover that change immediately.
259     Likewise, if the ingress node performs PMTUD procedures and tunnel
260     interior nodes cannot deliver ICMP feedback to the tunnel ingress,
261     GMTU estimates may be inaccurate.

263  3.3.  GRE Ingress Node Procedures

265     This section defines procedures that GRE ingress nodes execute when
266     they receive a packet whose size is greater than the relevant GMTU.

268  3.3.1.  Procedures Affecting the GRE Payload

270  3.3.1.1.  IPv4 Payloads

272     By default, if the payload is fragmentable, the GRE ingress node
273     fragments the incoming packet and encapsulates each fragment within a
274     complete GRE header and GRE delivery header.  Therefore, the GRE
275     egress node receives several complete, non-fragmented delivery
276     packets.  Each delivery packet contains a fragment of the GRE
277     payload.
        The GRE egress node forwards the payload fragments to their
278     ultimate destination where they are reassembled.

280     Also by default, if the payload is not fragmentable, the GRE ingress
281     node discards the packet and sends an ICMPv4 Destination Unreachable
282     message to the packet's source.  The ICMPv4 Destination Unreachable
283     message code equals 4 (fragmentation needed and DF set).  The ICMPv4
284     Destination Unreachable message also contains a Next-hop MTU (as
285     specified by [RFC1191]) and the next-hop MTU is equal to the GMTU
286     associated with the tunnel.

288     The GRE ingress node supports a non-default configuration option that
289     invokes an alternative behavior.  If that option is configured, the
290     GRE ingress node fragments the delivery packet.  See Section 3.3.2
291     for details.

293  3.3.1.2.  IPv6 Payloads

295     By default, the GRE ingress node discards the packet and sends an
296     ICMPv6 [RFC4443] Packet Too Big message to the payload source.  The
297     MTU specified in the Packet Too Big message is equal to the GMTU
298     associated with the tunnel.

300     The GRE ingress node supports a non-default configuration option that
301     invokes an alternative behavior.  If that option is configured, the
302     GRE ingress node fragments the delivery packet.  See Section 3.3.2
303     for details.

305  3.3.1.3.  MPLS Payloads

307     By default, the GRE ingress node discards the packet.  As it is
308     impossible to reliably identify the payload source, the GRE ingress
309     node does not attempt to send an ICMP PTB message to the payload
310     source.

312     The GRE ingress node supports a non-default configuration option that
313     invokes an alternative behavior.  If that option is configured, the
314     GRE ingress node fragments the delivery packet.  See Section 3.3.2.

316  3.3.2.  Procedures Affecting The GRE Delivery Header

318  3.3.2.1.  Tunneling GRE Over IPv4

320     By default, the GRE ingress node does not fragment delivery packets.
321     However, the GRE ingress node includes a configuration option that
322     allows delivery packet fragmentation.

324     By default, the GRE ingress node sets the DF-bit in the delivery
325     header to 1 (Don't Fragment).  However, the GRE ingress node also
326     supports a configuration option that invokes the following behavior:

328     o  when the GRE payload is IPv6, the DF-bit on the delivery header is
329        set to 0 (Fragments Allowed)

331     o  when the GRE payload is IPv4, the DF-bit is copied from the
332        payload header to the delivery header

334     When the DF-bit on an IPv4 delivery header is set to 0, the GRE
335     delivery packet can be fragmented by any node between the GRE ingress
336     and the GRE egress.

338     If the GRE egress node is configured to support reassembly, it will
339     reassemble fragmented delivery packets.  Otherwise, the GRE egress
340     node will discard delivery packet fragments.

342  3.3.2.2.  Tunneling GRE Over IPv6

344     By default, the GRE ingress node does not fragment delivery packets.
345     However, the GRE ingress node includes a configuration option that
346     allows this.

348     If the GRE egress node is configured to support reassembly, it will
349     reassemble fragmented delivery packets.  Otherwise, the GRE egress
350     node will discard delivery packet fragments.

352  3.4.  GRE Egress Node Procedures

354     By default, the GRE egress node silently discards GRE delivery packet
355     fragments, without attempting to reassemble the GRE delivery packets
356     to which the fragments belong.

358     However, the GRE egress node supports a configuration option that
359     allows it to reassemble GRE delivery packets.

361  4.  IANA Considerations

363     This document makes no request of IANA.

365  5.  Security Considerations

367     In the GRE fragmentation solution described above, either the GRE
368     payload or the GRE delivery packet can be fragmented.  If the GRE
369     payload is fragmented, it is typically reassembled at its ultimate
370     destination.
        If the GRE delivery packet is fragmented, it is
371     typically reassembled at the GRE egress node.

373     The packet reassembly process is resource intensive and vulnerable to
374     several denial of service attacks.  In the simplest attack, the
375     attacker sends fragmented packets more quickly than the victim can
376     reassemble them.  In a variation on that attack, the first fragment
377     of each packet is missing, so that no packet can ever be reassembled.

379     Given that the packet reassembly process is resource intensive and
380     vulnerable to denial of service attacks, operators should decide
381     where the reassembly process is best performed.  Having made that
382     decision, they should decide whether to fragment the GRE payload or
383     GRE delivery packet, accordingly.

385     Some IP implementations are vulnerable to the Overlapping Fragment
386     Attack [RFC1858].  This vulnerability is not specific to GRE and
387     needs to be considered in all environments where IP fragmentation is
388     present.  [RFC3128] describes a procedure by which IPv4
389     implementations can partially mitigate the vulnerability.  [RFC5722]
390     mandates a procedure by which IPv6-compliant implementations are
391     required to mitigate the vulnerability.  The procedure described in
392     RFC 5722 completely mitigates the vulnerability.  Operators SHOULD
393     ensure that the vulnerability is mitigated to their satisfaction on
394     equipment that they deploy.

396     PMTU Discovery is vulnerable to two denial of service attacks (see
397     Section 8 of [RFC1191] for details).  Both attacks are based upon
398     a malicious party sending forged ICMPv4 Destination Unreachable or
399     ICMPv6 Packet Too Big messages to a host.  In the first attack, the
400     forged message indicates an inordinately small PMTU.  In the second
401     attack, the forged message indicates an inordinately large PMTU.  In
402     both cases, throughput is adversely affected.
        In order to mitigate
403     such attacks, GRE implementations include a configuration option to
404     disable PMTU discovery on GRE tunnels.  Also, they can include a
405     configuration option that conditions the behavior of PMTUD to
406     establish a minimum PMTU.

408  6.  Acknowledgements

410     The authors would like to thank Fred Baker, Fred Detienne, Jagadish
411     Grandhi, Jeff Haas, Brian Haberman, Vanitha Neelamegam, Masataka
412     Ohta, John Scudder, Mike Sullenberger, Tom Taylor and Wen Zhang for
413     their constructive comments.  The authors also express their
414     gratitude to Vanessa Ameen, without whom this memo could not have
415     been written.

417  7.  References

419  7.1.  Normative References

421     [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
422                RFC 792, September 1981.

424     [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
425                November 1990.

427     [RFC1858]  Ziemba, G., Reed, D., and P. Traina, "Security
428                Considerations for IP Fragment Filtering", RFC 1858,
429                October 1995.

431     [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
432                for IP version 6", RFC 1981, August 1996.

434     [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
435                Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
436                March 2000.

438     [RFC2890]  Dommety, G., "Key and Sequence Number Extensions to GRE",
439                RFC 2890, September 2000.

441     [RFC3128]  Miller, I., "Protection Against a Variant of the Tiny
442                Fragment Attack (RFC 1858)", RFC 3128, June 2001.

444     [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
445                MPLS in IP or Generic Routing Encapsulation (GRE)", RFC
446                4023, March 2005.

448     [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
449                Message Protocol (ICMPv6) for the Internet Protocol
450                Version 6 (IPv6) Specification", RFC 4443, March 2006.

452     [RFC5722]  Krishnan, S., "Handling of Overlapping IPv6 Fragments",
453                RFC 5722, December 2009.

455  7.2.
     Informative References

457     [I-D.pignataro-intarea-gre-ipv6]
458                Pignataro, C., Bonica, R., and S. Krishnan, "IPv6 Support
459                for Generic Routing Encapsulation (GRE)", draft-pignataro-
460                intarea-gre-ipv6-01 (work in progress), October 2014.

462     [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
463                Network Tunneling", RFC 4459, April 2006.

465  Authors' Addresses

467     Ron Bonica
468     Juniper Networks
469     2251 Corporate Park Drive Herndon
470     Herndon, Virginia  20170
471     USA

473     Email: rbonica@juniper.net
474     Carlos Pignataro
475     Cisco Systems
476     7200-12 Kit Creek Road
477     Research Triangle Park, North Carolina  27709
478     USA

480     Email: cpignata@cisco.com

482     Joe Touch
483     USC/ISI
484     4676 Admiralty Way
485     Marina del Rey, California  90292-6695
486     USA

488     Phone: +1 (310) 448-9151
489     Email: touch@isi.edu
490     URI:   http://www.isi.edu/touch
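
Illustrative sketch (not part of the draft)

   The GMTU definition in Section 1.1 and the default ingress behavior
   in Sections 2.2 and 3.3.1 can be sketched roughly as follows.  This
   is a minimal Python illustration under stated assumptions, not a
   vendor implementation.  All names (gmtu, Packet, Action,
   default_ingress_action) are hypothetical, and the header sizes assume
   the minimum case: no IPv4 options, no IPv6 extension headers, and no
   optional GRE checksum, key or sequence number fields.

```python
from dataclasses import dataclass
from enum import Enum

# Minimum header sizes in octets (assumed: no IPv4 options, no IPv6
# extension headers, no optional GRE checksum, key or sequence fields).
IPV4_HEADER = 20
IPV6_HEADER = 40
GRE_HEADER = 4


class Payload(Enum):
    IPV4 = "IPv4"
    IPV6 = "IPv6"
    MPLS = "MPLS"


class Action(Enum):
    FORWARD = "encapsulate and forward"
    FRAGMENT_PAYLOAD = "fragment payload, encapsulate each fragment"
    DISCARD_WITH_PTB = "discard, send ICMP PTB carrying the GMTU"
    DISCARD_SILENTLY = "discard, send nothing"


@dataclass
class Packet:
    kind: Payload
    length: int                    # total length of the incoming packet
    df: bool = False               # IPv4 DF-bit; unused for IPv6 and MPLS
    header_len: int = IPV4_HEADER  # used only for the IPv4 payload test


def gmtu(pmtu: int, delivery_header: int, gre_header: int = GRE_HEADER) -> int:
    """GMTU = PMTU between ingress and egress, minus the GRE overhead."""
    return pmtu - (delivery_header + gre_header)


def is_fragmentable(pkt: Packet) -> bool:
    """Section 1.1: IPv4, DF-bit 0, payload larger than 64 bytes."""
    return (pkt.kind is Payload.IPV4 and not pkt.df
            and pkt.length - pkt.header_len > 64)


def default_ingress_action(pkt: Packet, tunnel_gmtu: int) -> Action:
    """Default-configuration decision made by the GRE ingress node."""
    if pkt.length <= tunnel_gmtu:
        return Action.FORWARD
    if is_fragmentable(pkt):
        return Action.FRAGMENT_PAYLOAD
    if pkt.kind is Payload.MPLS:
        # The payload source cannot be identified reliably, so no PTB.
        return Action.DISCARD_SILENTLY
    return Action.DISCARD_WITH_PTB


if __name__ == "__main__":
    tunnel = gmtu(1500, IPV4_HEADER)  # 1500 - (20 + 4) = 1476
    for pkt in (Packet(Payload.IPV4, 1500),
                Packet(Payload.IPV4, 1500, df=True),
                Packet(Payload.IPV6, 1500),
                Packet(Payload.MPLS, 1500)):
        print(pkt.kind.value, pkt.df, default_ingress_action(pkt, tunnel).value)
```

   With an assumed 1500-octet PMTU, GRE over IPv4 gives a GMTU of 1476
   octets and GRE over IPv6 gives 1456 octets.  The sketch covers only
   the default configuration; the non-default option of Section 3.3.2
   (fragmenting the delivery packet) is deliberately left out.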