idnits 2.17.1

draft-ietf-intarea-gre-mtu-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 23, 2015) is 3291 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 1981 (Obsoleted by RFC 8201)

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Intarea Working Group                                          R. Bonica
Internet-Draft                                          Juniper Networks
Intended status: Informational                              C. Pignataro
Expires: October 25, 2015                                  Cisco Systems
                                                                J. Touch
                                                                 USC/ISI
                                                          April 23, 2015


 A Widely-Deployed Solution To The Generic Routing Encapsulation (GRE)
                         Fragmentation Problem
                     draft-ietf-intarea-gre-mtu-03

Abstract

   This memo describes how many vendors have solved the Generic Routing
   Encapsulation (GRE) fragmentation problem. The solution described
   herein is configurable. It is widely deployed on the Internet in its
   default configuration.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 25, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Solutions
     2.1.  RFC 4459 Solutions
     2.2.  A Widely-Deployed Solution
   3.  Implementation Details
     3.1.  General
     3.2.  GRE MTU (GMTU) Estimation and Discovery
     3.3.  GRE Ingress Node Procedures
       3.3.1.  Procedures Affecting the GRE Payload
       3.3.2.  Procedures Affecting the GRE Delivery Header
     3.4.  GRE Egress Node Procedures
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Authors' Addresses

1. Introduction

   Generic Routing Encapsulation (GRE) [RFC2784] [RFC2890] can be used
   to carry any network layer protocol over any network layer protocol.
   GRE has been implemented by many vendors and is widely deployed in
   the Internet.

   The GRE specification does not describe fragmentation procedures.
   Lacking guidance from the specification, vendors have developed
   implementation-specific fragmentation solutions. A GRE tunnel will
   operate correctly only if its ingress and egress nodes support
   compatible fragmentation solutions. [RFC4459] describes several
   fragmentation solutions and evaluates their relative merits.

   This memo reviews the fragmentation solutions presented in [RFC4459].
   It also describes how many vendors have solved the GRE fragmentation
   problem. The solution described herein is configurable and has been
   widely deployed in its default configuration.

   This memo addresses point-to-point unicast GRE tunnels that carry
   IPv4, IPv6 or MPLS payloads over IPv4 or IPv6. All other tunnel
   types are beyond the scope of this document.

1.1. Terminology

   The following terms are specific to GRE and are taken from [RFC2784]:

   o  GRE delivery header - an IPv4 or IPv6 header whose source address
      represents the GRE ingress node and whose destination address
      represents the GRE egress node. The GRE delivery header
      encapsulates a GRE header.

   o  GRE header - the GRE protocol header. The GRE header is
      encapsulated in the GRE delivery header and encapsulates the GRE
      payload.

   o  GRE payload - a network layer packet that is encapsulated by the
      GRE header. The GRE payload can be IPv4, IPv6 or MPLS.
      Procedures for encapsulating IPv4 in GRE are described in
      [RFC2784] and [RFC2890]. Procedures for encapsulating IPv6 in GRE
      are described in [I-D.pignataro-intarea-gre-ipv6]. Procedures for
      encapsulating MPLS in GRE are described in [RFC4023]. While other
      protocols may be delivered over GRE, they are beyond the scope of
      this document.

   o  GRE delivery packet - a packet containing a GRE delivery header, a
      GRE header, and a GRE payload.

   o  GRE payload header - the IPv4, IPv6 or MPLS header of the GRE
      payload.

   o  GRE overhead - the combined size of the GRE delivery header and
      the GRE header, measured in octets.

   The following terms are specific to MTU discovery:

   o  link MTU (LMTU) - the maximum transmission unit, i.e., maximum
      packet size in octets, that can be conveyed over a link. LMTU is
      a unidirectional metric. A bidirectional link may be
      characterized by one LMTU in the forward direction and another
      LMTU in the reverse direction.

   o  path MTU (PMTU) - the minimum LMTU of all the links in a path
      between a source node and a destination node. If the source and
      destination node are connected through an equal cost multipath
      (ECMP), the PMTU is equal to the minimum LMTU of all links
      contributing to the multipath.

   o  GRE MTU (GMTU) - the maximum transmission unit, i.e., maximum
      packet size in octets, that can be conveyed over a GRE tunnel
      without fragmentation of any kind. The GMTU is equal to the PMTU
      associated with the path between the GRE ingress and the GRE
      egress, minus the GRE overhead.

   o  Path MTU Discovery (PMTUD) - a procedure for dynamically
      discovering the PMTU between two nodes on the Internet. PMTUD
      procedures for IPv4 are defined in [RFC1191]. PMTUD procedures
      for IPv6 are defined in [RFC1981].

   The following terms are introduced by this memo:

   o  fragmentable packet - a packet that can be fragmented by the GRE
      ingress before being transported over a GRE tunnel. That is, an
      IPv4 packet with DF-bit equal to 0 and whose payload is larger
      than 64 bytes. IPv6 packets are not fragmentable.

   o  ICMP Packet Too Big (PTB) message - an ICMPv4 [RFC0792]
      Destination Unreachable message (Type = 3) with code equal to 4
      (fragmentation needed and DF set) or an ICMPv6 [RFC4443] Packet
      Too Big message (Type = 2).

2. Solutions

2.1. RFC 4459 Solutions

   Section 3 of [RFC4459] identifies several tunnel fragmentation
   solutions. These solutions define procedures to be invoked when the
   tunnel ingress router receives a packet so large that it cannot be
   forwarded through the tunnel without fragmentation of any kind. When
   applied to GRE, these procedures are:

   1.  Discard the incoming packet and send an ICMP PTB message to the
       incoming packet's source.

   2.  Fragment the incoming packet and encapsulate each fragment within
       a complete GRE header and GRE delivery header.

   3.  Encapsulate the incoming packet in a single GRE header and GRE
       delivery header. Perform source fragmentation on the resulting
       GRE delivery packet.

   As per RFC 4459, Strategy 2) is applicable only when the incoming
   packet is fragmentable. Also as per RFC 4459, each strategy has its
   relative merits and costs.

2.2. A Widely-Deployed Solution

   Many vendors have implemented a configurable GRE fragmentation
   solution. In its default configuration, the solution behaves as
   follows:

   o  When the GRE ingress node receives a fragmentable packet with
      length greater than the GMTU, it fragments the incoming packet and
      encapsulates each fragment within a complete GRE header and GRE
      delivery header. Fragmentation logic is as specified by the
      payload protocol.

   o  When the GRE ingress node receives a non-fragmentable packet with
      length greater than the GMTU, it discards the packet and sends an
      ICMP PTB message to the packet's source.

   o  When the GRE egress node receives a GRE delivery packet fragment,
      it silently discards the fragment, without attempting to
      reassemble the GRE delivery packet to which the fragment belongs.

   In non-default configurations, the GRE ingress node can execute any
   of the procedures defined in RFC 4459.

   The solution described above is widely deployed on the Internet in
   its default configuration. However, the default configuration is not
   always appropriate for GRE tunnels that carry IPv6.

   IPv6 requires that every link in the Internet have an MTU of 1280
   octets or greater. On any link that cannot convey a 1280-octet
   packet in one piece, link-specific fragmentation and reassembly must
   be provided at a layer below IPv6.

   Therefore, the default configuration is appropriate for tunnels that
   carry IPv6 only if the network is engineered so that the GMTU is
   guaranteed to be 1280 octets or greater. In all other scenarios, a
   non-default configuration is required.

   In the non-default configuration, when the GRE ingress router
   receives a packet larger than the GMTU, the GRE ingress router
   encapsulates the entire packet in a single GRE header and GRE
   delivery header.
   It then fragments the delivery packet and sends the resulting
   fragments to the GRE egress, where they are reassembled.

3. Implementation Details

   This section describes how many vendors have implemented the solution
   described in Section 2.2.

3.1. General

   The GRE ingress nodes satisfy all of the requirements stated in
   [RFC2784].

3.2. GRE MTU (GMTU) Estimation and Discovery

   GRE ingress nodes support a configuration option that associates a
   GMTU with a GRE tunnel. By default, the GMTU is equal to the MTU
   associated with the next hop toward the GRE egress node minus the
   GRE overhead.

   Typically, GRE ingress nodes further refine their GMTU estimate by
   executing PMTUD procedures. However, if an implementation supports
   PMTUD for GRE tunnels, it also includes a configuration option that
   disables PMTUD. This configuration option is required to mitigate
   certain denial of service attacks (see Section 5).

   The ingress node's GMTU estimate will not always reflect the actual
   GMTU. It is only an estimate. When a tunnel's GMTU changes, the
   tunnel ingress node will not discover that change immediately.
   Likewise, if the ingress node performs PMTUD procedures and tunnel
   interior nodes cannot deliver ICMP feedback to the tunnel ingress,
   GMTU estimates may be inaccurate.

3.3. GRE Ingress Node Procedures

   This section defines procedures that GRE ingress nodes execute when
   they receive a packet whose size is greater than the relevant GMTU.

3.3.1. Procedures Affecting the GRE Payload

3.3.1.1. IPv4 Payloads

   By default, if the payload is fragmentable, the GRE ingress node
   fragments the incoming packet and encapsulates each fragment within a
   complete GRE header and GRE delivery header. Therefore, the GRE
   egress node receives several complete, non-fragmented delivery
   packets. Each delivery packet contains a fragment of the GRE
   payload. The GRE egress node forwards the payload fragments to their
   ultimate destination, where they are reassembled.

   Also by default, if the payload is not fragmentable, the GRE ingress
   node discards the packet and sends an ICMPv4 Destination Unreachable
   message to the packet's source. The ICMPv4 Destination Unreachable
   message code equals 4 (fragmentation needed and DF set). The ICMPv4
   Destination Unreachable message also contains a Next-Hop MTU field
   (as specified by [RFC1191]), whose value is equal to the GMTU
   associated with the tunnel.

   The GRE ingress node supports a non-default configuration option that
   invokes an alternative behavior. If that option is configured, the
   GRE ingress node fragments the delivery packet. See Section 3.3.2
   for details.

3.3.1.2. IPv6 Payloads

   By default, the GRE ingress node discards the packet and sends an
   ICMPv6 [RFC4443] Packet Too Big message to the payload source. The
   MTU specified in the Packet Too Big message is equal to the GMTU
   associated with the tunnel.

   The GRE ingress node supports a non-default configuration option that
   invokes an alternative behavior. If that option is configured, the
   GRE ingress node fragments the delivery packet. See Section 3.3.2
   for details.

3.3.1.3. MPLS Payloads

   By default, the GRE ingress node discards the packet. As it is
   impossible to reliably identify the payload source, the GRE ingress
   node does not attempt to send an ICMP PTB message to the payload
   source.

   The GRE ingress node supports a non-default configuration option that
   invokes an alternative behavior. If that option is configured, the
   GRE ingress node fragments the delivery packet. See Section 3.3.2
   for details.

3.3.2. Procedures Affecting the GRE Delivery Header

3.3.2.1. Tunneling GRE Over IPv4

   By default, the GRE ingress node does not fragment delivery packets.
   However, the GRE ingress node includes a configuration option that
   allows delivery packet fragmentation.

   By default, the GRE ingress node sets the DF-bit in the delivery
   header to 1 (Don't Fragment). However, the GRE ingress node also
   supports a configuration option that invokes the following behavior:

   o  when the GRE payload is IPv6, the DF-bit on the delivery header is
      set to 0 (Fragments Allowed)

   o  when the GRE payload is IPv4, the DF-bit is copied from the
      payload header to the delivery header

   When the DF-bit on an IPv4 delivery header is set to 0, the GRE
   delivery packet can be fragmented by any node between the GRE ingress
   and the GRE egress.

   If the delivery packet is fragmented, it is reassembled by the GRE
   egress.

3.3.2.2. Tunneling GRE Over IPv6

   By default, the GRE ingress node does not fragment delivery packets.
   However, the GRE ingress node includes a configuration option that
   allows delivery packet fragmentation.

   If the delivery packet is fragmented, it is reassembled by the GRE
   egress.

3.4. GRE Egress Node Procedures

   By default, the GRE egress node silently discards GRE delivery packet
   fragments, without attempting to reassemble the GRE delivery packets
   to which the fragments belong.

   However, the GRE egress node supports a configuration option that
   allows it to reassemble GRE delivery packets.

4. IANA Considerations

   This document makes no request of IANA.

5. Security Considerations

   In the GRE fragmentation solution described above, either the GRE
   payload or the GRE delivery packet can be fragmented. If the GRE
   payload is fragmented, it is typically reassembled at its ultimate
   destination. If the GRE delivery packet is fragmented, it is
   typically reassembled at the GRE egress node.

   The packet reassembly process is resource intensive and vulnerable to
   several denial of service attacks.
   In the simplest attack, the attacker sends fragmented packets more
   quickly than the victim can reassemble them. In a variation on that
   attack, the first fragment of each packet is missing, so that no
   packet can ever be reassembled.

   Given that the packet reassembly process is resource intensive and
   vulnerable to denial of service attacks, operators should decide
   where the reassembly process is best performed. Having made that
   decision, they should decide whether to fragment the GRE payload or
   the GRE delivery packet accordingly.

   PMTU Discovery is vulnerable to two denial of service attacks (see
   Section 8 of [RFC1191] for details). Both attacks are based upon a
   malicious party sending forged ICMPv4 Destination Unreachable or
   ICMPv6 Packet Too Big messages to a host. In the first attack, the
   forged message indicates an inordinately small PMTU. In the second
   attack, the forged message indicates an inordinately large PMTU. In
   both cases, throughput is adversely affected. In order to mitigate
   such attacks, GRE implementations include a configuration option to
   disable PMTU discovery on GRE tunnels. Also, they can include a
   configuration option that conditions the behavior of PMTUD to
   establish a minimum PMTU.

6. Acknowledgements

   The authors would like to thank Fred Baker, Fred Detienne, Jagadish
   Grandhi, Jeff Haas, Brian Haberman, Vanitha Neelamegam, Masataka
   Ohta, John Scudder, Mike Sullenberger and Wen Zhang for their
   constructive comments. The authors also express their gratitude to
   Vanessa Ameen, without whom this memo could not have been written.

7. References

7.1. Normative References

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, September 1981.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
              March 2000.

   [RFC2890]  Dommety, G., "Key and Sequence Number Extensions to GRE",
              RFC 2890, September 2000.

   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
              MPLS in IP or Generic Routing Encapsulation (GRE)", RFC
              4023, March 2005.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

7.2. Informative References

   [I-D.pignataro-intarea-gre-ipv6]
              Pignataro, C., Bonica, R., and S. Krishnan, "IPv6 Support
              for Generic Routing Encapsulation (GRE)",
              draft-pignataro-intarea-gre-ipv6-01 (work in progress),
              October 2014.

   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with
              In-the-Network Tunneling", RFC 4459, April 2006.

Authors' Addresses

   Ron Bonica
   Juniper Networks
   2251 Corporate Park Drive
   Herndon, Virginia 20170
   USA

   Email: rbonica@juniper.net


   Carlos Pignataro
   Cisco Systems
   7200-12 Kit Creek Road
   Research Triangle Park, North Carolina 27709
   USA

   Email: cpignata@cisco.com


   Joe Touch
   USC/ISI
   4676 Admiralty Way
   Marina del Rey, California 90292-6695
   USA

   Phone: +1 (310) 448-9151
   Email: touch@isi.edu
   URI:   http://www.isi.edu/touch
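
   The GMTU definition in Section 1.1 and the default ingress behavior
   summarized in Sections 2.2 and 3.3.1 can be sketched, non-normatively,
   as follows. This is a minimal illustration in Python and is not taken
   from any vendor implementation. The names Packet, Action, gmtu,
   is_fragmentable and default_ingress_action are hypothetical, as is
   the example GRE overhead of 24 octets, which assumes a 20-octet IPv4
   delivery header without options and a 4-octet GRE header without
   optional fields.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Hypothetical labels for the default outcomes at the GRE ingress.
    FORWARD = "encapsulate and forward unchanged"
    FRAGMENT_PAYLOAD = "fragment payload, encapsulate each fragment"
    DROP_SEND_PTB = "discard and send ICMP PTB to the source"
    DROP_SILENTLY = "discard without sending ICMP PTB"


@dataclass
class Packet:
    protocol: str          # "ipv4", "ipv6" or "mpls"
    length: int            # total packet length in octets
    df: bool = False       # IPv4 DF-bit; ignored for IPv6 and MPLS
    payload_len: int = 0   # IPv4 payload length in octets


def gmtu(pmtu: int, gre_overhead: int) -> int:
    """GMTU = PMTU between ingress and egress, minus the GRE overhead."""
    return pmtu - gre_overhead


def is_fragmentable(pkt: Packet) -> bool:
    """Section 1.1: IPv4, DF-bit equal to 0, payload larger than 64 bytes."""
    return pkt.protocol == "ipv4" and not pkt.df and pkt.payload_len > 64


def default_ingress_action(pkt: Packet, tunnel_gmtu: int) -> Action:
    """Default (not the non-default) behavior of the GRE ingress node."""
    if pkt.length <= tunnel_gmtu:
        return Action.FORWARD
    if is_fragmentable(pkt):
        return Action.FRAGMENT_PAYLOAD
    if pkt.protocol == "mpls":
        # The payload source cannot be reliably identified (3.3.1.3).
        return Action.DROP_SILENTLY
    return Action.DROP_SEND_PTB


# Assumed example values: PMTU of 1500 octets, GRE overhead of 24 octets.
tunnel_gmtu = gmtu(1500, 24)
big_v4 = Packet("ipv4", 1500, df=False, payload_len=1480)
big_v4_df = Packet("ipv4", 1500, df=True, payload_len=1480)
print(tunnel_gmtu, default_ingress_action(big_v4, tunnel_gmtu).name,
      default_ingress_action(big_v4_df, tunnel_gmtu).name)
```

   Under these assumed values the GMTU is 1476 octets. A 1500-octet
   IPv4 packet with the DF-bit clear is fragmented before encapsulation,
   while the same packet with the DF-bit set is discarded and an ICMP
   PTB message is sent. The non-default behavior of Section 3.3.2
   (fragmenting the delivery packet) is deliberately not modeled.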