idnits 2.17.1 draft-templin-seal-18.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 15. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 1140. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1151. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1158. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1164. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. 
(See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (June 7, 2008) is 5802 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Informational ---------------------------------------------------------------------------- ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) -- Obsolete informational reference (is this intentional?): RFC 1063 (Obsoleted by RFC 1191) -- Obsolete informational reference (is this intentional?): RFC 1981 (Obsoleted by RFC 8201) Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group F. Templin, Ed. 3 Internet-Draft Boeing Phantom Works 4 Intended status: Informational June 7, 2008 5 Expires: December 9, 2008 7 The Subnetwork Encapsulation and Adaptation Layer (SEAL) 8 draft-templin-seal-18.txt 10 Status of this Memo 12 By submitting this Internet-Draft, each author represents that any 13 applicable patent or other IPR claims of which he or she is aware 14 have been or will be disclosed, and any of which he or she becomes 15 aware will be disclosed, in accordance with Section 6 of BCP 79. 17 Internet-Drafts are working documents of the Internet Engineering 18 Task Force (IETF), its areas, and its working groups. Note that 19 other groups may also distribute working documents as Internet- 20 Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. It is inappropriate to use Internet-Drafts as reference 25 material or to cite them other than as "work in progress." 
27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt. 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html. 33 This Internet-Draft will expire on December 9, 2008. 35 Abstract 37 For the purpose of this document, subnetworks are defined as virtual 38 topologies that span connected network regions bounded by 39 encapsulated border nodes. These virtual topologies may span 40 multiple IP- and/or sub-IP layer forwarding hops, and can introduce 41 failure modes due to packet duplication and/or links with diverse 42 Maximum Transmission Units (MTUs). This document specifies a 43 Subnetwork Encapsulation and Adaptation Layer (SEAL) that 44 accommodates such virtual topologies over diverse underlying link 45 technologies. 47 Table of Contents 49 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 50 1.1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . 3 51 1.2. Approach . . . . . . . . . . . . . . . . . . . . . . . . . 5 52 2. Terminology and Requirements . . . . . . . . . . . . . . . . . 5 53 3. Applicability Statement . . . . . . . . . . . . . . . . . . . 6 54 4. SEAL Protocol Specification - Tunnel Mode . . . . . . . . . . 7 55 4.1. Model of Operation . . . . . . . . . . . . . . . . . . . . 7 56 4.2. ITE Specification . . . . . . . . . . . . . . . . . . . . 9 57 4.2.1. Tunnel Interface MTU . . . . . . . . . . . . . . . . . 9 58 4.2.2. Accounting for Headers . . . . . . . . . . . . . . . . 10 59 4.2.3. Segmentation and Encapsulation . . . . . . . . . . . . 11 60 4.2.4. Sending Probes . . . . . . . . . . . . . . . . . . . . 13 61 4.2.5. Packet Identification . . . . . . . . . . . . . . . . 14 62 4.2.6. Sending SEAL Protocol Packets . . . . . . . . . . . . 14 63 4.2.7. Processing Raw ICMPv4 Messages . . . . . . . . . . . . 14 64 4.2.8. Processing SEAL-Encapsulated ICMPv4 Messages . . . . . 14 65 4.3. ETE Specification . . . . . . . . . . . . . . . 
. . . . . 16 66 4.3.1. Reassembly Buffer Requirements . . . . . . . . . . . . 16 67 4.3.2. IPv4-Layer Reassembly . . . . . . . . . . . . . . . . 16 68 4.3.3. Generating SEAL-Encapsulated ICMPv4 Fragmentation 69 Needed Messages . . . . . . . . . . . . . . . . . . . 16 70 4.3.4. SEAL-Layer Reassembly . . . . . . . . . . . . . . . . 18 71 4.3.5. Decapsulation and Generating Other ICMPv4 Errors . . . 18 72 5. SEAL Protocol Specification - Transport Mode . . . . . . . . . 19 73 6. Link Requirements . . . . . . . . . . . . . . . . . . . . . . 19 74 7. End System Requirements . . . . . . . . . . . . . . . . . . . 19 75 8. Router Requirements . . . . . . . . . . . . . . . . . . . . . 20 76 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 77 10. Security Considerations . . . . . . . . . . . . . . . . . . . 20 78 11. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 20 79 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 21 80 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21 81 13.1. Normative References . . . . . . . . . . . . . . . . . . . 21 82 13.2. Informative References . . . . . . . . . . . . . . . . . . 22 83 Appendix A. Historic Evolution of PMTUD . . . . . . . . . . . . . 23 84 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 25 85 Intellectual Property and Copyright Statements . . . . . . . . . . 26 87 1. Introduction 89 As Internet technology and communication have grown and matured, many 90 techniques have developed that use virtual topologies (frequently 91 tunnels of one form or another) over an actual network that supports 92 the Internet Protocol (IP) [RFC0791][RFC2460]. Those virtual 93 topologies have elements which appear as one hop in the virtual 94 topology, but are actually multiple IP or sub-IP layer hops. These 95 multiple hops often have quite diverse properties which are often not 96 even visible to the end-points of the virtual hop. 
This introduces 97 many failure modes that are not dealt with well in current 98 approaches. 100 The use of IP encapsulation has long been considered as the means for 101 creating such virtual topologies. However, the insertion of an outer 102 IP header reduces the effective path MTU as-seen by the IP layer. 103 When IPv4 is used, this reduced MTU can be accommodated through the 104 use of IPv4 fragmentation, but unmitigated in-the-network 105 fragmentation has been deemed "harmful" through operational 106 experience and studies conducted over the course of many years 107 [FRAG][FOLK][RFC4963]. Additionally, classical path MTU discovery 108 [RFC1191] has known operational issues that are exacerbated by in- 109 the-network tunnels [RFC2923][RFC4459]. In the following 110 subsections, we present further details on the motivation and 111 approach for addressing these issues. 113 1.1. Motivation 115 Before discussing the approach, it is necessary to first understand 116 the problems. In both the Internet and private-use networks today, 117 IPv4 is ubiquitously deployed as the Layer 3 protocol. The two 118 primary functions of IPv4 are to provide for 1) addressing, and 2) a 119 fragmentation and reassembly capability used to accommodate links 120 with diverse MTUs. While it is well known that the addressing 121 properties of IPv4 are limited (hence the larger address space 122 provided by IPv6), there is a lesser-known but growing consensus that 123 other limitations may be unable to sustain continued growth. 125 First, the IPv4 header Identification field is only 16 bits in 126 length, meaning that at most 2^16 packets pertaining to the same 127 (source, destination, protocol, Identification)-tuple may be active 128 in the Internet at a given time. Due to the escalating deployment of 129 high-speed links (e.g., 1Gbps Ethernet), however, this number may 130 soon become too small by several orders of magnitude. 
Furthermore, 131 there are many well-known limitations pertaining to IPv4 132 fragmentation and reassembly - even to the point that it has been 133 deemed "harmful" in both classic and modern-day studies (cited 134 above). In particular, IPv4 fragmentation raises issues ranging from 135 minor annoyances (e.g., slow-path processing in routers) to the 136 potential for major integrity issues (e.g., mis-association of the 137 fragments of multiple IP packets during reassembly). 139 As a result of these perceived limitations, a fragmentation-avoiding 140 technique for discovering the MTU of the forward path from a source 141 to a destination node was devised through the deliberations of the 142 Path MTU Discovery Working Group (PMTUDWG) during the late 1980's 143 through early 1990's (see: Appendix A). In this method, the source 144 node provides explicit instructions to routers in the path to discard 145 the packet and return an ICMP error message if an MTU restriction is 146 encountered. However, this approach has several serious shortcomings 147 that lead to an overall "brittleness". 149 In particular, site border routers in the Internet are increasingly 150 being configured to discard ICMP error messages coming from the 151 outside world. This is due in large part to the fact that malicious 152 spoofing of error messages in the Internet is made simple since there 153 is no way to authenticate the source of the messages. Furthermore, 154 when a source node that requires ICMP error message feedback when a 155 packet is dropped due to an MTU restriction does not receive the 156 messages, a path MTU-related black hole occurs. This means that the 157 source will continue to send packets that are too large and never 158 receive an indication from the network that they are being discarded. 160 The issues with both IPv4 fragmentation and this "classical" method 161 of path MTU discovery are exacerbated further when IP-in-IP tunneling 162 is used. 
For example, site border routers that are configured as 163 ingress tunnel endpoints may be required to forward packets into the 164 subnetwork on behalf of hundreds, thousands, or even more original 165 sources located within the site. If IPv4 fragmentation were used, 166 this would quickly wrap the 16-bit Identification field and could 167 lead to undetected data corruption. If "classical" path MTU 168 discovery were used instead, the site border router may be 169 bombarded by ICMP error messages coming from the subnetwork which may 170 be either untrustworthy or insufficiently provisioned to allow 171 translation into error messages to be returned to the original 172 sources. 174 The situation is exacerbated further still by IPsec tunnels, since 175 only the first IPv4 fragment of a fragmented packet contains the 176 transport protocol selectors (e.g., the source and destination ports) 177 required for identifying the correct security association, rendering 178 fragmentation useless under certain circumstances. Even worse, there 179 may be no way for a site border router that configures an IPsec tunnel 180 to transcribe the encrypted packet fragment contained in an ICMP 181 error message into a suitable ICMP error message to return to the 182 original source. Due to these many limitations, a new approach to 183 accommodate links with diverse MTUs is necessary. 185 1.2. Approach 187 For the purpose of this document, subnetworks are defined as virtual 188 topologies that span connected network regions bounded by 189 encapsulating border nodes. Examples include the global Internet 190 interdomain routing core, Mobile Ad hoc Networks (MANETs) and some 191 enterprise networks. 
Subnetwork border nodes forward unicast and 192 multicast IP packets over the virtual topology across multiple IP- 193 and/or sub-IP layer forwarding hops which may introduce packet 194 duplication and/or traverse links with diverse Maximum Transmission 195 Units (MTUs). 197 This document introduces a Subnetwork Encapsulation and Adaptation 198 Layer (SEAL) for tunnel-mode operation of IP over subnetworks that 199 connect the Ingress- and Egress Tunnel Endpoints (ITEs/ETEs) of 200 border nodes. Operation in transport mode is also supported when 201 subnetwork border node upper-layer protocols negotiate the use of 202 SEAL during connection establishment. SEAL accommodates links with 203 diverse MTUs and supports efficient duplicate packet detection by 204 introducing a minimal mid-layer encapsulation. 206 The SEAL encapsulation introduces an extended Identification field 207 for packet identification and a mid-layer segmentation and reassembly 208 capability that allows simplified cutting and pasting of packets. 209 Moreover, SEAL senses in-the-network IPv4 fragmentation as a "noise" 210 indication that packet sizing parameters are "out of tune" with 211 respect to the network path. Instead of experiencing this 212 fragmentation as a disastrous event, however, SEAL naturally tunes 213 its packet sizing parameters to eliminate the in-the-network 214 fragmentation and thereby squelch the noise. The SEAL encapsulation 215 layer and protocol are specified in the following sections. 217 2. Terminology and Requirements 219 The terms "inner", "mid-layer" and "outer" respectively refer to the 220 innermost IP {layer, protocol, header, packet, etc.} before any 221 encapsulation, the mid-layer IP {protocol, header, packet, etc.} 222 after any mid-layer '*' encapsulation and the outermost IP {layer, 223 protocol, header, packet, etc.} after SEAL/*/IPv4 encapsulation. 225 The term "IP" used throughout the document refers to either Internet 226 Protocol version (IPv4 or IPv6). 
Additionally, the notation IPvX/*/ 227 SEAL/*/IPvY refers to an inner IPvX packet encapsulated in any mid- 228 layer '*' encapsulations followed by the SEAL header followed by any 229 outer '*' encapsulations followed by an outer IPvY header, where the 230 notation "IPvX" means either IP protocol version (IPv4 or IPv6). 232 The following abbreviations correspond to terms used within this 233 document and elsewhere in common Internetworking nomenclature: 235 ITE - Ingress Tunnel Endpoint 237 ETE - Egress Tunnel Endpoint 239 PTB - an ICMPv6 "Packet Too Big" or an ICMPv4 "Fragmentation 240 Needed" message 242 DF - the IPv4 header "Don't Fragment" flag 244 MHLEN - the length of any mid-layer '*' headers and trailers 246 OHLEN - the length of the outer encapsulating SEAL/*/IPv4 headers 248 S_MRU- the per-ETE SEAL Maximum Reassembly Unit 250 S_MSS - the SEAL Maximum Segment Size 252 SEAL_ID - a 32-bit Identification value; randomly initialized and 253 monotonically incremented for each SEAL protocol packet 255 SEAL_PROTO - an IPv4 protocol number used for SEAL 257 SEAL_PORT - a TCP/UDP service port number used for SEAL 259 SEAL_OPTION - a TCP option number used for (transport-mode) SEAL 261 The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, 262 SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this 263 document, are to be interpreted as described in [RFC2119]. 265 3. Applicability Statement 267 SEAL was motivated by the specific case of subnetwork abstraction for 268 Mobile Ad-hoc Networks (MANETs), however the domain of applicability 269 also extends to subnetwork abstractions of enterprise networks, the 270 interdomain routing core, etc. The domain of application therefore 271 also includes the map-and-encaps architecture proposals in the IRTF 272 Routing Research Group (RRG) (see: http://www3.tools.ietf.org/group/ 273 irtf/trac/wiki/RoutingResearchGroup). 
275 SEAL introduces a minimal new sublayer for IPvX in IPvY encapsulation 276 (e.g., as IPv6/SEAL/IPv4), and appears as a subnetwork encapsulation 277 as seen by the inner IP layer. SEAL can also be used as a sublayer 278 for encapsulating inner IP packets within outer UDP/IPv4 headers 279 (e.g., as IPv6/SEAL/UDP/IPv4) such as for the Teredo domain of 280 applicability [RFC4380]. When it appears immediately after the outer 281 IPv4 header, the SEAL header is processed exactly as for IPv6 282 extension headers. 284 SEAL can also be used in "transport-mode", e.g., when the inner layer 285 includes upper layer protocol data rather than an encapsulated IP 286 packet. For instance, TCP peers can negotiate the use of SEAL for 287 the carriage of protocol data encapsulated as TCP/SEAL/IPv4. In this 288 sense, the "subnetwork" becomes the entire end-to-end path between 289 the TCP peers and may potentially span the entire Internet. 291 The current document version is specific to the use of IPv4 as the 292 outer encapsulation layer; however, the same principles apply when 293 IPv6 is used as the outer layer. 295 4. SEAL Protocol Specification - Tunnel Mode 297 4.1. Model of Operation 299 SEAL supports the encapsulation of inner IP packets in mid-layer and 300 outer encapsulating headers/trailers. For example, an inner IPv6 301 packet would appear as IPv6/*/SEAL/*/IPv4 after mid-layer and outer 302 encapsulations, where '*' denotes zero or more additional 303 encapsulation sublayers. Ingress Tunnel Endpoints (ITEs) add mid- 304 layer '*' and outer SEAL/*/IPv4 encapsulations to the inner packets 305 they inject into a subnetwork, where the outermost IPv4 header 306 contains the source and destination addresses of the subnetwork 307 entry/exit points (i.e., the ITE/ETE), respectively. SEAL uses a new 308 Internet Protocol type and a new encapsulation sublayer for both 309 unicast and multicast. 
The ITE encapsulates an inner IP packet in 310 mid-layer and outer encapsulations as shown in Figure 1: 312 +-------------------------+ 313 | | 314 ~ Outer */IPv4 headers ~ 315 | | 316 I +-------------------------+ 317 n | SEAL Header | 318 n +-------------------------+ +-------------------------+ 319 e ~ Any mid-layer * headers ~ ~ Any mid-layer * headers ~ 320 r +-------------------------+ +-------------------------+ 321 | | | | 322 I --> ~ Inner IP ~ --> ~ Inner IP ~ 323 P --> ~ Packet ~ --> ~ Packet ~ 324 | | | | 325 P +-------------------------+ +-------------------------+ 326 a ~ Any mid-layer trailers ~ ~ Any mid-layer trailers ~ 327 c +-------------------------+ +-------------------------+ 328 k ~ Any outer trailers ~ 329 e +-------------------------+ 330 t 331 (After mid-layer encaps.) (After SEAL/*/IPv4 encaps.) 333 Figure 1: SEAL Encapsulation 335 where the SEAL header is inserted as follows: 337 o For simple IPvX/IPv4 encapsulations (e.g., 338 [RFC2003][RFC2004][RFC4213]), the SEAL header is inserted between 339 the inner IP and outer IPv4 headers as: IPvX/SEAL/IPv4. 341 o For tunnel-mode IPsec encapsulations over IPv4, [RFC4301], the 342 SEAL header is inserted between the {AH,ESP} header and outer IPv4 343 headers as: IPvX/*/{AH,ESP}/SEAL/IPv4. 345 o For IP encapsulations over transports such as UDP, the SEAL header 346 is inserted immediately after the outer transport layer header, 347 e.g., as IPvX/*/SEAL/UDP/IPv4. 349 SEAL-encapsulated packets include a 32-bit SEAL_ID formed from the 350 concatenation of the 16-bit ID Extension field in the SEAL header as 351 the most-significant bits, and with the 16-bit ID value in the outer 352 IPv4 header as the least-significant bits. (For tunnels that 353 traverse IPv4 Network Address Translators, the SEAL_ID is instead 354 maintained only within the 16-bit ID Extension field in the SEAL 355 header.) 
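As a rough illustration of the concatenation just described (not part of the specification), the following Python sketch composes a 32-bit SEAL_ID from the 16-bit ID Extension and the 16-bit outer IPv4 ID, and splits it again. The function names are hypothetical and chosen only for this example.

```python
def join_seal_id(id_extension: int, ipv4_id: int) -> int:
    """Concatenate the ID Extension (most-significant 16 bits) with the
    outer IPv4 ID (least-significant 16 bits) into a 32-bit SEAL_ID."""
    if not (0 <= id_extension <= 0xFFFF and 0 <= ipv4_id <= 0xFFFF):
        raise ValueError("both fields must fit in 16 bits")
    return (id_extension << 16) | ipv4_id


def split_seal_id(seal_id: int) -> tuple:
    """Return (ID Extension, outer IPv4 ID) for a 32-bit SEAL_ID."""
    if not 0 <= seal_id <= 0xFFFFFFFF:
        raise ValueError("SEAL_ID must fit in 32 bits")
    return seal_id >> 16, seal_id & 0xFFFF
```

For example, join_seal_id(0x1234, 0xABCD) yields 0x1234ABCD, and split_seal_id reverses the operation.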
Routers within the subnetwork use the SEAL_ID for duplicate 356 packet detection, and ITEs/ETEs use the SEAL_ID for SEAL segmentation 357 and reassembly. 359 SEAL enables a multi-level segmentation and reassembly capability. 361 First, the ITE can use IPv4 fragmentation to fragment inner IPv4 362 packets with DF=0 before SEAL encapsulation to avoid lower-level 363 segmentation and reassembly. Secondly, the SEAL layer itself 364 provides a simple mid-layer cutting-and-pasting of mid-layer packets 365 to avoid IPv4 fragmentation on the outer packet. Finally, ordinary 366 IPv4 fragmentation is permitted on the outer packet after SEAL 367 encapsulation and used to detect and dampen any in-the-network 368 fragmentation as quickly as possible. 370 The following sections specify the SEAL-related operations of the 371 ITE and ETE, respectively: 373 4.2. ITE Specification 375 4.2.1. Tunnel Interface MTU 377 The ITE configures a tunnel virtual interface over one or more 378 underlying links that connect the border node to the subnetwork. The 379 tunnel interface must present a fixed MTU to the inner IP layer 380 (i.e., Layer 3) as the size for admission of inner IP packets into 381 the tunnel. Since the tunnel interface may support a potentially 382 large set of ETEs, however, care must be taken in setting a greatest- 383 common-denominator MTU for all ETEs while still upholding end system 384 expectations. 386 Due to the ubiquitous deployment of standard Ethernet and similar 387 networking gear, the nominal Internet cell size has become 1500 388 bytes; this is the de facto size that end systems have come to expect 389 will either be delivered by the network without loss due to an MTU 390 restriction on the path or a suitable PTB message returned. However, 391 the network may not always deliver the necessary PTBs, leading to 392 MTU-related black holes [RFC2923]. 
The ITE therefore requires a 393 means for conveying 1500 byte (or smaller) packets to the ETE without 394 loss due to MTU restrictions and without dependence on PTB messages 395 from within the subnetwork. 397 In common deployments, there may be many forwarding hops between the 398 original source and the ITE. Within those hops, there may be 399 additional encapsulations (IPSec, L2TP, etc.) such that a 1500 byte 400 packet sent by the original source might grow to a larger size by the 401 time it reaches the ITE for encapsulation as an inner IP packet. 402 Similarly, additional encapsulations on the path from the ITE to the 403 ETE could cause the encapsulated packet to become larger still and 404 trigger in-the-network fragmentation. In order to preserve the end 405 system expectations, the ITE therefore requires a means for conveying 406 these larger packets to the ETE even though there may be links within 407 the subnetwork that configure a smaller MTU. 409 The ITE should therefore set a tunnel virtual interface MTU of 1500 410 bytes plus extra room to accommodate any additional encapsulations 411 that may occur on the path from the original source (i.e., even if 412 the underlying links do not support an MTU of this size). The ITE 413 can set larger MTU values still, but should select a value that is 414 not so large as to cause excessive PTBs coming from within the tunnel 415 interface (see: Sections 4.2.2 and 4.2.6). The ITE can also set 416 smaller MTU values, however care must be taken not to set so small a 417 value that original sources would experience an MTU underflow. In 418 particular, IPv6 sources must see a minimum path MTU of 1280 bytes, 419 and IPv4 sources should see a minimum path MTU of 576 bytes. 421 The inner IP layer consults the tunnel interface MTU when admitting a 422 packet into the interface. 
For inner IPv4 packets larger than the 423 tunnel interface MTU and with the IPv4 Don't Fragment (DF) bit set to 424 0, the inner IPv4 layer uses IPv4 fragmentation to break the packet 425 into fragments no larger than the tunnel interface MTU then admits 426 each fragment into the tunnel as an independent packet. For all 427 other inner packets (IPv4 or IPv6), the ITE admits the packet if it 428 is no larger than the tunnel interface MTU; otherwise, it drops the 429 packet and sends an ICMP PTB message with an MTU value of the tunnel 430 interface MTU to the source. 432 4.2.2. Accounting for Headers 434 As for any transport layer protocol, ITEs use the MTU of the 435 underlying IPv4 interface, the length of any mid-layer '*' headers 436 and trailers, and the length of the outer SEAL/*/IPv4 headers to 437 determine the maximum-sized upper layer payload. For example, when 438 the underlying IPv4 interface advertises an MTU of 1500 bytes and the 439 ITE inserts a minimum-length (i.e., 20 byte) IPv4 header, the ITE 440 sees a maximum payload size of 1480 bytes. When the ITE inserts IPv4 441 header options, the size is further reduced by as many as 40 442 additional bytes (the maximum length for IPv4 options) such that as 443 few as 1440 bytes may be available for the upper layer payload. When 444 the ITE inserts additional '*' encapsulations, the available MTU for 445 the upper layer payload is reduced further still. 447 The ITE must additionally account for the length of the SEAL header 448 itself as an extra encapsulation that further reduces the size 449 available for the upper layer payload. The length of the SEAL header 450 is not incorporated in the IPv4 header length, therefore the network 451 does not observe the SEAL header as an IPv4 option. In this way, the 452 SEAL header is inserted after the IPv4 options but before the upper 453 layer payload in exactly the same manner as for IPv6 extension 454 headers. 456 4.2.3. 
Segmentation and Encapsulation 458 For each ETE, the ITE maintains the length of any mid-layer '*' 459 encapsulation headers and trailers (e.g., for '*' = AH, ESP, NULL, 460 etc.) in a variable 'MHLEN' and maintains the length of the outer 461 SEAL/*/IPv4 encapsulation headers in a variable 'OHLEN'. The ITE 462 maintains a SEAL Maximum Segment Size (S_MSS) value for each ETE as 463 soft state within the tunnel interface (e.g., in the IPv4 destination 464 cache). The ITE initializes S_MSS to the minimum of (the underlying 465 IPv4 interface MTU minus OHLEN) and 128 bytes, and decreases or 466 increases S_MSS based on any ICMPv4 Fragmentation Needed messages 467 received (see: Section 4.2.6). The ITE additionally maintains a SEAL 468 Maximum Reassembly Unit (S_MRU) value for each ETE, initialized to a 469 value no larger than 2KB. 471 The ITE performs segmentation and encapsulation on inner packets that 472 have been admitted into the tunnel interface. For inner IPv4 packets 473 with the DF bit set to 0, if the length of the inner packet is larger 474 than (S_MRU - MHLEN) the ITE uses IPv4 fragmentation to break the 475 packet into IPv4 fragments no larger than (S_MRU - MHLEN). For 476 unfragmentable inner packets (e.g., IPv6 packets, IPv4 packets with 477 DF=1, etc.), if the length of the inner packet is larger than 478 (MAX(S_MRU, S_MSS) - MHLEN), the ITE drops the packet and sends an 479 ICMP PTB message with an MTU value of (MAX(S_MRU, S_MSS) - MHLEN) 480 back to the original source. 482 The ITE then encapsulates each inner packet/fragment in the MHLEN 483 bytes of mid-layer '*' headers and trailers. For each such resulting 484 mid-layer packet, if the length of the mid-layer packet is no larger 485 than S_MRU but is larger than S_MSS, the ITE breaks it into N 486 segments (N <= 16) that are no larger than S_MSS bytes each. Each 487 segment except the final one MUST be of equal length, while the final 488 segment MUST be no larger than the initial segment. 
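The segment-sizing rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, every non-final segment is simply cut to exactly S_MSS bytes (one of several sizes that satisfy the rule), and raising an error when more than 16 segments would be needed is a choice made for this sketch, since the text does not say what the ITE does in that case.

```python
MAX_SEGMENTS = 16  # N <= 16, per the text above


def seal_segment(packet: bytes, s_mss: int, s_mru: int) -> list:
    """Split a mid-layer packet into SEAL segments.

    Packets no larger than S_MSS, or larger than S_MRU, are returned
    as a single segment; only packets with S_MSS < length <= S_MRU
    are cut into pieces.
    """
    if s_mss <= 0:
        raise ValueError("S_MSS must be positive")
    if not (s_mss < len(packet) <= s_mru):
        return [packet]
    # Equal-length, back-to-back (non-overlapping) segments of S_MSS
    # bytes; the final segment carries the remainder and is therefore
    # no larger than the first.
    segments = [packet[i:i + s_mss] for i in range(0, len(packet), s_mss)]
    if len(segments) > MAX_SEGMENTS:
        # Assumption for this sketch only; not specified by the draft.
        raise ValueError("packet would need more than 16 segments")
    return segments
```

With S_MSS=100 and S_MRU=2048, a 250-byte mid-layer packet yields segments of 100, 100, and 50 bytes.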
The first byte 489 of each segment MUST begin immediately after the final byte of the 490 previous segment, i.e., the segments MUST NOT overlap. 492 Note that this SEAL segmentation is used only for mid-layer packets 493 that are no larger than S_MRU; mid-layer packets that are larger than 494 S_MRU are instead encapsulated as a single segment. Note also that 495 this SEAL segmentation ignores the fact that the mid-layer packet may 496 be unfragmentable. This segmentation process is a mid-layer (not an 497 IP layer) operation employed by the ITE to adapt the mid-layer packet 498 to the subnetwork path characteristics, and the ETE will restore the 499 packet to its original form during decapsulation. Therefore, the 500 fact that the packet may have been segmented within the subnetwork is 501 not observable after decapsulation. 503 The ITE next encapsulates each segment in a SEAL header formatted as 504 follows: 506 0 1 2 3 507 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 508 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 509 | ID Extension |P|R|D|M|Segment| Next Header | 510 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 512 Figure 2: SEAL Header Format 514 where the header fields are defined as follows: 516 ID Extension (16) 517 a 16-bit extension of the ID field in the outer IPv4 header; 518 encodes the most-significant 16 bits of a 32 bit SEAL_ID value. 520 P (1) 521 the "Probe" bit. Set to 1 if the ITE wishes to receive an 522 explicit acknowledgement from the ETE. 524 R (1) 525 the "Report Fragmentation" bit. Set to 1 if the ITE wishes to 526 receive a report from the ETE if any IPv4 fragmentation occurs. 528 D (1) 529 the "Don't Reassemble" bit. Set to 1 if the reassembled SEAL 530 protocol packet is to be discarded by the ETE if any IPv4 531 reassembly is required. 533 M (1) 534 the "More Segments" bit. 
Set to 1 if this SEAL protocol packet 535 contains a non-final segment of a multi-segment mid-layer packet. 537 Segment (4) 538 a 4-bit Segment number. Encodes a segment number between 0 - 15. 540 Next Header (8) an 8-bit field that encodes an Internet Protocol 541 number the same as for the IPv4 protocol and IPv6 next header 542 fields. 544 For single-segment mid-layer packets, the ITE encapsulates the 545 segment in a SEAL header with (M=0; Segment=0). For N-segment mid- 546 layer packets (N <= 16), the ITE encapsulates each segment in a SEAL 547 header with (M=1; Segment=0) for the first segment, (M=1; Segment=1) 548 for the second segment, etc., with the final segment setting (M=0; 549 Segment=N-1). For each encapsulated segment, the ITE sets D=0 in the 550 SEAL header if the ETE is to accept the packet even if it arrives as 551 multiple IPv4 fragments; for example, the ITE may set D=0 in the SEAL 552 header of each segment for all mid-layer packets no larger than 553 S_MRU. The ITE instead sets D=1 in the SEAL header if the ETE is to 554 discard the packet if it arrives as multiple IPv4 fragments; in 555 particular, the ITE should set D=1 in the SEAL header of each segment 556 for all mid-layer packets larger than S_MRU. 558 The ITE next sets the P and R bits in the SEAL header of each segment 559 according to whether the packet is to be used as an explicit/implicit 560 probe as specified in Section 4.2.4, then writes the Internet 561 Protocol number corresponding to the mid-layer packet in the SEAL 562 'Next Header' field. Next, the ITE encapsulates the segment in the 563 requisite */IPv4 outer headers according to the specific 564 encapsulation format (e.g., [RFC2003], [RFC4213], [RFC4380], etc.), 565 except that it writes 'SEAL_PROTO' in the protocol field of the outer 566 IPv4 header (when simple IPv4 encapsulation is used) or writes 567 'SEAL_PORT' in the outer destination service port field (e.g., when 568 UDP/IPv4 encapsulation is used). 
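The header layout of Figure 2 and the (M, Segment) numbering described above can be illustrated with a short Python sketch. It is not a reference implementation: the function names are hypothetical, and it assumes the 32-bit header is transmitted in network byte order with the bit positions read directly from the figure.

```python
import struct


def pack_seal_header(id_ext, p, r, d, m, segment, next_header):
    """Build the 4-byte SEAL header:
    ID Extension (16) | P R D M (1 bit each) | Segment (4) | Next Header (8)."""
    if not 0 <= id_ext <= 0xFFFF:
        raise ValueError("ID Extension must fit in 16 bits")
    if not 0 <= segment <= 15:
        raise ValueError("Segment must be in the range 0-15")
    if not 0 <= next_header <= 0xFF:
        raise ValueError("Next Header must fit in 8 bits")
    flags = (bool(p) << 3) | (bool(r) << 2) | (bool(d) << 1) | bool(m)
    return struct.pack("!HBB", id_ext, (flags << 4) | segment, next_header)


def unpack_seal_header(data):
    """Parse the first 4 bytes of data into a dict of SEAL header fields."""
    id_ext, flags_seg, next_header = struct.unpack_from("!HBB", data)
    return {
        "id_ext": id_ext,
        "p": (flags_seg >> 7) & 1,
        "r": (flags_seg >> 6) & 1,
        "d": (flags_seg >> 5) & 1,
        "m": (flags_seg >> 4) & 1,
        "segment": flags_seg & 0x0F,
        "next_header": next_header,
    }


def segment_markers(n):
    """Return the (M, Segment) pair for each of n segments, in order:
    M=1 for every segment except the last, which carries M=0."""
    if not 1 <= n <= 16:
        raise ValueError("n must be in the range 1-16")
    return [(1 if i < n - 1 else 0, i) for i in range(n)]
```

For a three-segment packet, segment_markers(3) gives (1, 0), (1, 1), (0, 2); a single-segment packet gives (0, 0).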
The ITE finally sets packet 569 identification values as specified in Section 4.2.5 and sends the 570 packets as specified in Section 4.2.6. 572 4.2.4. Sending Probes 574 When S_MSS is larger than 128, the ITE sends ordinary encapsulated 575 data packets as implicit probes to detect in-the-network IPv4 576 fragmentation and to determine new values for S_MSS. The ITE sets 577 R=1 in the SEAL header and DF=0 in the outer IPv4 header of each 578 segment of a SEAL-segmented packet to be used as an implicit probe, 579 and will receive ICMPv4 Fragmentation Needed messages from the ETE if 580 any IPv4 fragmentation occurs. When S_MSS is no larger than 128, the 581 ITE instead sets R=0 in the SEAL header to avoid generating 582 fragmentation reports for unavoidable in-the-network fragmentation. 584 The ITE should send explicit probes periodically to manage a window 585 of SEAL_IDs of outstanding probes as a means to validate any ICMPv4 586 messages it receives. The ITE sets P=1 in the SEAL header of each 587 segment of a SEAL-segmented packet to be used as an explicit probe, 588 where the probe can be either an ordinary data packet or a NULL 589 packet created by setting the 'Next Header' field in the SEAL header 590 to a value of "No Next Header" (see: [RFC2460], Section 4.7). 592 The ITE should further send explicit probes periodically to detect 593 increases in S_MSS by resetting S_MSS to the minimum of (the 594 underlying IPv4 interface MTU minus OHLEN) and 128 bytes, and/or 595 sending explicit probes that are larger than the current S_MSS. 597 4.2.5. Packet Identification 599 For the purpose of packet identification, the ITE maintains a 32-bit 600 SEAL_ID value as per-ETE soft state, e.g. in the IPv4 destination 601 cache. The ITE randomly-initializes SEAL_ID when the soft state is 602 created and monotonically increments it (modulo 2^32) for each 603 successive SEAL protocol packet it sends to the ETE.
For each 604 packet, the ITE writes the least-significant 16 bits of the SEAL_ID 605 value in the ID field in the outer IPv4 header, and writes the most- 606 significant 16 bits in the ID Extension field in the SEAL header. 608 For packets that may traverse IPv4 Network Address Translators 609 (NATs), the ITE instead maintains SEAL_ID as a 16-bit value that it 610 randomly-initializes when the soft state is created and monotonically 611 increments (modulo 2^16) for each successive SEAL protocol packet. 612 For each packet, the ITE writes SEAL_ID in the ID Extension field of 613 the SEAL header and writes a random 16-bit value in the ID field in 614 the outer IPv4 header. This requires that both the ITE and ETE 615 participate in this alternate scheme. 617 4.2.6. Sending SEAL Protocol Packets 619 Following SEAL segmentation and encapsulation, the ITE sets DF=0 in 620 the outer IPv4 header of every outer packet it sends. For 621 "expendable" packets (e.g., for NULL packets used as probes - see: 622 Section 4.2.4), the ITE may optionally set DF=1. 624 The ITE then sends each outer packet that encapsulates a segment of 625 the same mid-layer packet into the tunnel in canonical order, i.e., 626 Segment 0 first, followed by Segment 1, etc. and finally Segment N-1. 628 4.2.7. Processing Raw ICMPv4 Messages 630 The ITE may receive "raw" ICMPv4 error messages from routers within 631 the subnetwork that comprise an outer IPv4 header followed by an 632 ICMPv4 header followed by a portion of the SEAL packet that generated 633 the error (also known as the "packet-in-error"). For such messages, 634 the ITE can use the 32-bit SEAL ID encoded in the packet-in-error as 635 a nonce to confirm that the ICMP message came from an on-path router 636 within the subnetwork.
The ITE MAY process raw ICMPv4 messages as 637 soft errors indicating that the path to the ETE may be failing, but 638 it discards any raw ICMPv4 Fragmentation Needed messages for which 639 the IPv4 header of the packet-in-error has DF=0. 641 4.2.8. Processing SEAL-Encapsulated ICMPv4 Messages 643 In addition to any raw ICMPv4 messages, the ITE may receive SEAL- 644 encapsulated ICMPv4 messages from subnetwork border nodes that 645 comprise outer ICMPv4/*/SEAL/*/IPv4 headers followed by a portion of 646 the SEAL-encapsulated packet-in-error. The ITE can use the 32-bit 647 SEAL ID encoded in the packet-in-error as well as information in the 648 outer IPv4 and SEAL headers as nonces to confirm that the ICMP 649 message came from a legitimate ETE. The ITE then verifies that the 650 SEAL_ID encoded in the packet-in-error is within the current window 651 of transmitted SEAL_IDs for this ETE. If the SEAL_ID is outside of 652 the window, the ITE discards the message; otherwise, it advances the 653 window and processes the message. 655 The ITE processes SEAL-encapsulated ICMPv4 messages other than ICMPv4 656 Fragmentation Needed exactly as specified in [RFC0792]. For SEAL- 657 encapsulated ICMPv4 Fragmentation Needed messages, if the IPv4 length 658 of the packet-in-error minus OHLEN is larger than S_MSS the ITE sets 659 S_MSS to this new value. For SEAL-encapsulated ICMPv4 Fragmentation 660 Needed messages with MF=1 in the IPv4 header of the packet-in-error, 661 the ITE instead sets S_MSS to this new value if the value is no 662 smaller than (576 - OHLEN) and sets S_MSS to MAX(S_MSS/2, 128) if the 663 value is smaller than (576 - OHLEN). 665 Note that in the above, 576 accounts for the nominal minimum MTU for 666 common IPv4 links. 
When an ETE returns a packet-in-error with MF=1 667 and with length smaller than 576, the ITE performs a "limited 668 halving" of S_MSS to account for IPv4 links with unusually small MTUs 669 or cases in which the ETE otherwise receives an undersized IPv4 670 first-fragment. This limited halving may require multiple iterations 671 of sending probes and receiving ICMPv4 Fragmentation Needed messages, 672 but will soon converge to a stable S_MSS value. When performing this 673 limited halving, it is important that the ITE adjust its S_MSS size 674 based on the first ICMPv4 Fragmentation Needed message and refrain 675 from reducing S_MSS until ICMPv4 Fragmentation Needed messages 676 pertaining to packets sent under the new S_MSS are received. For 677 example, the ITE should not halve the S_MSS repeatedly based on a 678 flurry of ICMPv4 Fragmentation Needed messages all pertaining to 679 packets sent under the same S_MSS. 681 After determining a new value for S_MSS, if the IPv4 header of the 682 packet-in-error has MF=1 and its SEAL header has D=1 the ITE MAY 683 transcribe the message into an ICMP PTB message to send back to the 684 original source. To do so, the ITE discards the SEAL/*/IPv4 headers 685 plus any mid-layer '*' headers/trailers of the packet-in-error then 686 encapsulates the remaining inner IP packet portion in a PTB message 687 with the MTU field set to (MAX(S_MRU, S_MSS) - MHLEN). Note that 688 this may not be possible when the inner IP packet portion was 689 encrypted (e.g. via IPsec/ESP), and is otherwise not entirely 690 necessary since the ITE will discard subsequent large packets and 691 send back an ICMP PTB *before* encapsulating them and sending to the 692 ETE. Transcribing ICMPv4 Fragmentation Needed messages into ICMP 693 PTBs is therefore offered only as an optional optimization. 695 4.3. ETE Specification 697 4.3.1.
Reassembly Buffer Requirements 699 ETEs MUST be capable of using IPv4-layer reassembly to reassemble 700 SEAL protocol outer IPv4 packets of at least (2KB + OHLEN) and MUST 701 also be capable of using SEAL-layer reassembly to reassemble mid- 702 layer packets of at least (2KB + OHLEN). The term OHLEN is included 703 to account for the length of the SEAL/*/IPv4 header, which must be 704 retained for the purpose of associating the fragments/segments of the 705 same packet. Note that the term S_MRU used in Section 4.2 omits 706 OHLEN for the purpose of specification clarity. 708 4.3.2. IPv4-Layer Reassembly 710 The ETE performs IPv4 reassembly as-normal, and should maintain a 711 conservative high- and low-water mark for the number of outstanding 712 reassemblies pending for each ITE. When the size of the reassembly 713 buffer exceeds this high-water mark, the ETE actively discards 714 incomplete reassemblies (e.g., using an Active Queue Management (AQM) 715 strategy) until the size falls below the low-water mark. The ETE 716 should also use a reduced IPv4 maximum segment lifetime value (e.g., 717 15 seconds), i.e., the time after which it will discard an incomplete 718 IPv4 reassembly for a SEAL protocol packet. 720 After reassembly, the ETE either accepts or discards the reassembled 721 packet based on the current status of the IPv4 reassembly cache 722 (congested vs uncongested). The SEAL_ID included in the IPv4 first- 723 fragment provides an additional level of reassembly assurance, since 724 it can record a distinct arrival timestamp useful for associating the 725 first-fragment with its corresponding non-initial fragments.
The 726 choice of accepting/discarding a reassembly may also depend on the 727 strength of the upper-layer integrity check if known (e.g., IPsec/ESP 728 provides a strong upper-layer integrity check) and/or the corruption 729 tolerance of the data (e.g., multicast streaming audio/video may be 730 more corruption-tolerant than file transfer, etc.). In the limiting 731 case, the ETE may choose to discard all IPv4 reassemblies and process 732 only the IPv4 first-fragment for SEAL-encapsulated error generation 733 purposes (see the following sections). 735 4.3.3. Generating SEAL-Encapsulated ICMPv4 Fragmentation Needed 736 Messages 738 During IPv4-layer reassembly, the ETE determines whether the packet 739 belongs to the SEAL protocol by checking for SEAL_PROTO in the outer 740 IPv4 header (i.e., for simple IPv4 encapsulation) or for SEAL_PORT in 741 the outer */IPv4 header (e.g., for '*'=UDP). When the ETE processes 742 the IPv4 first-fragment (i.e., one with MF=1 and Offset=0 in the IPv4 743 header) of a SEAL protocol IPv4 packet with (R=1; Segment=0) in the 744 SEAL header, it sends a SEAL-encapsulated ICMPv4 Fragmentation Needed 745 message back to the ITE with the MTU value set to 0. 747 When the ETE processes a SEAL protocol IPv4 packet with (P=1; 748 Segment=0) for which no IPv4 reassembly was required, it sends a 749 SEAL-encapsulated ICMPv4 Fragmentation Needed message back to the ITE 750 with the MTU value set to 0. Note therefore that when the P bit is 751 set the R bit is "don't-care" and that the ETE only sends a single 752 ICMPv4 Fragmentation Needed message, i.e., it does not send two 753 separate messages (one for the first fragment and a second for the 754 reassembled whole IPv4 packet).
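The reporting rules of Section 4.3.3 can be sketched as a small decision function. This is an illustrative sketch only, not part of the specification: the names SealFlags and should_report are hypothetical, and the treatment of a fragmented packet with P=1 and R=0 (reported once, on the first fragment) is one reading of the "don't-care" note above. A first-fragment is taken to be an IPv4 packet with MF=1 and Offset=0; non-initial fragments carry no SEAL header and are modeled by passing None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SealFlags:
    """Hypothetical container for the SEAL header bits used by this check."""
    p: bool        # "Probe" bit
    r: bool        # "Report Fragmentation" bit
    segment: int   # 4-bit Segment number


def should_report(mf: bool, offset: int, seal: Optional[SealFlags]) -> bool:
    """Return True if the ETE would send one Fragmentation Needed report.

    mf and offset come from the outer IPv4 header; seal is None for
    non-initial fragments, which do not contain the SEAL header.
    """
    if seal is None or offset != 0 or seal.segment != 0:
        return False
    if mf:
        # IPv4 first-fragment: report if R=1. Assumption: P=1 implies
        # the same report, since R is "don't-care" when P is set.
        return seal.r or seal.p
    # Whole packet, no IPv4 reassembly required: report only for probes.
    return seal.p
```

Because each outer IPv4 packet is seen either as a first-fragment or as an unfragmented whole, the function yields at most one report per packet, which matches the single-message rule stated above.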
756 The ETE prepares the ICMPv4 Fragmentation Needed message by 757 encapsulating as much of the first fragment (or the whole IPv4 758 packet) as possible in outer */SEAL/*/IPv4 headers without the length 759 of the message exceeding 576 bytes as shown in Figure 3: 761 +-------------------------+ - 762 | | \ 763 ~ Outer */SEAL/*/IPv4 hdrs~ | 764 | | | 765 +-------------------------+ | 766 | ICMPv4 Header | | 767 |(Dest Unreach; Frag Need)| | 768 +-------------------------+ | 769 | | > Up to 576 bytes 770 ~ IP/*/SEAL/*/IPv4 ~ | 771 ~ hdrs of packet/fragment ~ | 772 | | | 773 +-------------------------+ | 774 | | | 775 ~ Data of packet/fragment ~ | 776 | | / 777 +-------------------------+ - 779 Figure 3: SEAL-encapsulated ICMPv4 Fragmentation Needed Message 781 The ETE next sets D=0, P=0, R=0, M=0 and Segment=0 in the outer SEAL 782 header, sets the SEAL_ID the same as for any SEAL packet, then sets 783 the SEAL Next Header field and the fields of the outer */IPv4 headers 784 the same as for ordinary SEAL encapsulation. The ETE then sets the outer 785 IPv4 destination address to the source address of the first-fragment 786 and sets the outer IPv4 source address to the destination address of 787 the first-fragment. If the destination address in the first-fragment 788 was multicast, the ETE instead sets the outer IPv4 source address to 789 an address assigned to the underlying IPv4 interface. The ETE 790 finally sends the SEAL-encapsulated ICMPv4 message to the ITE the 791 same as specified in Section 4.2.5, except that the ETE may send the 792 messages subject to rate limiting since it is not entirely critical 793 that all fragmentation be reported to the ITE. 795 4.3.4. SEAL-Layer Reassembly 797 Following IPv4 reassembly of a SEAL protocol packet, the ETE adds the 798 SEAL packet to a SEAL-Layer pending-reassembly queue (if necessary).
799 If the packet arrived as multiple IPv4 fragments and with D=1 in the 800 SEAL header, the ETE marks the packet and/or pending reassembly queue 801 as "discard following reassembly". The ETE also marks the packet as 802 "discard following reassembly" if the (Next Header, P, R, D) fields 803 of the packet's SEAL header differ from their respective values in 804 other SEAL segments already in the queue, i.e., the (Next Header, P, 805 R, D)-tuple serves as a reassembly nonce. 807 The ETE performs SEAL-layer reassembly for multi-segment mid-layer 808 packets through simple in-order concatenation of the encapsulated 809 segments from N consecutive SEAL protocol packets from the same mid- 810 layer packet. SEAL-layer reassembly requires the ETE to maintain a 811 cache of recently received SEAL packet segments for a hold time that 812 would allow for reasonable inter-segment delays. The ETE uses a SEAL 813 maximum segment lifetime of 15 seconds for this purpose, i.e., the 814 time after which it will discard an incomplete reassembly. However, 815 the ETE should also actively discard any pending reassemblies that 816 clearly have no opportunity for completion, e.g., when a considerable 817 number of new SEAL packets have been received before a packet that 818 completes a pending reassembly has arrived. 820 The ETE reassembles the mid-layer packet segments in SEAL protocol 821 packets that contain Segment numbers 0 through N-1, with M=1/0 in 822 non-final/final segments, respectively, and with consecutive SEAL_ID 823 values. That is, for an N-segment mid-layer packet, reassembly 824 entails the concatenation of the SEAL-encapsulated segments with 825 (Segment 0, SEAL_ID i), followed by (Segment 1, SEAL_ID ((i + 1) mod 826 2^32)), etc. up to (Segment N-1, SEAL_ID ((i + N-1) mod 2^32)). (For 827 tunnels that may traverse IPv4 NATs, the ETE instead uses only a 16- 828 bit SEAL_ID value, and uses mod 2^16 arithmetic to associate the 829 segments of the same packet.) 
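The SEAL-layer reassembly checks of Section 4.3.4 can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the SealPacket type, its field names and the reassemble function are hypothetical, the caller is assumed to have already grouped the candidate segments of one mid-layer packet, and queue management and the 15-second lifetime are left out. Returning None stands for both "incomplete" and "discard following reassembly".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SealPacket:
    """Hypothetical view of one SEAL protocol packet after IPv4 reassembly."""
    seal_id: int              # SEAL_ID (32-bit, or 16-bit across NATs)
    segment: int              # 4-bit Segment number
    more: bool                # M bit: True in non-final segments
    next_header: int
    p: bool
    r: bool
    d: bool
    payload: bytes
    fragmented: bool = False  # arrived as multiple IPv4 fragments


def reassemble(packets, id_bits=32):
    """Concatenate segments 0..N-1 in order, or return None on any mismatch."""
    if not packets:
        return None
    ordered = sorted(packets, key=lambda pkt: pkt.segment)
    first = ordered[0]
    nonce = (first.next_header, first.p, first.r, first.d)
    modulus = 1 << id_bits
    for index, pkt in enumerate(ordered):
        is_final = index == len(ordered) - 1
        if pkt.segment != index:
            return None  # missing or duplicated Segment number
        if pkt.more == is_final:
            return None  # M must be 1 in non-final and 0 in final segments
        if pkt.seal_id != (first.seal_id + index) % modulus:
            return None  # SEAL_IDs must be consecutive (mod 2^id_bits)
        if (pkt.next_header, pkt.p, pkt.r, pkt.d) != nonce:
            return None  # (Next Header, P, R, D) reassembly nonce differs
        if pkt.d and pkt.fragmented:
            return None  # D=1 and IPv4 fragmentation: discard
    return b"".join(pkt.payload for pkt in ordered)
```

For tunnels that may traverse IPv4 NATs, the same sketch applies with id_bits=16, so that the SEAL_ID comparison uses mod 2^16 arithmetic.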
831 4.3.5. Decapsulation and Generating Other ICMPv4 Errors 833 Following SEAL-layer reassembly, if the packet had the value "No Next 834 Header" in the SEAL header's Next Header field, or if the packet was 835 marked "discard following reassembly" and IPv4 fragmentation was 836 experienced, the ETE silently discards the reassembled mid-layer 837 packet. Otherwise, the ETE decapsulates the inner packet and 838 processes it as normal. If the ETE determines that the decapsulated 839 inner packet cannot be processed further it drops the packet, 840 prepares an appropriate SEAL-encapsulated ICMPv4 error message and 841 sends the error message back to the ITE exactly as for ICMPv4 842 Fragmentation Needed messages (see: Section 4.3.3). 844 5. SEAL Protocol Specification - Transport Mode 846 Section 4 specifies the operation of SEAL in "tunnel mode", i.e., 847 when there is both an inner and outer IP layer and with a SEAL 848 encapsulation layer between. However, the SEAL protocol can also be 849 used in a "transport mode" of operation in which the inner layer 850 corresponds to a transport layer protocol (e.g., UDP, TCP, etc.) 851 instead of an inner IP layer. 853 For example, two TCP endpoints connected to the same subnetwork 854 region can negotiate the use of transport-mode SEAL for a connection 855 by inserting a 'SEAL_OPTION' TCP option during the connection 856 establishment phase. If both TCPs agree on the use of SEAL, their 857 protocol messages will be carried as TCP/SEAL/IPv4 and the 858 connection will be serviced by the SEAL protocol using TCP (instead of 859 an encapsulating tunnel endpoint) as the transport layer protocol. 860 The SEAL protocol for transport mode otherwise observes the same 861 specifications as for Section 4. 863 6. Link Requirements 865 Subnetwork designers are strongly encouraged to follow the 866 recommendations in [RFC3819] when configuring link MTUs, where all 867 IPv4 links SHOULD configure a minimum MTU of 576 bytes.
Links that 868 cannot configure an MTU of at least 576 bytes (e.g., due to 869 performance characteristics) SHOULD implement transparent link-layer 870 segmentation and reassembly such that an MTU of at least 576 can 871 still be presented to the IPv4 layer. 873 7. End System Requirements 875 SEAL provides robust mechanisms for returning PTB messages to the 876 original source, however end systems that send unfragmentable IP 877 packets larger than 1500 bytes are strongly encouraged to use 878 Packetization Layer Path MTU Discovery per [RFC4821]. 880 8. Router Requirements 882 IPv4 routers within the subnetwork are strongly encouraged to 883 implement IPv4 fragmentation such that the first fragment is the 884 largest and approximately the size of the underlying link MTU. 886 9. IANA Considerations 888 SEAL_PROTO, SEAL_PORT and SEAL_OPTION are taken from their respective 889 range of experimental values documented in [RFC3692][RFC4727]. These 890 values are for experimentation purposes only, and not to be used for 891 any kind of deployments (i.e., they are not to be shipped in any 892 products). This document therefore has no actions for IANA. 894 10. Security Considerations 896 Unlike IPv4 fragmentation, overlapping fragment attacks are not 897 possible due to the requirement that SEAL segments be non- 898 overlapping. 900 An amplification/reflection attack is possible when an attacker sends 901 IPv4 first-fragments with spoofed source addresses to an ETE, 902 resulting in a stream of ICMPv4 Fragmentation Needed messages 903 returned to a victim ITE. The encapsulated segment of the spoofed 904 IPv4 first-fragment provides mitigation for the ITE to detect and 905 discard spurious ICMPv4 Fragmentation Needed messages. 907 The SEAL header is sent in-the-clear (outside of any IPsec/ESP 908 encapsulations) the same as for the IPv4 header. 
As for IPv6 909 extension headers, the SEAL header is protected only by L2 integrity 910 checks and is not covered under any L3 integrity checks. 912 11. Related Work 914 Section 3.1.7 of [RFC2764] provides a high-level sketch for 915 supporting large tunnel MTUs via a tunnel-level segmentation and 916 reassembly capability to avoid IP level fragmentation, which is in 917 part the same approach used by tunnel-mode SEAL. SEAL could 918 therefore be considered as a fully-functioned manifestation of the 919 method postulated by that informational reference, however SEAL also 920 supports other modes of operation including transport-mode and 921 duplicate packet detection. 923 Section 3 of [RFC4459] describes inner and outer fragmentation at the 924 tunnel endpoints as alternatives for accommodating the tunnel MTU, 925 however the SEAL protocol specifies a mid-layer segmentation and 926 reassembly capability that is distinct from both inner and outer 927 fragmentation. 929 Section 4 of [RFC2460] specifies a method for inserting and 930 processing extension headers between the base IPv6 header and 931 transport layer protocol data. The SEAL header is in fact inserted 932 and processed in exactly the same manner. 934 The concepts of path MTU determination through the report of 935 fragmentation and extending the IP Identification field were first 936 proposed in deliberations of the TCP-IP mailing list and the Path MTU 937 Discovery Working Group (MTUDWG) during the late 1980's and early 938 1990's. SEAL supports a report fragmentation capability using bits 939 in an extension header (the original proposal used a spare bit in the 940 IP header) and supports ID extension through a 16-bit field in an 941 extension header (the original proposal used a new IP option). An 942 historical analysis of the evolution of these concepts as well as the 943 development of the eventual path MTU discovery mechanism for IP 944 appears in Appendix A of this document. 946 12.
Acknowledgments 948 The following individuals are acknowledged for helpful comments and 949 suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Teco Boot, 950 Bob Braden, Brian Carpenter, Steve Casner, Ian Chakeres, Remi Denis- 951 Courmont, Aurnaud Ebalard, Gorry Fairhurst, Joel Halpern, John 952 Heffner, Bob Hinden, Christian Huitema, Joe Macker, Matt Mathis, Dan 953 Romascanu, Dave Thaler, Joe Touch, Magnus Westerlund, Robin Whittle, 954 James Woodyatt and members of the Boeing PhantomWorks DC&NT group. 956 Path MTU determination through the report of fragmentation was first 957 proposed by Charles Lynn on the TCP-IP mailing list in 1987. 958 Extending the IP identification field was first proposed by Steve 959 Deering on the MTUDWG mailing list in 1989. 961 13. References 963 13.1. Normative References 965 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 966 September 1981. 968 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, 969 RFC 792, September 1981. 971 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 972 Requirement Levels", BCP 14, RFC 2119, March 1997. 974 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 975 (IPv6) Specification", RFC 2460, December 1998. 977 13.2. Informative References 979 [FOLK] C, C., D, D., and k. k, "Beyond Folklore: Observations on 980 Fragmented Traffic", December 2002. 982 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 983 October 1987. 985 [MTUDWG] "IETF MTU Discovery Working Group mailing list, 986 gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log, November 987 1989 - February 1995.". 989 [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP 990 MTU discovery options", RFC 1063, July 1988. 992 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 993 November 1990. 995 [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery 996 for IP version 6", RFC 1981, August 1996. 
998 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, 999 October 1996. 1001 [RFC2004] Perkins, C., "Minimal Encapsulation within IP", RFC 2004, 1002 October 1996. 1004 [RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A. 1005 Malis, "A Framework for IP Based Virtual Private 1006 Networks", RFC 2764, February 2000. 1008 [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", 1009 RFC 2923, September 2000. 1011 [RFC3692] Narten, T., "Assigning Experimental and Testing Numbers 1012 Considered Useful", BCP 82, RFC 3692, January 2004. 1014 [RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D., 1015 Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. 1016 Wood, "Advice for Internet Subnetwork Designers", BCP 89, 1017 RFC 3819, July 2004. 1019 [RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms 1020 for IPv6 Hosts and Routers", RFC 4213, October 2005. 1022 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 1023 Internet Protocol", RFC 4301, December 2005. 1025 [RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through 1026 Network Address Translations (NATs)", RFC 4380, 1027 February 2006. 1029 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1030 Network Tunneling", RFC 4459, April 2006. 1032 [RFC4727] Fenner, B., "Experimental Values In IPv4, IPv6, ICMPv4, 1033 ICMPv6, UDP, and TCP Headers", RFC 4727, November 2006. 1035 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 1036 Discovery", RFC 4821, March 2007. 1038 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 1039 Errors at High Data Rates", RFC 4963, July 2007. 1041 [TCP-IP] "TCP-IP mailing list archives, 1042 http://www-mice.cs.ucl.ac.uk/multimedia/mist/tcpip, May 1043 1987 - May 1990.". 1045 Appendix A. 
Historic Evolution of PMTUD 1047 (Taken from 'draft-templin-v6v4-ndisc-01.txt'; written 10/30/2002): 1049 The topic of Path MTU discovery (PMTUD) saw a flurry of discussion 1050 and numerous proposals in the late 1980's through early 1990. The 1051 initial problem was posed by Art Berggreen on May 22, 1987 in a 1052 message to the TCP-IP discussion group [TCP-IP]. The discussion that 1053 followed provided significant reference material for [FRAG]. An IETF 1054 Path MTU Discovery Working Group [MTUDWG] was formed in late 1989 1055 with charter to produce an RFC. Several variations on a very few 1056 basic proposals were entertained, including: 1058 1. Routers record the PMTUD estimate in ICMP-like path probe 1059 messages (proposed in [FRAG] and later [RFC1063]) 1061 2. The destination reports any fragmentation that occurs for packets 1062 received with the "RF" (Report Fragmentation) bit set (Steve 1063 Deering's 1989 adaptation of Charles Lynn's Nov. 1987 proposal) 1065 3. A hybrid combination of 1) and Charles Lynn's Nov. 1987 proposal 1066 (straw RFC draft by McCloughrie, Fox and Mogul on Jan 12, 1990) 1068 4. Combination of the Lynn proposal with TCP (Fred Bohle, Jan 30, 1069 1990) 1071 5. Fragmentation avoidance by setting "IP_DF" flag on all packets 1072 and retransmitting if ICMPv4 "fragmentation needed" messages 1073 occur (Geof Cooper's 1987 proposal; later adapted into [RFC1191] 1074 by Mogul and Deering). 1076 Option 1) seemed attractive to the group at the time, since it was 1077 believed that routers would migrate more quickly than hosts. Option 1078 2) was a strong contender, but repeated attempts to secure an "RF" 1079 bit in the IPv4 header from the IESG failed and the proponents became 1080 discouraged. 3) was abandoned because it was perceived as too 1081 complicated, and 4) never received any apparent serious 1082 consideration. Proposal 5) was a late entry into the discussion from 1083 Steve Deering on Feb. 24th, 1990. 
The discussion group soon 1084 thereafter seemingly lost track of all other proposals and adopted 1085 5), which eventually evolved into [RFC1191] and later [RFC1981]. 1087 In retrospect, the "RF" bit postulated in 2) is not needed if a 1088 "contract" is first established between the peers, as in proposal 4) 1089 and a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU on 1090 Feb 19, 1990. These proposals saw little discussion or rebuttal, and 1091 were dismissed based on the following assertions: 1093 o routers upgrade their software faster than hosts 1095 o PCs could not reassemble fragmented packets 1097 o Proteon and Wellfleet routers did not reproduce the "RF" bit 1098 properly in fragmented packets 1100 o Ethernet-FDDI bridges would need to perform fragmentation (i.e., 1101 "translucent" not "transparent" bridging) 1103 o the 16-bit IP_ID field could wrap around and disrupt reassembly at 1104 high packet arrival rates 1106 The first four assertions, although perhaps valid at the time, have 1107 been overcome by historical events leaving only the final to 1108 consider. But, [FOLK] has shown that IP_ID wraparound simply does 1109 not occur within several orders of magnitude of the reassembly timeout 1110 window on high-bandwidth networks. 1112 (Author's 2/11/08 note: this final point was based on a loose 1113 interpretation of [FOLK], and is more accurately addressed in 1114 [RFC4963].) 1116 Author's Address 1118 Fred L. Templin (editor) 1119 Boeing Phantom Works 1120 P.O. Box 3707 1121 Seattle, WA 98124 1122 USA 1124 Email: fltemplin@acm.org 1126 Full Copyright Statement 1128 Copyright (C) The IETF Trust (2008). 1130 This document is subject to the rights, licenses and restrictions 1131 contained in BCP 78, and except as set forth therein, the authors 1132 retain all their rights.
1134 This document and the information contained herein are provided on an 1135 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 1136 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 1137 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 1138 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 1139 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 1140 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 1142 Intellectual Property 1144 The IETF takes no position regarding the validity or scope of any 1145 Intellectual Property Rights or other rights that might be claimed to 1146 pertain to the implementation or use of the technology described in 1147 this document or the extent to which any license under such rights 1148 might or might not be available; nor does it represent that it has 1149 made any independent effort to identify any such rights. Information 1150 on the procedures with respect to rights in RFC documents can be 1151 found in BCP 78 and BCP 79. 1153 Copies of IPR disclosures made to the IETF Secretariat and any 1154 assurances of licenses to be made available, or the result of an 1155 attempt made to obtain a general license or permission for the use of 1156 such proprietary rights by implementers or users of this 1157 specification can be obtained from the IETF on-line IPR repository at 1158 http://www.ietf.org/ipr. 1160 The IETF invites any interested party to bring to its attention any 1161 copyrights, patents or patent applications, or other proprietary 1162 rights that may cover technology that may be required to implement 1163 this standard. Please address the information to the IETF at 1164 ietf-ipr@ietf.org.