idnits 2.17.1 draft-templin-intarea-seal-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** The document seems to lack a License Notice according IETF Trust Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009 Section 6.b -- however, there's a paragraph with a matching beginning. Boilerplate error? (You're using the IETF Trust Provisions' Section 6.b License Notice from 12 Feb 2009 rather than one of the newer Notices. See https://trustee.ietf.org/license-info/.) Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document date (June 17, 2009) is 5426 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC3692' is defined on line 1648, but no explicit reference was found in the text == Unused Reference: 'RFC4727' is defined on line 1669, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) == Outdated reference: A later version (-24) exists of draft-ietf-lisp-01 == Outdated reference: A later version (-05) exists of draft-russert-rangers-00 == Outdated reference: A later version (-09) exists of draft-templin-ranger-07 -- Obsolete informational reference (is this intentional?): RFC 1063 (Obsoleted by RFC 1191) -- Obsolete informational reference (is this intentional?): RFC 1146 (Obsoleted by RFC 6247) -- Obsolete informational reference (is this intentional?): RFC 1981 (Obsoleted by RFC 8201) Summary: 2 errors (**), 0 flaws (~~), 8 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group F. Templin, Ed. 3 Internet-Draft Boeing Research & Technology 4 Intended status: Standards Track June 17, 2009 5 Expires: December 19, 2009 7 The Subnetwork Encapsulation and Adaptation Layer (SEAL) 8 draft-templin-intarea-seal-01.txt 10 Status of this Memo 12 This Internet-Draft is submitted to IETF in full conformance with the 13 provisions of BCP 78 and BCP 79. 15 Internet-Drafts are working documents of the Internet Engineering 16 Task Force (IETF), its areas, and its working groups. Note that 17 other groups may also distribute working documents as Internet- 18 Drafts. 
20 Internet-Drafts are draft documents valid for a maximum of six months 21 and may be updated, replaced, or obsoleted by other documents at any 22 time. It is inappropriate to use Internet-Drafts as reference 23 material or to cite them other than as "work in progress." 25 The list of current Internet-Drafts can be accessed at 26 http://www.ietf.org/ietf/1id-abstracts.txt. 28 The list of Internet-Draft Shadow Directories can be accessed at 29 http://www.ietf.org/shadow.html. 31 This Internet-Draft will expire on December 19, 2009. 33 Copyright Notice 35 Copyright (c) 2009 IETF Trust and the persons identified as the 36 document authors. All rights reserved. 38 This document is subject to BCP 78 and the IETF Trust's Legal 39 Provisions Relating to IETF Documents in effect on the date of 40 publication of this document (http://trustee.ietf.org/license-info). 41 Please review these documents carefully, as they describe your rights 42 and restrictions with respect to this document. 44 Abstract 46 For the purpose of this document, subnetworks are defined as virtual 47 topologies that span connected network regions bounded by 48 encapsulating border nodes. These virtual topologies may span 49 multiple IP and/or sub-IP layer forwarding hops, and can introduce 50 failure modes due to packet duplication and/or links with diverse 51 Maximum Transmission Units (MTUs). This document specifies a 52 Subnetwork Encapsulation and Adaptation Layer (SEAL) that 53 accommodates such virtual topologies over diverse underlying link 54 technologies. 56 Table of Contents 58 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 59 1.1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . 4 60 1.2. Approach . . . . . . . . . . . . . . . . . . . . . . . . . 6 61 2. Terminology and Requirements . . . . . . . . . . . . . . . . . 6 62 3. Applicability Statement . . . . . . . . . . . . . . . . . . . 8 63 4. SEAL Protocol Specification (Version 0) . . . . . . . . . . . 
8 64 4.1. Model of Operation . . . . . . . . . . . . . . . . . . . . 8 65 4.2. SEAL Header Format (Version 0) . . . . . . . . . . . . . . 10 66 4.3. ITE Specification . . . . . . . . . . . . . . . . . . . . 11 67 4.3.1. Tunnel Interface MTU . . . . . . . . . . . . . . . . . 11 68 4.3.2. Admitting Packets into the Tunnel Interface . . . . . 12 69 4.3.3. Segmentation . . . . . . . . . . . . . . . . . . . . . 13 70 4.3.4. Encapsulation . . . . . . . . . . . . . . . . . . . . 15 71 4.3.5. Probing Strategy . . . . . . . . . . . . . . . . . . . 15 72 4.3.6. Packet Identification . . . . . . . . . . . . . . . . 16 73 4.3.7. Sending SEAL Protocol Packets . . . . . . . . . . . . 16 74 4.3.8. Processing Raw ICMP Messages . . . . . . . . . . . . . 17 75 4.3.9. Processing SEAL Control Messages . . . . . . . . . . . 17 76 4.4. ETE Specification . . . . . . . . . . . . . . . . . . . . 19 77 4.4.1. Reassembly Buffer Requirements . . . . . . . . . . . . 19 78 4.4.2. IP-Layer Reassembly . . . . . . . . . . . . . . . . . 19 79 4.4.3. SEAL-Layer Reassembly . . . . . . . . . . . . . . . . 20 80 4.4.4. Decapsulation and Delivery to Upper Layers . . . . . . 21 81 4.4.5. Sending SEAL Control Messages . . . . . . . . . . . . 21 82 5. SEAL Protocol Specification (Version 1) . . . . . . . . . . . 28 83 5.1. Model of Operation . . . . . . . . . . . . . . . . . . . . 28 84 5.2. SEAL Header Format (Version 1) . . . . . . . . . . . . . . 28 85 5.3. ITE Specification . . . . . . . . . . . . . . . . . . . . 29 86 5.3.1. Tunnel Interface MTU . . . . . . . . . . . . . . . . . 29 87 5.3.2. Admitting Packets into the Tunnel Interface . . . . . 29 88 5.3.3. Segmentation . . . . . . . . . . . . . . . . . . . . . 29 89 5.3.4. Encapsulation . . . . . . . . . . . . . . . . . . . . 30 90 5.3.5. Probing Strategy . . . . . . . . . . . . . . . . . . . 30 91 5.3.6. Packet Identification . . . . . . . . . . . . . . . . 30 92 5.3.7. Sending SEAL Protocol Packets . . . . . . . . . . . . 30 93 5.3.8. 
Processing Raw ICMP Messages . . . . . . . . . . . . . 30 94 5.3.9. Processing SEAL Control Messages . . . . . . . . . . . 30 95 5.4. ETE Specification . . . . . . . . . . . . . . . . . . . . 30 96 5.4.1. Reassembly Buffer Requirements . . . . . . . . . . . . 30 97 5.4.2. IP-Layer Reassembly . . . . . . . . . . . . . . . . . 31 98 5.4.3. SEAL-Layer Reassembly . . . . . . . . . . . . . . . . 31 99 5.4.4. Decapsulation and Delivery to Upper Layers . . . . . . 31 100 5.4.5. Sending SEAL Control Messages . . . . . . . . . . . . 31 101 6. Link Requirements . . . . . . . . . . . . . . . . . . . . . . 31 102 7. End System Requirements . . . . . . . . . . . . . . . . . . . 31 103 8. Router Requirements . . . . . . . . . . . . . . . . . . . . . 31 104 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 32 105 10. Security Considerations . . . . . . . . . . . . . . . . . . . 32 106 11. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 32 107 12. SEAL Advantages over Classical Methods . . . . . . . . . . . . 33 108 13. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 34 109 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 34 110 14.1. Normative References . . . . . . . . . . . . . . . . . . . 34 111 14.2. Informative References . . . . . . . . . . . . . . . . . . 35 112 Appendix A. Reliability . . . . . . . . . . . . . . . . . . . . . 37 113 Appendix B. Transport Mode . . . . . . . . . . . . . . . . . . . 37 114 Appendix C. Historic Evolution of PMTUD . . . . . . . . . . . . . 38 115 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 39 117 1. Introduction 119 As Internet technology and communication has grown and matured, many 120 techniques have developed that use virtual topologies (including 121 tunnels of one form or another) over an actual network that supports 122 the Internet Protocol (IP) [RFC0791][RFC2460]. 
Those virtual 123 topologies have elements that appear as one hop in the virtual 124 topology, but are actually multiple IP or sub-IP layer hops. These 125 multiple hops often have quite diverse properties that are often not 126 even visible to the endpoints of the virtual hop. This introduces 127 failure modes that are not dealt with well in current approaches. 129 The use of IP encapsulation has long been considered as the means for 130 creating such virtual topologies. However, the insertion of an outer 131 IP header reduces the effective path MTU as-seen by the IP layer. 132 When IPv4 is used, this reduced MTU can be accommodated through the 133 use of IPv4 fragmentation, but unmitigated in-the-network 134 fragmentation has been found to be harmful through operational 135 experience and studies conducted over the course of many years 136 [FRAG][FOLK][RFC4963]. Additionally, classical path MTU discovery 137 [RFC1191] has known operational issues that are exacerbated by in- 138 the-network tunnels [RFC2923][RFC4459]. The following subsections 139 present further details on the motivation and approach for addressing 140 these issues. 142 1.1. Motivation 144 Before discussing the approach, it is necessary to first understand 145 the problems. In both the Internet and private-use networks today, 146 IPv4 is ubiquitously deployed as the Layer 3 protocol. The two 147 primary functions of IPv4 are to provide for 1) addressing, and 2) a 148 fragmentation and reassembly capability used to accommodate links 149 with diverse MTUs. While it is well known that the addressing 150 properties of IPv4 are limited (hence, the larger address space 151 provided by IPv6), there is a lesser-known but growing consensus that 152 other limitations may be unable to sustain continued growth. 
154 First, the IPv4 header Identification field is only 16 bits in 155 length, meaning that at most 2^16 packets pertaining to the same 156 (source, destination, protocol, Identification)-tuple may be active 157 in the Internet at a given time. Due to the escalating deployment of 158 high-speed links (e.g., 1Gbps Ethernet), however, this number may 159 soon become too small by several orders of magnitude. Furthermore, 160 there are many well-known limitations pertaining to IPv4 161 fragmentation and reassembly - even to the point that it has been 162 deemed "harmful" in both classic and modern-day studies (cited 163 above). In particular, IPv4 fragmentation raises issues ranging from 164 minor annoyances (e.g., in-the-network router fragmentation) to the 165 potential for major integrity issues (e.g., mis-association of the 166 fragments of multiple IP packets during reassembly). 168 As a result of these perceived limitations, a fragmentation-avoiding 169 technique for discovering the MTU of the forward path from a source 170 to a destination node was devised through the deliberations of the 171 Path MTU Discovery Working Group (PMTUDWG) during the late 1980's 172 through early 1990's (see Appendix C). In this method, the source 173 node provides explicit instructions to routers in the path to discard 174 the packet and return an ICMP error message if an MTU restriction is 175 encountered. However, this approach has several serious shortcomings 176 that lead to an overall "brittleness". 178 In particular, site border routers in the Internet are being 179 configured more and more to discard ICMP error messages coming from 180 the outside world. This is due in large part to the fact that 181 malicious spoofing of error messages in the Internet is made simple 182 since there is no way to authenticate the source of the messages. 
183 Furthermore, when a source node that relies on ICMP error message 184 feedback to learn that a packet was dropped due to an MTU restriction does not 185 receive the messages, a path MTU-related black hole occurs. This 186 means that the source will continue to send packets that are too 187 large and never receive an indication from the network that they are 188 being discarded. 190 The issues with both IPv4 fragmentation and this "classical" method 191 of path MTU discovery are exacerbated further when IP-in-IP tunneling 192 is used. For example, site border routers that are configured as 193 ingress tunnel endpoints may be required to forward packets into the 194 subnetwork on behalf of hundreds, thousands, or even more original 195 sources located within the site. If IPv4 fragmentation were used, 196 this would quickly wrap the 16-bit Identification field and could 197 lead to undetected data corruption. If classical IPv4 path MTU 198 discovery were used instead, the site border router could be 199 inconvenienced by excessive ICMP error messages coming from the 200 subnetwork that may be either untrustworthy or insufficiently 201 provisioned to allow translation into error messages to be returned 202 to the original sources. 204 The situation is exacerbated further still by IPsec tunnels, since 205 only the first IPv4 fragment of a fragmented packet contains the 206 transport protocol selectors (e.g., the source and destination ports) 207 required for identifying the correct security association, rendering 208 fragmentation useless under certain circumstances. Even worse, there 209 may be no way for a site border router that configures an IPsec 210 tunnel to transcribe the encrypted packet fragment contained in an 211 ICMP error message into a suitable ICMP error message to return to 212 the original source. 214 Due to these many limitations, a new approach to accommodate links 215 with diverse MTUs is necessary. 217 1.2. 
Approach 219 For the purpose of this document, subnetworks are defined as virtual 220 topologies that span connected network regions bounded by 221 encapsulating border nodes. Subnetworks in this sense correspond 222 exactly to the "enterprise" abstraction defined in Virtual Enterprise 223 Traversal (VET) [I-D.templin-autoconf-dhcp] and Routing and 224 Addressing in Next-Generation EnteRprises (RANGER) 225 [I-D.templin-ranger][I-D.russert-rangers]. Examples include the 226 global Internet interdomain routing core, Mobile Ad hoc Networks 227 (MANETs) and enterprise networks. Subnetwork border nodes forward 228 unicast and multicast IP packets over the virtual topology across 229 multiple IP and/or sub-IP layer forwarding hops that may introduce 230 packet duplication and/or traverse links with diverse Maximum 231 Transmission Units (MTUs). 233 This document introduces a Subnetwork Encapsulation and Adaptation 234 Layer (SEAL) for tunnel-mode operation of IP over subnetworks that 235 connect Ingress and Egress Tunnel Endpoints (ITEs/ETEs) of border 236 nodes. It provides a standalone specification designed to be 237 tailored to specific associated tunneling protocols such as VET 238 [I-D.templin-autoconf-dhcp], the Locator-Identifier Split Protocol 239 (LISP) [I-D.ietf-lisp] and others. A transport-mode of operation is 240 also possible, and described in Appendix B. SEAL accommodates links 241 with diverse MTUs, protects against off-path denial-of-service 242 attacks, and supports efficient duplicate packet detection through 243 the use of a minimal mid-layer encapsulation. 245 SEAL encapsulation introduces an extended Identification field for 246 packet identification and a mid-layer segmentation and reassembly 247 capability that allows simplified cutting and pasting of packets. 248 Moreover, SEAL senses in-the-network IPv4 fragmentation as a "noise" 249 indication that packet sizing parameters are "out of tune" with 250 respect to the network path. 
As a result, SEAL can naturally tune 251 its packet sizing parameters to eliminate the in-the-network 252 fragmentation. 254 SEAL encapsulation additionally includes a 2-bit version number. 255 This document specifies SEAL protocol versions 0 and 1. 257 2. Terminology and Requirements 259 The terms "inner", "mid-layer", and "outer", respectively, refer to 260 the innermost IP (layer, protocol, header, packet, etc.) before any 261 encapsulation, the mid-layer IP (protocol, header, packet, etc.) 262 after any mid-layer '*' encapsulation, and the outermost IP (layer, 263 protocol, header, packet etc.) after SEAL/*/IPv4 encapsulation. 265 The term "IP" used throughout the document refers to either Internet 266 Protocol version (IPv4 or IPv6). Additionally, the notation IPvX/*/ 267 SEAL/*/IPvY refers to an inner IPvX packet encapsulated in any mid- 268 layer '*' encapsulations, followed by the SEAL header, followed by 269 any outer '*' encapsulations, followed by an outer IPvY header, where 270 the notation "IPvX" means either IP protocol version (IPv4 or IPv6). 272 The following abbreviations correspond to terms used within this 273 document and elsewhere in common Internetworking nomenclature: 275 ITE - Ingress Tunnel Endpoint 277 ETE - Egress Tunnel Endpoint 279 PTB - an ICMPv6 "Packet Too Big", an ICMPv4 "Fragmentation Needed" 280 or a SEAL Reassembly Report message. 
282 DF - the IPv4 header "Don't Fragment" flag 284 MHLEN - the length of any mid-layer '*' headers and trailers 286 OHLEN - the length of the outer encapsulating SEAL/*/IPv4 headers 288 HLEN - the sum of MHLEN and OHLEN 290 S_MRU - the SEAL Maximum Reassembly Unit 292 S_MSS - the SEAL Maximum Segment Size 294 S_CSS - the SEAL Clamped Segment Size 296 SEAL_ID - a 32-bit Identification value, randomly initialized and 297 monotonically incremented for each SEAL protocol packet 299 SEAL_PROTO - an IPv4 protocol number used for SEAL 301 SEAL_CPORT - a TCP/UDP service port number used for SEAL control 302 plane messaging 304 SEAL_DPORT - a TCP/UDP service port number used for SEAL data 305 plane messaging 307 The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, 308 SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this 309 document, are to be interpreted as described in [RFC2119]. 311 3. Applicability Statement 313 SEAL was motivated by the specific case of subnetwork abstraction for 314 Mobile Ad hoc Networks (MANETs); however, the domain of applicability 315 also extends to subnetwork abstractions of enterprise networks, ISP 316 networks, SOHO networks, the interdomain routing core, and many 317 others. In particular, SEAL is a natural complement to the 318 enterprise network abstraction manifested through the VET mechanism 319 [I-D.templin-autoconf-dhcp], the RANGER architecture 320 [I-D.templin-ranger][I-D.russert-rangers] and the LISP protocol 321 [I-D.ietf-lisp]. The term "subnetwork" within this document is used 322 synonymously with the term "enterprise" that appears in these 323 references. 325 SEAL introduces a minimal new sublayer for IPvX in IPvY encapsulation 326 (e.g., as IPv6/SEAL/IPv4), and appears as a subnetwork encapsulation 327 as seen by the inner IP layer. 
SEAL can also be used as a sublayer 328 for encapsulating inner IP packets within outer UDP/IPv4 headers 329 (e.g., as IPv6/SEAL/UDP/IPv4) such as for the Teredo domain of 330 applicability [RFC4380]. When it appears immediately after the outer 331 IPv4 header, the SEAL header is processed exactly as for IPv6 332 extension headers. 334 This document discusses the use of IPv4 as the outer encapsulation 335 layer; however, the same principles apply when IPv6 is used as the 336 outer layer. 338 4. SEAL Protocol Specification (Version 0) 340 This section specifies the fully-functioned version of SEAL known as 341 "SEAL Version 0", or "SEAL-MAX". A minimal version of SEAL known as 342 "SEAL Version 1", or "SEAL-LITE", is specified in Section 5. 344 4.1. Model of Operation 346 SEAL provides an encapsulation sublayer that supports the 347 transmission of unicast and multicast packets across an underlying IP 348 network. SEAL-enabled ITEs insert a SEAL header during the 349 encapsulation of inner IP packets in mid-layer and outer 350 encapsulating headers/trailers. For example, an inner IPv6 packet 351 would appear as IPv6/*/SEAL/*/IPv4 after mid-layer and outer 352 encapsulations, where '*' denotes zero or more additional 353 encapsulation sublayers. 355 SEAL-enabled ITEs add mid-layer '*' and outer SEAL/*/IPv4 356 encapsulations to the inner packets they inject into a subnetwork, 357 where the outermost IPv4 header contains the source and destination 358 addresses of the subnetwork entry/exit points (i.e., the ITE/ETE), 359 respectively. 
ITEs encapsulate each inner IP packet in mid-layer and 360 outer encapsulations as shown in Figure 1: 362 +-------------------------+ 363 | | 364 ~ Outer */IPv4 headers ~ 365 | | 366 I +-------------------------+ 367 n | SEAL Header | 368 n +-------------------------+ +-------------------------+ 369 e ~ Any mid-layer * headers ~ ~ Any mid-layer * headers ~ 370 r +-------------------------+ +-------------------------+ 371 | | | | 372 I --> ~ Inner IP ~ --> ~ Inner IP ~ 373 P --> ~ Packet ~ --> ~ Packet ~ 374 | | | | 375 P +-------------------------+ +-------------------------+ 376 a ~ Any mid-layer trailers ~ ~ Any mid-layer trailers ~ 377 c +-------------------------+ +-------------------------+ 378 k ~ Any outer trailers ~ 379 e +-------------------------+ 380 t 381 (After mid-layer encaps.) (After SEAL/*/IPv4 encaps.) 383 Figure 1: SEAL Encapsulation 385 where the SEAL header is inserted as follows: 387 o For simple IPvX/IPvY encapsulations (e.g., 388 [RFC2003][RFC2004][RFC2473][RFC4213]), the SEAL header is inserted 389 between the inner and outer IP headers as: IPvX/SEAL/IPvY. 391 o For tunnel-mode IPsec encapsulations, [RFC4301], the SEAL header 392 is inserted between the {AH,ESP} header and outer IP headers as: 393 IPvX/*/{AH,ESP}/SEAL/IPvY. 395 o For IP encapsulations over transports such as UDP, the SEAL header 396 is inserted immediately after the outer transport layer header, 397 e.g., as IPvX/*/SEAL/UDP/IPvY. 399 SEAL-encapsulated packets include a SEAL_ID to uniquely identify each 400 packet. Routers within the subnetwork use the SEAL_ID for duplicate 401 packet detection, and ITEs/ETEs use the SEAL_ID for SEAL 402 segmentation/reassembly and protection against off-path attacks. 404 For IPv4, the SEAL_ID is formed from the concatenation of the 16-bit 405 ID Extension field in the SEAL header as the most-significant bits, 406 and with the 16-bit Identification value in the outer IPv4 header as 407 the least-significant bits. 
For IPv6, the SEAL_ID is written into 408 the 32-bit Identification field of the fragment header. For tunnels 409 that traverse middleboxes that might rewrite the IP ID field, e.g., a 410 Network Address Translator, the SEAL_ID is instead maintained only 411 within the ID field in the SEAL header. 413 SEAL enables a multi-level segmentation and reassembly capability. 414 First, the ITE can use IPv4 fragmentation to fragment inner IPv4 415 packets before SEAL encapsulation. Secondly, the SEAL layer itself 416 provides a simple cutting-and-pasting capability for mid-layer 417 packets to avoid IP fragmentation on the outer packet. Finally, 418 ordinary IP fragmentation is permitted on the outer packet after SEAL 419 encapsulation and used to detect and dampen any in-the-network 420 fragmentation as quickly as possible. 422 The following sections specify the SEAL header format and SEAL- 423 related operations of the ITE and ETE, respectively. 425 4.2. SEAL Header Format (Version 0) 427 The SEAL version 0 header is formatted as follows: 429 0 1 2 3 430 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 431 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 432 |VER|A|I|F|M|RSV| NEXTHDR/SEG | ID Extension | 433 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 435 Figure 2: SEAL Version 0 Header Format 437 where the header fields are defined as: 439 VER (2) 440 a 2-bit value that encodes the SEAL protocol version number. This 441 section describes Version 0 of the SEAL protocol, i.e., the VER 442 field encodes the value '00'. 444 A (1) 445 the "Acknowledgement Requested" bit. Set to 1 if the ITE wishes 446 to receive an explicit acknowledgement from the ETE. 448 I (1) 449 the "Information Request Solicit" bit. Set to 1 if the ITE wishes 450 the ETE to initiate an Information Request. 452 F (1) 453 the "First Segment" bit. 
Set to 1 if this SEAL protocol packet 454 contains the first segment (i.e., Segment #0) of a mid-layer 455 packet. 457 M (1) 458 the "More Segments" bit. Set to 1 if this SEAL protocol packet 459 contains a non-final segment of a multi-segment mid-layer packet. 461 RSV (2) 462 a 2-bit Reserved field. Set to 0 for the purpose of this 463 specification. 465 NEXTHDR/SEG (8) an 8-bit field. When 'F'=1, encodes the next header 466 Internet Protocol number the same as for the IPv4 protocol and 467 IPv6 next header fields. When 'F'=0, encodes a segment number of 468 a multi-segment mid-layer packet. (The segment number 0 is 469 reserved.) 471 ID Extension (16) 472 a 16-bit Identification extension field. 474 4.3. ITE Specification 476 4.3.1. Tunnel Interface MTU 478 The ITE configures a tunnel virtual interface over one or more 479 underlying links that connect the border node to the subnetwork. The 480 tunnel interface must present a fixed MTU to the inner IP layer 481 (i.e., Layer 3) as the size for admission of inner IP packets into 482 the tunnel. Since the tunnel interface may support a potentially 483 large set of ETEs, however, care must be taken in setting a large- 484 enough MTU for all ETEs while still upholding end system 485 expectations. 487 Due to the ubiquitous deployment of standard Ethernet and similar 488 networking gear, the nominal Internet cell size has become 1500 489 bytes; this is the de facto size that end systems have come to expect 490 will either be delivered by the network without loss due to an MTU 491 restriction on the path or a suitable PTB message returned. However, 492 the network may not always deliver the necessary PTBs, leading to 493 MTU-related black holes [RFC2923]. The ITE therefore requires a 494 means for conveying 1500 byte (or smaller) packets to the ETE without 495 loss due to MTU restrictions and without dependence on PTB messages 496 from within the subnetwork. 
498 In common deployments, there may be many forwarding hops between the 499 original source and the ITE. Within those hops, there may be 500 additional encapsulations (IPsec, L2TP, other SEAL encapsulations, 501 etc.) such that a 1500-byte packet sent by the original source might 502 grow to a larger size by the time it reaches the ITE for 503 encapsulation as an inner IP packet. Similarly, additional 504 encapsulations on the path from the ITE to the ETE could cause the 505 encapsulated packet to become larger still and trigger in-the-network 506 fragmentation. In order to preserve the end system expectations, the 507 ITE therefore requires a means for conveying these larger packets to 508 the ETE even though there may be links within the subnetwork that 509 configure a smaller MTU. 511 The ITE should therefore set a tunnel virtual interface MTU of 1500 512 bytes plus extra room to accommodate any additional encapsulations 513 that may occur on the path from the original source (i.e., even if 514 the path to the ETE does not support an MTU of this size). The ITE 515 can set larger MTU values still, but should select a value that is 516 not so large as to cause excessive PTBs coming from within the tunnel 517 interface (see Sections 4.3.3 and 4.3.8). The ITE can also set 518 smaller MTU values; however, care must be taken not to set so small a 519 value that original sources would experience an MTU underflow. In 520 particular, IPv6 sources must see a minimum path MTU of 1280 bytes, 521 and IPv4 sources should see a minimum path MTU of 576 bytes. 523 The ITE can alternatively set an indefinite MTU on the tunnel virtual 524 interface such that all inner IP packets are admitted into the 525 interface without regard to size. 
This option must be carefully 526 coordinated with the ITE's protocol stack upper layers, since some 527 upper layer protocols (e.g., TCP) derive their packet sizing 528 parameters from the MTU of the underlying interface and as such may 529 select too large an initial size. This is not a problem for upper 530 layers that use conservative initial estimates, e.g., when mechanisms 531 such as Packetization Layer Path MTU Discovery [RFC4821] are used. 533 4.3.2. Admitting Packets into the Tunnel Interface 535 The inner IP layer consults the tunnel interface MTU when admitting a 536 packet into the interface. For IPv4 packets with the IPv4 Don't 537 Fragment (DF) bit set to 0, if the packet is larger than the tunnel 538 interface MTU the inner IP layer uses IP fragmentation to break the 539 packet into fragments no larger than the tunnel interface MTU. The 540 ITE then admits each fragment into the tunnel as an independent 541 packet. 543 For all other packets, the ITE admits the packet if it is no larger 544 than the tunnel interface MTU; otherwise, it drops the packet and 545 sends an ICMP PTB error message to the source with the MTU value set 546 to the tunnel interface MTU. The message must contain as much of the 547 invoking packet as possible without the entire message exceeding the 548 minimum IP MTU (i.e., 576 bytes for IPv4 and 1280 bytes for IPv6). 550 Note that when the tunnel interface sets an indefinite MTU all 551 packets are unconditionally admitted into the interface without 552 fragmentation. 554 4.3.3. Segmentation 556 For each ETE, the ITE maintains soft state within the tunnel 557 interface (e.g., in a destination cache) used to support inner 558 fragmentation and SEAL segmentation. The soft state includes the 559 following: 561 o a Mid-layer Header Length (MHLEN); set to the length of any mid- 562 layer '*' encapsulation headers and trailers (e.g., for '*' = AH, 563 ESP, NULL, etc.). 
MHLEN additionally includes 4 extra bytes for a 564 trailing mid-layer checksum (see below). 566 o an Outer Header Length (OHLEN); set to the length of the outer 567 SEAL/*/IP encapsulation headers and trailers. 569 o a total Header Length (HLEN); set to MHLEN plus OHLEN. 571 o a SEAL Maximum Segment Size (S_MSS); initialized to a value that 572 is no larger than the underlying IP interface MTU. The ITE 573 decreases or increases S_MSS based on any SEAL Reassembly Report 574 messages received (see Section 4.3.9). 576 o a SEAL Clamped Segment Size (S_CSS); a value that is no larger 577 than S_MSS and that would also be unlikely to incur fragmentation 578 beyond the tunnel (e.g., 576 bytes for IPv4 and 1280 bytes for 579 IPv6). S_CSS may be set to larger values only if there is high 580 assurance that all links within the tunnel configure a larger MTU. 581 The ITE decreases S_CSS in conjunction with S_MSS, but only 582 increases S_CSS to a "safe" value as above if S_MSS is increased. 584 o a SEAL Maximum Reassembly Unit (S_MRU); initialized to the larger 585 of S_MSS and the known or estimated Maximum Receive Unit (MRU) 586 actually configured by the ETE (2KB minimum default). The ITE 587 decreases or increases S_MRU based on any SEAL Reassembly Report 588 messages received (see Section 4.3.9). When (S_MRU>(S_MSS*256)), 589 the ITE uses (S_MSS*256) as the effective S_MRU value. 
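As an illustrative sketch only, the per-ETE soft state listed above could be held in a small record such as the following Python fragment. The class name SealSoftState, its field names, and the helper initial_s_mru() are hypothetical and do not appear in this specification. The 2048-byte default reads the "2KB minimum default" above as the value used when the ETE's MRU is unknown, which is an assumption.

```python
from dataclasses import dataclass

MAX_SEGMENTS = 256        # a mid-layer packet is cut into at most 256 segments
DEFAULT_ETE_MRU = 2048    # assumed reading of the "2KB minimum default"


def initial_s_mru(s_mss: int, ete_mru: int = DEFAULT_ETE_MRU) -> int:
    """Initial S_MRU: the larger of S_MSS and the ETE's known or estimated MRU."""
    return max(s_mss, ete_mru)


@dataclass
class SealSoftState:
    """Per-ETE soft state kept by the ITE (hypothetical field names)."""
    mhlen: int   # mid-layer headers and trailers, including 4 checksum bytes
    ohlen: int   # outer SEAL/*/IP encapsulation headers and trailers
    s_mss: int   # SEAL Maximum Segment Size
    s_css: int   # SEAL Clamped Segment Size, never larger than s_mss
    s_mru: int   # SEAL Maximum Reassembly Unit as currently estimated

    def __post_init__(self) -> None:
        if self.s_css > self.s_mss:
            raise ValueError("S_CSS must be no larger than S_MSS")

    @property
    def hlen(self) -> int:
        """HLEN is MHLEN plus OHLEN."""
        return self.mhlen + self.ohlen

    @property
    def effective_s_mru(self) -> int:
        """S_MRU, capped at S_MSS*256 as stated for the S_MRU item."""
        return min(self.s_mru, self.s_mss * MAX_SEGMENTS)
```

For instance, with a 4-byte SEAL header over a 20-byte IPv4 header (OHLEN=24), no mid-layer headers other than the 4-byte checksum (MHLEN=4), and S_MSS=1500, this record gives HLEN=28 and an effective S_MRU of 2048 bytes.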

   After an inner packet/fragment has been admitted into the tunnel
   interface the ITE uses the following algorithm to determine whether
   the packet can be accommodated at all and (if so) whether inner IP
   fragmentation is needed:

   o  if the inner packet is an IPv6 packet or an IPv4 packet with DF=1,
      and the packet is larger than (S_MRU - HLEN), the ITE drops the
      packet and sends an ICMP PTB message to the original source with
      an MTU value of (S_MRU - HLEN) the same as described in Section
      4.3.2; else,

   o  if the inner packet is a SEAL IPv4 packet with DF=0, and the
      packet is larger than (S_MRU - HLEN), the ITE uses inner IPv4
      fragmentation to break the packet into fragments no larger than
      (S_MRU - HLEN); else,

   o  if the inner packet is a non-SEAL IPv4 packet with DF=0, the ITE
      uses inner IPv4 fragmentation to break the packet into fragments
      no larger than (S_CSS - HLEN); else,

   o  the ITE processes the packet without inner fragmentation.

   (Note that in the above the ITE must also track whether the tunnel
   interface is using header compression on the inner */IP headers.  If
   so, the ITE must include the length of the uncompressed */IP inner
   header when calculating the total length of the inner packet.)

   The ITE next encapsulates each inner packet/fragment in the MHLEN
   bytes of mid-layer '*' headers and trailers and reserves 4 bytes for
   a trailing checksum at the end of the mid-layer trailers.  The ITE
   then calculates a checksum of the mid-layer packet using the 16-bit
   Fletcher Checksum algorithm specified in [RFC1146], Appendix II, and
   writes the 'A' portion as the most significant 16 bits and the 'B'
   portion as the least significant 16 bits.

   If the length of the resulting mid-layer packet plus OHLEN is greater
   than S_MSS, the ITE must additionally perform SEAL segmentation.  To
   do so, it breaks the mid-layer packet into N segments (N <= 256) that
   are no larger than (S_MSS - OHLEN) bytes each.  Each segment, except
   the final one, MUST be of equal length.  The first byte of each
   segment MUST begin immediately after the final byte of the previous
   segment, i.e., the segments MUST NOT overlap.  The ITE SHOULD
   generate the smallest number of segments possible, e.g., it SHOULD
   NOT generate 6 smaller segments when the packet could be accommodated
   with 4 larger segments.

   Note that this SEAL segmentation ignores the fact that the mid-layer
   packet may be unfragmentable outside of the subnetwork.  This
   segmentation process is a mid-layer (not an IP layer) operation
   employed by the ITE to adapt the mid-layer packet to the subnetwork
   path characteristics, and the ETE will restore the packet to its
   original form during reassembly.  Therefore, the fact that the packet
   may have been segmented within the subnetwork is not observable
   outside of the subnetwork.

4.3.4.  Encapsulation

   Following SEAL segmentation, the ITE encapsulates each segment in a
   SEAL header formatted as specified in Section 4.2 with VER='00' and
   RSV='00'.  For the first segment, the ITE sets F=1 and sets NEXTHDR
   to the Internet Protocol number of the encapsulated packet; the ITE
   next sets M=1 if there are more segments or sets M=0 otherwise.  For
   each non-initial segment of an N-segment mid-layer packet (N <= 256),
   the ITE sets (F=0; M=1; SEG=1) in the SEAL header of the first
   non-initial segment, sets (F=0; M=1; SEG=2) in the next non-initial
   segment, etc., and sets (F=0; M=0; SEG=N-1) in the final segment.
   (Note that the value SEG=0 is not used.)
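
   The segmentation and flag-setting rules above can be sketched as
   follows.  This is a minimal illustration, not a normative algorithm:
   the function name seal_segment and the tuple layout are assumptions
   made for this example, and the sketch simply fills every segment
   except the last to (S_MSS - OHLEN) bytes, which is one way to satisfy
   the equal-length and fewest-segments requirements.

```python
import math


def seal_segment(mid_layer_packet: bytes, s_mss: int, ohlen: int):
    """Split a mid-layer packet into (F, M, SEG, payload) tuples.

    SEG is None for the first segment, whose header carries NEXTHDR
    in place of a segment number.
    """
    max_payload = s_mss - ohlen
    if max_payload <= 0:
        raise ValueError("S_MSS must be larger than OHLEN")
    n = max(1, math.ceil(len(mid_layer_packet) / max_payload))
    if n > 256:
        raise ValueError("packet would need more than 256 segments")
    segments = []
    for index in range(n):
        start = index * max_payload
        payload = mid_layer_packet[start:start + max_payload]
        f = 1 if index == 0 else 0       # F=1 only in the first segment
        m = 1 if index < n - 1 else 0    # M=0 only in the final segment
        seg = None if index == 0 else index
        segments.append((f, m, seg, payload))
    return segments
```

   With a 3000-byte mid-layer packet, S_MSS=1500 and OHLEN=40, this
   sketch yields three segments of 1460, 1460 and 80 bytes carrying
   (F=1; M=1), (F=0; M=1; SEG=1) and (F=0; M=0; SEG=2).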

   The ITE next encapsulates each segment in the requisite */IP outer
   headers according to the specific encapsulation format (e.g.,
   [RFC2003], [RFC2473], [RFC4213], [RFC4380], etc.), except that it
   writes 'SEAL_PROTO' in the protocol field of the outer IP header
   (when simple IP encapsulation is used) or writes 'SEAL_DPORT' in the
   outer destination service port field (e.g., when UDP/IP encapsulation
   is used).  The ITE finally sets the A bit as specified in Section
   4.3.5 (if necessary), sets the packet identification values as
   specified in Section 4.3.6 and sends the packets as specified in
   Section 4.3.7.

   Note that when IPv6 is used as the outer IP encapsulation layer, the
   ITE must insert an IPv6 fragment header with an Identification value
   set as described in Section 4.3.6.

4.3.5.  Probing Strategy

   All SEAL packets sent by the ITE are considered implicit probes, and
   will elicit Reassembly Reports from the ETE with a new value for
   S_MSS if any IP fragmentation occurs in the path.  Thereafter, the
   ITE may periodically reset S_MSS to a larger value (e.g., the
   underlying IP interface MTU minus OHLEN bytes) to detect path MTU
   increases.

   The ITE should additionally send explicit probes, periodically, to
   verify that the ETE is still reachable and to manage a window of
   SEAL_IDs.  The ITE sets (A=1; F=1) in the SEAL header of a
   first-segment to be used as an explicit probe, where the probe can be
   either an ordinary data packet or a NULL packet created by setting
   the 'Next Header' field to a value of "No Next Header" (see Section
   4.7 of [RFC2460]).  The probe will elicit a Reassembly Report from
   the ETE as an acknowledgement.

   The ITE can also send probes using non-initial SEAL segments to
   determine whether any of the preceding segments of the same SEAL
   packet are missing.
   The probe will elicit a Reassembly Report from the ETE with a Bitmap
   of received and missing segments.

   Finally, the ITE MAY send "expendable" probe packets (see Section
   4.3.7) in order to generate ICMP PTB messages from routers on the
   path to the ETE.

4.3.6.  Packet Identification

   For the purpose of packet identification, the ITE maintains a SEAL_ID
   value as per-ETE soft state, e.g., in the destination cache.  The ITE
   randomly initializes SEAL_ID when the soft state is created, and
   monotonically increments it for each successive SEAL protocol packet
   it sends to the ETE.

   For each outer IPv4 packet, the ITE writes the least-significant 16
   bits of the SEAL_ID value into the Identification field in the outer
   IPv4 header, and writes the most-significant 16 bits in the ID
   Extension field in the SEAL header.  For each outer IPv6 packet, the
   ITE writes the entire SEAL_ID value into the Identification field in
   the IPv6 fragment header.

   For ITE->ETE tunnels specifically designed for the traversal of
   Network Address Translators (NATs) and other middleboxes that may
   rewrite the outer IP ID field, the ITE instead writes the least
   significant bits of the SEAL_ID in the ID field of the SEAL header
   and writes a random value in the Identification field in the outer IP
   header.  Since the ID field in the SEAL header is only 16 bits,
   however, the ITE must limit the rate at which it sends packets to
   avoid wrapping the ID field.  Alternatively, the ITE and ETE can use
   SEAL-LITE to obtain a larger ID field in the SEAL header (see Section
   5.3.6).

4.3.7.  Sending SEAL Protocol Packets

   Following SEAL segmentation and encapsulation, the ITE sets DF=0 for
   ordinary SEAL/*/IPv4 packets, but may set DF=1 for "expendable"
   SEAL/*/IPv4 packets (e.g., for NULL packets used as probes -- see
   Section 4.3.5).
   For SEAL/*/IPv6 packets, the "DF" bit is always implicitly set to 1,
   but when a fragment header is included a translating router on the
   path may still fragment the packet.

   The ITE sends each outer packet that encapsulates a segment of the
   same mid-layer packet into the tunnel in canonical order, i.e.,
   segment 0 first, followed by segment 1, etc., and finally segment
   N-1.

4.3.8.  Processing Raw ICMP Messages

   The ITE may receive "raw" ICMP error messages [RFC0792][RFC4443] from
   either the ETE or routers within the subnetwork that comprise an
   outer IP header, followed by an ICMP header, followed by a portion of
   the SEAL packet that generated the error (also known as the
   "packet-in-error").  The ITE can use the SEAL ID encoded in the
   packet-in-error as a nonce to confirm that the ICMP message came from
   either the ETE or an on-path router, and can use any additional
   information to determine whether to accept or discard the message.

   The ITE should specifically process raw ICMPv4 Protocol Unreachable
   messages and ICMPv6 Parameter Problem messages with Code
   "Unrecognized Next Header type encountered" as a hint that the ETE
   does not implement the SEAL protocol.

4.3.9.  Processing SEAL Control Messages

   In addition to any raw ICMP messages, the ITE may receive UDP/IP SEAL
   control messages from the ETE formatted as specified in Section 4.4.5
   and with 'SEAL_CPORT' as the UDP destination port.  The ITE must
   therefore monitor the 'SEAL_CPORT' UDP port and process any messages
   that arrive on that port.

   For each control message, the ITE verifies the UDP checksum and
   discards the message if the checksum is incorrect.  The ITE can then
   verify that the SEAL_ID is within the current window of transmitted
   SEAL_IDs for this ETE.  If the SEAL_ID is outside of the window, the
   ITE discards the message; otherwise, it advances the window and
   processes the message.
   The ITE processes SEAL control messages as follows:

4.3.9.1.  Reassembly Report (Type=0)

   When the ITE receives a Reassembly Report formatted as specified in
   Section 4.4.5.1, it processes the message according to the Code value
   as follows:

4.3.9.1.1.  IP Fragmentation Experienced (Code=0)

   The ITE records the value in the S_MRU field in its soft state for
   this ETE and adjusts the S_MSS value in its soft state.  If the S_MSS
   value in the Reassembly Report is greater than 576 (i.e., the nominal
   minimum MTU for IPv4 links), the ITE records this new value in its
   soft state.  If the S_MSS value in the report is less than the
   current soft state value and also less than 576, the ITE can discern
   that IP fragmentation is occurring, but it cannot determine the true
   MTU of the restricting link because a router on the path is
   generating runt first-fragments.

   The ITE should therefore search for a reduced S_MSS value through an
   iterative searching strategy that parallels Section 5 of [RFC1191].
   This searching strategy may require multiple iterations of sending
   SEAL packets using a reduced S_MSS and receiving additional
   Reassembly Report messages, but it will soon converge to a stable
   value.  During this process, it is essential that the ITE reduce
   S_MSS based on the first Reassembly Report message received, and
   refrain from further reducing S_MSS until SEAL Reassembly Report
   messages pertaining to packets sent under the new S_MSS are received.

4.3.9.1.2.  Segment Acknowledged (Code=1)

   The ITE records the value in the S_MRU field in its soft state for
   this ETE.  If the S_MSS value in the report is non-zero, the ITE also
   adjusts its S_MSS value the same as for an IP Fragmentation
   Experienced message (see Section 4.3.9.1.1).

   The ITE next examines the Bitmap field to determine which segments of
   this SEAL packet were received.
   The map is arranged with segment 0 represented as the most
   significant bit, segment 1 represented as the next most significant
   bit, etc., up to the segment that triggered the reassembly report
   (i.e., segment N) as the final bit.  For example, when N=6 the bit
   map '0110011' means that segments 1, 2, 5 and 6 were received but
   segments 0, 3 and 4 were missing from the ETE's reassembly buffer.
   The ITE can then retransmit segments 0, 3 and 4 if it still has them
   in its cache, and if there is reason to believe the retransmissions
   may satisfy the pending reassembly.

4.3.9.1.3.  Packet Too Big (Code=2)

   The ITE records the value in the S_MRU field in its soft state for
   this ETE.

4.3.9.1.4.  Time Exceeded (Code=3)

   The ITE examines the time encoded in the Data field, and reduces its
   S_MRU estimate for this ETE if there is significant evidence that
   large packets are timing out prior to SEAL reassembly completion.
   The ITE may log the event for network management purposes.

4.3.9.1.5.  Checksum Incorrect (Code=4)

   The ITE may log the event for network management purposes.  When
   sustained Checksum Incorrect messages are received from this ETE, the
   ITE may also benefit by adjusting its packet sizing parameters.

4.3.9.2.  Parameter Problem (Type=1)

   When the ITE receives a Parameter Problem message formatted as
   specified in Section 4.4.5.2, it examines the encapsulated SEAL
   header in the message to determine whether the header was corrupted
   or whether the header specified features that the ETE did not
   recognize.  The ITE MAY log the event for network management
   purposes.  For uncorrupted headers, the ITE SHOULD adjust its SEAL
   header parameters in subsequent SEAL packets.

4.3.9.3.  Information Request (Type=2)

   When the ITE receives an Information Request message formatted as
   specified in Section 4.4.5.3 and with a SEAL_ID that corresponds to a
   SEAL packet that it sent earlier with I=1, it sends an Information
   Reply as specified in Section 4.4.5.4.

4.4.  ETE Specification

4.4.1.  Reassembly Buffer Requirements

   ETEs must be capable of performing IP-layer reassembly for SEAL
   protocol IP packets up to 2KB in length, and must also be capable of
   performing SEAL-layer reassembly for mid-layer packets up to (2KB -
   OHLEN).  Hence, ETEs:

   o  MUST configure a reassembly buffer of at least 2KB

   o  MAY configure a larger reassembly buffer

   o  MUST be capable of discarding SEAL packets that are too large to
      reassemble

   Note that the ETE must retain the SEAL/*/IP header during both
   IP-layer and SEAL-layer reassembly for the purpose of associating the
   fragments/segments of the same packet.

4.4.2.  IP-Layer Reassembly

   ETEs perform standard IP-layer reassembly for SEAL protocol IP
   fragments, and should maintain conservative reassembly cache high-
   and low-water marks.  When the size of the reassembly cache exceeds
   the high-water mark, the ETE should actively discard incomplete
   reassemblies (e.g., using an Active Queue Management (AQM) strategy)
   until the size falls below the low-water mark.  The ETE should also
   actively discard any pending reassemblies that clearly have no
   opportunity for completion, e.g., when a considerable number of new
   fragments have been received before a fragment that completes a
   pending reassembly has arrived.
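
   The high- and low-water mark behavior described above can be
   sketched as follows.  This is only one possible reading: the class
   name ReassemblyCache is hypothetical, and the choice to discard the
   oldest incomplete reassemblies first is an assumption of this
   example (the text leaves the AQM strategy open).

```python
from collections import OrderedDict


class ReassemblyCache:
    """Hypothetical accounting of incomplete reassemblies at the ETE."""

    def __init__(self, high_water: int, low_water: int):
        if low_water > high_water:
            raise ValueError("low-water mark must not exceed high-water mark")
        self.high_water = high_water
        self.low_water = low_water
        self.size = 0
        self.pending = OrderedDict()  # key -> buffered bytes, oldest first

    def add_fragment(self, key, length: int):
        """Account for a fragment; return the keys discarded by pruning."""
        self.pending[key] = self.pending.get(key, 0) + length
        self.size += length
        discarded = []
        if self.size > self.high_water:
            # Assumed policy: drop oldest reassemblies until the cache
            # size falls below the low-water mark.
            while self.size >= self.low_water and self.pending:
                old_key, old_len = self.pending.popitem(last=False)
                self.size -= old_len
                discarded.append(old_key)
        return discarded
```

   A real ETE would key each entry on the fields that identify a
   reassembly and would also apply the "no opportunity for completion"
   heuristic, which this sketch omits.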

   When the ETE processes the IP first-fragment (i.e., one with MF=1 and
   Offset=0 in the IP header) of a fragmented SEAL packet, it sends a
   SEAL Reassembly Report message back to the ITE with the S_MSS field
   set to the length of the first-fragment and with the S_MRU field set
   to the size of the ETE's reassembly buffer (see Section 4.4.5).

4.4.3.  SEAL-Layer Reassembly

   Following IP reassembly of a SEAL segment, the ETE adds the segment
   to a SEAL-Layer pending-reassembly queue according to the (Source,
   Destination, SEAL_ID)-tuple found in the outer SEAL/*/IP headers.
   The ETE performs SEAL-layer reassembly through simple in-order
   concatenation of the encapsulated segments of the same mid-layer
   packet from N consecutive SEAL packets.  SEAL-layer reassembly
   requires the ETE to maintain a cache of recently received segments
   for a hold time that would allow for reasonable inter-segment delays
   (e.g., 15 seconds).  When a SEAL reassembly times out, the ETE
   discards the incomplete reassembly and returns a Reassembly Report -
   Time Exceeded message to the ITE (see Section 4.4.5).  As for
   IP-layer reassembly, the ETE should also maintain conservative
   reassembly cache high- and low-water marks and should actively
   discard any pending reassemblies that clearly have no opportunity for
   completion, e.g., when a considerable number of new SEAL packets have
   been received before a packet that completes a pending reassembly has
   arrived.

   When the ETE receives a SEAL packet with an incorrect value in the
   SEAL header, it discards the packet and returns a Parameter Problem
   message (see Section 4.4.5).  If the ETE receives a SEAL packet for
   which a segment with the same (Source, Destination, SEAL_ID)-tuple is
   already in the queue, it must determine whether to accept the new
   segment and release the old, or drop the new segment.
   If accepting the new segment would cause an inconsistency with other
   segments already in the queue (e.g., differing segment lengths), the
   ETE drops the segment that is least likely to complete the
   reassembly.

   When the ETE adds a SEAL packet with A=1 and with SEG=N to a
   reassembly queue, it sends a Reassembly Report - Segment Acknowledged
   message back to the ITE as specified in Section 4.4.5.1.2 that
   contains a bitmask of this and all prior segments already in the
   queue beginning with segment 0 in the most-significant bit, segment 1
   in the next bit, etc., up to segment N in the final bit.  For
   example, when N=4, the bitmask '01101' indicates that segments 1, 2
   and 4 are in the queue while segments 0 and 3 are missing.

   After all segments are gathered, the ETE reassembles the mid-layer
   packet by concatenating the segments encapsulated in N consecutive
   SEAL packets beginning with the initial segment (i.e., SEG=0) and
   followed by any non-initial segments 1 through N-1.  That is, for an
   N-segment mid-layer packet, reassembly entails the concatenation of
   the SEAL-encapsulated mid-layer packet segments with (F=1, M=1,
   SEAL_ID=j) in the first SEAL header, followed by (F=0, M=1, SEG=1,
   SEAL_ID=(j+1)) in the next SEAL header, followed by (F=0, M=1, SEG=2,
   SEAL_ID=(j+2)), etc., up to (F=0, M=0, SEG=(N-1), SEAL_ID=(j + N-1))
   in the final SEAL header.  (Note that modulo arithmetic based on the
   length of the SEAL_ID field is used).

   When the ETE determines that a mid-layer packet is too large to
   reassemble, it releases the reassembly queue resources and sends a
   Reassembly Report - Packet Too Big message back to the ITE with the
   S_MRU field set to the size of the ETE's reassembly buffer (see
   Section 4.4.5).

4.4.4.  Decapsulation and Delivery to Upper Layers

   Following SEAL-layer reassembly, the ETE verifies the trailing
   checksum of the mid-layer packet using the algorithm in Section
   4.3.3.  If the checksum is incorrect, the ETE discards the packet and
   sends a Reassembly Report - Checksum Incorrect message to the ITE
   (see Section 4.4.5).  If the reassembled mid-layer packet is larger
   than (S_MRU-OHLEN), the ETE discards the packet and sends a
   Reassembly Report - Packet Too Big message to the ITE (see Section
   4.4.5).

   Otherwise, the ETE discards the outer and mid-layer headers and
   trailers, and delivers the inner packet to the upper-layer protocol
   indicated in the SEAL Next Header field.  (If the reassembled packet
   was a NULL packet (see Section 4.3.5), the ETE instead silently
   discards the packet).

4.4.5.  Sending SEAL Control Messages

   The ETE generates SEAL control messages in response to certain SEAL
   packets.  SEAL control messages are formatted much the same as for
   ICMPv4 [RFC0792] and ICMPv6 [RFC4443] messages, and are used for very
   similar purposes.  The ETE prepares each control message as a UDP/IP
   packet as shown in Figure 3:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~               UDP/IP Headers (dport=SEAL_CPORT)               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            SEAL ID                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |             Data              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                         Message Body                          +
   |                                                               |
   +-+-+ ...

                Figure 3: SEAL Control Message Format

   The control message consists of outer UDP/IP headers followed by a
   32-bit SEAL_ID followed by a 32-bit control field followed by the
   message body.
   When the ETE/ITE prepares a control message, it sets the outer IP
   destination and source addresses of the message to the source and
   destination addresses (respectively) of the SEAL packet that
   triggered the message.  If the destination address in the packet was
   multicast, the ETE/ITE instead sets the outer IP source address to an
   address assigned to the underlying IP interface.  The ETE/ITE next
   sets the UDP destination port to 'SEAL_CPORT' and sets the UDP source
   port to a constant value of its choosing.  It then sets the SEAL_ID
   to the Identification value encoded in the SEAL packet that triggered
   the message.

   As for ICMPv4 and ICMPv6 messages, the SEAL control header includes
   an 8-bit Type field in bits 0 thru 7 and an 8-bit Code field in bits
   8 thru 15.  Unlike ICMPv4 and ICMPv6 messages, however, the control
   header does not include a checksum field (since the UDP header
   already contains a checksum) but instead includes a 16-bit Data field
   in bits 16 thru 31.  The ETE/ITE sets the Type, Code, Data and
   Message body fields according to the specific SEAL control message
   type, then sends the message.  The following types are currently
   defined; other values for Type will be recorded in the IANA registry
   for SEAL:

4.4.5.1.  Reassembly Report (Type=0)

   The ETE generates a Reassembly Report to inform the ITE of various
   conditions encountered during SEAL-layer reassembly.  The following
   values for Code are defined:

   o  Code = 0 : IP Fragmentation Experienced

   o  Code = 1 : Segment Acknowledged

   o  Code = 2 : Packet Too Big

   o  Code = 3 : Time Exceeded

   o  Code = 4 : Checksum Incorrect

   (Other values for Code will be recorded in the IANA registry for
   SEAL.)

   The ETE prepares the Reassembly Report according to the Code as
   follows:

4.4.5.1.1.  IP Fragmentation Experienced (Code=0)

   When the ETE receives an IP first-fragment of a SEAL packet that
   experienced outer IP fragmentation, it examines the SEAL header and
   examines the IP reassembly buffer to assess the likelihood that
   reassembly will complete.  If the 'A' bit is not set in the SEAL
   header, or if IP reassembly completion appears unlikely, the ETE uses
   the IP first-fragment to prepare a "Reassembly Report - IP
   Fragmentation Experienced" message with Type=0, Code=0, and Data=0.
   The message is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type=0     |    Code=0     |            Data=0             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        SEAL Header of Packet that Triggered the Report        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             S_MRU                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             S_MSS                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 4: IP Fragmentation Experienced Message Format

   The ETE unconditionally writes the size of its reassembly buffer (see
   Section 4.4.1) in the S_MRU field and writes the length of the first
   IP fragment in the S_MSS field.

   If the 'A' bit is set in the SEAL header and IP reassembly completion
   appears likely, the ETE should refrain from sending this message if
   possible and instead send a Segment Acknowledged message according to
   the next section.  (Note that it is not an error for the ETE to
   generate both the IP Fragmentation Experienced and Segment
   Acknowledged messages for the same SEAL packet; however, this may be
   inefficient in some instances.)

4.4.5.1.2.  Segment Acknowledged (Code=1)

   When the ETE receives a SEAL segment following IP reassembly that has
   the 'A' bit set in the SEAL header, it prepares a "Reassembly Report
   - Segment Acknowledged" message with Type=0, Code=1, and Data=0.  The
   message is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type=0     |    Code=1     |            Data=0             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        SEAL Header of Packet that Triggered the Report        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             S_MRU                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             S_MSS                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Bitmap                             |
   +-+-+ ...

             Figure 5: Segment Acknowledged Message Format

   The ETE unconditionally writes the size of its reassembly buffer in
   the S_MRU field.  If the Segment arrived as multiple IP fragments,
   the ETE also writes the length of the IP first-fragment in the S_MSS
   field; otherwise, it writes the value 0 in that field.

   The ETE next includes a Bitmap recording this and all preceding
   segments of the same SEAL packet that are already in its SEAL
   reassembly buffer.  For each segment, the Bitmap records the value 1
   if the segment is present and 0 if the segment is absent.  The most
   significant bit of the Bitmap corresponds to segment 0, the next-most
   significant bit corresponds to segment 1, etc., and the least
   significant bit corresponds to the final segment.  For example, when
   SEG=6 in the SEAL header, the bit map '0110011' means that segments
   1, 2, 5 and 6 were received but segments 0, 3 and 4 were missing from
   the ETE's reassembly buffer.  The Bitmap must include at least
   (SEG+1)-many bits, where SEG is at most 255.

4.4.5.1.3.  Packet Too Big (Code=2)

   The ETE generates a "Reassembly Report - Packet Too Big" message when
   it discards a SEAL packet that is too large for it to receive.  The
   ETE sets Type=0, Code=2, and Data=0.  The ETE then writes the SEAL
   header of segment 0 of the packet that generated the error into the
   first four bytes of the message body, then writes the size of its
   reassembly buffer in the S_MRU field.  The message is formatted as
   follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type=0     |    Code=2     |            Data=0             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    SEAL Header of First SEAL Segment of the Too-big Packet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             S_MRU                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 6: Packet Too Big Message Format

4.4.5.1.4.  Time Exceeded (Code=3)

   The ETE generates a "Reassembly Report - Time Exceeded" message when
   it discards an incomplete SEAL reassembly buffer due to a reassembly
   timeout.  The ETE sets Type=0, Code=3, and sets Data to the time in
   seconds from when the initial SEAL segment arrived until the
   reassembly time expired.  The ETE finally writes the SEAL header of
   segment 0 of the packet that generated the error into the first four
   bytes of the message body.  If segment 0 is unavailable, the ETE
   instead writes the SEAL header of the first available segment.
   The message is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type=0     |    Code=3     |    Data=(Time, in Seconds)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  SEAL Header of First SEAL Segment in the Reassembly Buffer   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 7: Time Exceeded Message Format

4.4.5.1.5.  Checksum Incorrect (Code=4)

   The ETE generates a "Reassembly Report - Checksum Incorrect" message
   when it reassembles a SEAL packet with an invalid checksum.  The ETE
   sets Type=0, Code=4 and Data=0.  The ETE finally writes the SEAL
   header of the first segment of the packet into the first four bytes
   of the message body.  The message is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type=0     |    Code=4     |            Data=0             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               SEAL Header of First SEAL Segment               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 8: Reassembly Checksum Failure Message Format

4.4.5.2.  Parameter Problem (Type=1)

   The ETE generates a Parameter Problem message when it receives a SEAL
   packet with an invalid value in one of the SEAL header fields.  The
   ETE sets Type=1 and Code=0, then sets Data to the bit number of the
   SEAL header field that triggered the error (e.g., when Data=8, the
   parameter problem is specific to the NEXTHDR/SEG field).  The ETE
   finally writes the SEAL header of the packet that generated the error
   into the first four bytes of the message body.
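
   As an illustration of the control message layout, the following
   Python sketch packs the 32-bit SEAL_ID, the 8-bit Type and Code
   fields and the 16-bit Data field, then appends the message body.
   The function names are hypothetical, and network byte order is an
   assumption of this example; the outer UDP/IP headers are left to the
   sending socket.

```python
import struct

PARAMETER_PROBLEM = 1  # Type value from Section 4.4.5.2


def build_control_message(seal_id: int, msg_type: int, code: int,
                          data: int, body: bytes = b"") -> bytes:
    """Pack SEAL_ID (32 bits), Type (8), Code (8), Data (16), then body.

    Network byte order ("!") is assumed here.
    """
    return struct.pack("!IBBH", seal_id, msg_type, code, data) + body


def build_parameter_problem(seal_id: int, bit_number: int,
                            seal_header: bytes) -> bytes:
    """Type=1, Code=0, Data=bit number, body=offending SEAL header."""
    if len(seal_header) != 4:
        raise ValueError("expected a 4-byte SEAL header")
    return build_control_message(seal_id, PARAMETER_PROBLEM, 0,
                                 bit_number, seal_header)
```

   The same helper could carry a Reassembly Report by passing Type=0
   and the appropriate Code, with the S_MRU, S_MSS and Bitmap fields
   placed in the body after the SEAL header.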

   The Parameter Problem message is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type=1     |    Code=0     |    Data=SEAL Header bit #     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  SEAL Header of Packet that Triggered the Parameter Problem   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 9: Parameter Problem Message Format

   Other values for Code will be recorded in the IANA registry for SEAL.

4.4.5.3.  Information Request (Type=2)

   The ETE generates an Information Request message when it receives a
   SEAL packet with I=1 in the SEAL header.  The ETE sets Type=2 and
   sets Code/Data to values that are specific to the associated
   tunneling protocol (for example, the LISP protocol can use the
   Information Request message to request mapping updates).  The message
   body further contains opaque data that is interpreted according to
   the Code/Data values.

   When Code=0, both Data and the Opaque Data are discarded upon receipt
   by the ITE.  The Information Request message is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type=2     |    Code=0     |            Data=X             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Opaque Data                          |
   +-+-+-+ ...

             Figure 10: Information Request Message Format

   Other values for Code will be recorded in the IANA registry for SEAL.

   After the ETE sends an Information Request message, it must retry
   until it receives a corresponding Information Reply (see Section
   4.4.5.4).  Note that while in this loop the ETE may receive further
   SEAL packets with I=1.
   In that case, the ETE should begin sending Information Requests
   specific to the SEAL_ID of the new packet and should not dwell on the
   old SEAL_ID.

4.4.5.4.  Information Reply (Type=3)

   When the ETE sends an Information Request message to the ITE, the ITE
   responds by sending an Information Reply message back to the ETE with
   the IP source and destination addresses set to the destination and
   source addresses (respectively) of the Information Request, the UDP
   destination port set to the UDP source port of the request, and the
   SEAL_ID set to the same value that was present in the Information
   Request message.

   The ITE sets Type=3 and sets Code/Data to values that are specific to
   the associated tunneling protocol (for example, the LISP protocol can
   use the Information Reply message to encode mapping updates).  The
   message body further contains opaque data that is interpreted
   according to the Code/Data values.

   When Code=0, both Data and the Opaque Data are discarded upon receipt
   by the ETE.  The Information Reply message is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type=3     |    Code=0     |            Data=X             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Opaque Data                          |
   +-+-+-+ ...

              Figure 11: Information Reply Message Format

   Other values for Code will be recorded in the IANA registry for SEAL.

5.  SEAL Protocol Specification (Version 1)

   This section specifies a minimal version of SEAL known as "SEAL
   Version 1", or "SEAL-LITE".  SEAL-LITE observes the same protocol
   specifications as for SEAL-MAX (see Section 4) with the exception
   that the ITE/ETE do not perform segmentation and reassembly.  In
   particular, the ETE unilaterally drops any SEAL-LITE packets that
   arrive as multiple IP fragments and/or multiple SEAL segments.
1273 SEAL-LITE can be considered for use by associated tunneling protocol 1274 specifications when it is highly unlikely that "marginal" links will 1275 occur in any path, e.g., when it is known that the vast majority of 1276 links configure MTUs that are appreciably larger than 1500 bytes. 1277 SEAL-LITE can also be used in instances when it is acceptable for the 1278 ITE to return ICMP PTB messages for packet sizes smaller than 1500 1279 bytes. Finally, the use of SEAL-LITE requires that the associated 1280 tunneling protocol specification either defines a next header field 1281 or ensures that the data immediately following the SEAL header is an 1282 IP header (i.e., either IPv4 or IPv6). The use of SEAL-LITE must 1283 therefore be carefully examined in relation to the particular use 1284 case. 1286 With respect to Section 4, the SEAL-LITE protocol corresponds to 1287 SEAL-MAX as follows: 1289 5.1. Model of Operation 1291 SEAL-LITE follows the same model of operation as for SEAL-MAX as 1292 described in Section 4.1 except as noted in the following sections. 1294 5.2. SEAL Header Format (Version 1) 1296 The SEAL-LITE header is formatted as follows: 1298 0 1 2 3 1299 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1300 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1301 |VER|A|I| Identification | 1302 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1304 Figure 12: SEAL Version 1 Header Format 1306 where the header fields are defined as: 1308 VER (2) 1309 a 2-bit value that encodes the SEAL protocol version number. This 1310 section describes Version 1 of the SEAL protocol, i.e., the VER 1311 field encodes the value '01'. 1313 A (1) 1314 the "Acknowledgement Requested" bit. Set to 1 if the ITE wishes 1315 to receive an explicit acknowledgement from the ETE. 1317 I (1) 1318 the "Information Request Solicit" bit. Set to 1 if the ITE wishes 1319 the ETE to initiate an Information Request.
1321 Identification (28) 1322 a 28-bit identification field. 1324 5.3. ITE Specification 1326 5.3.1. Tunnel Interface MTU 1328 SEAL-LITE observes the SEAL-MAX specification found in Section 4.3.1. 1330 5.3.2. Admitting Packets into the Tunnel Interface 1332 SEAL-LITE observes the SEAL-MAX specification found in Section 4.3.2. 1334 5.3.3. Segmentation 1336 SEAL-LITE observes the SEAL-MAX specification found in Section 4.3.3, 1337 except that the inner fragmentation algorithm is adjusted to avoid 1338 all fragmentation and/or segmentation either within or beyond the 1339 tunnel as follows: 1341 o if the inner packet is an IPv6 packet or an IPv4 packet with DF=1, 1342 and the packet is larger than (MIN(S_MRU, S_MSS) - HLEN), the ITE 1343 drops the packet and sends an ICMP PTB message to the original 1344 source with an MTU value of (MIN(S_MRU, S_MSS) - HLEN) the same as 1345 described in Section 4.3.2; else, 1347 o if the inner packet is an IPv4 packet with DF=0, and the packet is 1348 larger than (S_CSS - HLEN), the ITE uses inner IPv4 fragmentation 1349 to break the packet into fragments no larger than (S_CSS - HLEN); 1350 else, 1352 o the ITE processes the packet without inner fragmentation. 1354 5.3.4. Encapsulation 1356 SEAL-LITE observes the SEAL-MAX specification found in Section 4.3.4, 1357 except that it uses the header format defined in this section and 1358 with the VER field set to '01'. In particular, SEAL-LITE uses the A 1359 and I bits the same as specified for SEAL-MAX. 1361 5.3.5. Probing Strategy 1363 SEAL-LITE observes the SEAL-MAX specification found in Section 4.3.5. 1365 5.3.6. Packet Identification 1367 SEAL-LITE observes the SEAL-MAX soft state specifications found in 1368 Section 4.3.6, but the SEAL_ID is treated as a 28-bit value that is 1369 written into the Identification field in the SEAL header. 
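The header layout of Section 5.2, the size checks of Section 5.3.3, and the 28-bit Identification handling of Section 5.3.6 can be sketched together as follows. This is an illustrative sketch, not a normative implementation: all function and constant names are hypothetical, the header is assumed to be one 32-bit word in network byte order with VER in the most-significant bits (as Figure 12 suggests), and the counter is assumed to wrap at 2^28 because the field is 28 bits wide.

```python
VER_SEAL_LITE = 0b01          # VER field value for SEAL Version 1
ID_BITS = 28                  # width of the Identification field
ID_MASK = (1 << ID_BITS) - 1

def pack_header(a: bool, i: bool, ident: int) -> bytes:
    """Build the 4-byte SEAL-LITE header: VER(2) | A(1) | I(1) | ID(28)."""
    if not 0 <= ident <= ID_MASK:
        raise ValueError("Identification must fit in 28 bits")
    word = (VER_SEAL_LITE << 30) | (int(a) << 29) | (int(i) << 28) | ident
    return word.to_bytes(4, "big")

def unpack_header(data: bytes):
    """Return (ver, a, i, ident) parsed from the first 4 bytes."""
    if len(data) < 4:
        raise ValueError("SEAL-LITE header is 4 bytes")
    word = int.from_bytes(data[:4], "big")
    return (word >> 30) & 0x3, bool((word >> 29) & 1), bool((word >> 28) & 1), word & ID_MASK

def next_ident(ident: int) -> int:
    """Advance the Identification for the next packet (assumed wrap at 2^28)."""
    return (ident + 1) & ID_MASK

def segmentation_action(size, inner_is_ipv6, df, s_mru, s_mss, s_css, hlen):
    """Apply the three-way rule of Section 5.3.3 to an inner packet.

    Returns ("ptb", mtu) when the packet is dropped and an ICMP PTB with
    that MTU is sent, ("fragment", max_size) when inner IPv4 fragmentation
    is used, or ("forward", None) when no inner fragmentation is needed.
    """
    if inner_is_ipv6 or df:
        limit = min(s_mru, s_mss) - hlen
        if size > limit:
            return ("ptb", limit)
        return ("forward", None)
    if size > s_css - hlen:
        return ("fragment", s_css - hlen)
    return ("forward", None)
```

For example, under these assumptions a header with A=1, I=0 and Identification 5 packs to the bytes 60 00 00 05, and an unfragmentable 1500-byte inner packet with S_MRU=2048, S_MSS=1400 and HLEN=40 yields a PTB with MTU 1360.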
1371 As for the SEAL-MAX specification in Section 4.3.6, SEAL-LITE 1372 increments the Identification field (modulo 2^28) for each consecutive 1373 SEAL packet. 1375 5.3.7. Sending SEAL Protocol Packets 1377 SEAL-LITE observes the SEAL-MAX specification found in Section 4.3.7. 1379 5.3.8. Processing Raw ICMP Messages 1381 SEAL-LITE observes the SEAL-MAX specification found in Section 4.3.8. 1383 5.3.9. Processing SEAL Control Messages 1385 SEAL-LITE observes the SEAL-MAX specification found in Section 4.3.9. 1387 5.4. ETE Specification 1389 5.4.1. Reassembly Buffer Requirements 1391 SEAL-LITE does not maintain a reassembly buffer for SEAL reassembly. 1393 5.4.2. IP-Layer Reassembly 1395 SEAL-LITE uses SEAL-protocol IP first-fragments solely for the 1396 purpose of generating SEAL Reassembly Reports as specified in Section 1397 4.4.2, but thereafter discards all SEAL-protocol IP fragments. 1399 5.4.3. SEAL-Layer Reassembly 1401 SEAL-LITE does not observe the SEAL-MAX reassembly procedures in 1402 Section 4.4.3; instead, the SEAL-LITE ETE discards all SEAL packets 1403 with F=0 following IP-layer reassembly, and may also return 1404 Reassembly Report - Packet Too Big messages when a packet that is too 1405 large to receive is discarded. 1407 As for SEAL-MAX, SEAL-LITE returns a Parameter Problem for SEAL 1408 packets with unrecognized values in the SEAL header. 1410 5.4.4. Decapsulation and Delivery to Upper Layers 1412 SEAL-LITE observes the SEAL-MAX specification found in Section 4.4.4. 1414 5.4.5. Sending SEAL Control Messages 1416 SEAL-LITE observes the SEAL-MAX specification found in Section 4.4.5. 1418 6. Link Requirements 1420 Subnetwork designers are expected to follow the recommendations in 1421 Section 2 of [RFC3819] when configuring link MTUs. 1423 7.
End System Requirements 1425 SEAL provides robust mechanisms for returning PTB messages; however, 1426 end systems that send unfragmentable IP packets larger than 1500 1427 bytes are strongly encouraged to use Packetization Layer Path MTU 1428 Discovery per [RFC4821]. 1430 8. Router Requirements 1432 IPv4 routers within the subnetwork are strongly encouraged to 1433 implement IPv4 fragmentation such that the first-fragment is the 1434 largest and approximately the size of the underlying link MTU, i.e., 1435 they should avoid generating runt first-fragments. 1437 9. IANA Considerations 1439 The IANA is instructed to allocate an IP protocol number for 1440 'SEAL_PROTO' in the 'protocol-numbers' registry. 1442 The IANA is instructed to allocate a Well-Known Port number for both 1443 'SEAL_CPORT' and 'SEAL_DPORT' in the 'port-numbers' registry. 1445 The IANA is instructed to establish a "SEAL Control Protocol" 1446 registry to record SEAL control message Code and Type values. This 1447 registry should be initialized to include the Code and Type values 1448 defined in Section 4.4.5. 1450 10. Security Considerations 1452 Unlike IPv4 fragmentation, overlapping fragment attacks are not 1453 possible due to the requirement that SEAL segments be non- 1454 overlapping. 1456 An amplification/reflection attack is possible when an attacker sends 1457 IP first-fragments with spoofed source addresses to an ETE, resulting 1458 in a stream of Reassembly Report messages returned to a victim ITE. 1459 The SEAL_ID in the encapsulated segment of the spoofed IP first- 1460 fragment provides mitigation for the ITE to detect and discard 1461 spurious Reassembly Reports. 1463 The SEAL header is sent in-the-clear (outside of any IPsec/ESP 1464 encapsulations) the same as for the outer */IPv4 headers. As for 1465 IPv6 extension headers, the SEAL header is protected only by L2 1466 integrity checks and is not covered under any L3 integrity checks. 1468 11. 
Related Work 1470 Section 3.1.7 of [RFC2764] provides a high-level sketch for 1471 supporting large tunnel MTUs via a tunnel-level segmentation and 1472 reassembly capability to avoid IP level fragmentation, which is in 1473 part the same approach used by tunnel-mode SEAL. SEAL could 1474 therefore be considered as a fully functional manifestation of the 1475 method postulated by that informational reference. 1477 Section 3 of [RFC4459] describes inner and outer fragmentation at the 1478 tunnel endpoints as alternatives for accommodating the tunnel MTU; 1479 however, the SEAL protocol specifies a mid-layer segmentation and 1480 reassembly capability that is distinct from both inner and outer 1481 fragmentation. 1483 Section 4 of [RFC2460] specifies a method for inserting and 1484 processing extension headers between the base IPv6 header and 1485 transport layer protocol data. The SEAL header is inserted and 1486 processed in exactly the same manner. 1488 The concepts of path MTU determination through the report of 1489 fragmentation and extending the IP Identification field were first 1490 proposed in deliberations of the TCP-IP mailing list and the Path MTU 1491 Discovery Working Group (MTUDWG) during the late 1980's and early 1492 1990's. SEAL supports a report fragmentation capability using bits 1493 in an extension header (the original proposal used a spare bit in the 1494 IP header) and supports ID extension through a 16-bit field in an 1495 extension header (the original proposal used a new IP option). A 1496 historical analysis of the evolution of these concepts, as well as 1497 the development of the eventual path MTU discovery mechanism for IP, 1498 appears in Appendix C of this document. 1500 12. SEAL Advantages over Classical Methods 1502 The SEAL approach offers a number of distinct advantages over the 1503 classical path MTU discovery methods [RFC1191] [RFC1981]: 1505 1.
Classical path MTU discovery *always* results in packet loss when 1506 an MTU restriction is encountered. Using SEAL, IP fragmentation 1507 provides a short-term interim mechanism for ensuring that packets 1508 are delivered while SEAL adjusts its packet sizing parameters. 1510 2. Classical path MTU discovery requires that routers generate an 1511 ICMP PTB message for *all* packets lost due to an MTU 1512 restriction; this situation is exacerbated at high data rates and 1513 becomes severe for in-the-network tunnels that service many 1514 communicating end systems. Since SEAL ensures that packets no 1515 larger than S_MRU are delivered, however, it is sufficient for 1516 the ETE to return ICMP PTB messages subject to rate limiting and 1517 not for every packet-in-error. 1519 3. Classical path MTU discovery may require several iterations of dropping 1520 packets and returning ICMP PTB messages until an acceptable path 1521 MTU value is determined. Under normal circumstances, SEAL 1522 determines the correct packet sizing parameters in a single 1523 iteration. 1525 4. Using SEAL, ordinary packets serve as implicit probes without 1526 exposing data to unnecessary loss. SEAL also provides an 1527 explicit probing mode not available in the classical methods. 1529 5. Using SEAL, ETEs encapsulate ICMP error messages in an outer 1530 UDP/IP header such that packet-filtering network middleboxes will 1531 not filter them as they would "raw" ICMP messages that may be 1532 generated by an attacker. 1534 6. Most importantly, all SEAL packets have a 32-bit Identification 1535 value that can be used for duplicate packet detection purposes 1536 and to match ICMP error messages with actual packets sent without 1537 requiring per-packet state; hence, certain denial-of-service 1538 attack vectors open to the classical methods are eliminated.
1540 In summary, the SEAL approach represents an architecturally superior 1541 method for ensuring that packets of various sizes are either 1542 delivered or deterministically dropped. When end systems use their 1543 own end-to-end MTU determination mechanisms [RFC4821], the SEAL 1544 advantages are further enhanced. 1546 13. Acknowledgments 1548 The following individuals are acknowledged for helpful comments and 1549 suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Oliver 1550 Bonaventure, Teco Boot, Bob Braden, Brian Carpenter, Steve Casner, 1551 Ian Chakeres, Noel Chiappa, Remi Denis-Courmont, Aurnaud Ebalard, 1552 Gorry Fairhurst, Dino Farinacci, Joel Halpern, Sam Hartman, John 1553 Heffner, Thomas Henderson, Bob Hinden, Christian Huitema, Darrel 1554 Lewis, Joe Macker, Matt Mathis, Erik Nordmark, Dan Romascanu, Dave 1555 Thaler, Joe Touch, Pascal Thubert, Margaret Wasserman, Magnus 1556 Westerlund, Robin Whittle, James Woodyatt, and members of the Boeing 1557 Research & Technology NST DC&NT group. 1559 Path MTU determination through the report of fragmentation was first 1560 proposed by Charles Lynn on the TCP-IP mailing list in 1987. 1561 Extending the IP identification field was first proposed by Steve 1562 Deering on the MTUDWG mailing list in 1989. 1564 14. References 1566 14.1. Normative References 1568 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1569 September 1981. 1571 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, 1572 RFC 792, September 1981. 1574 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1575 Requirement Levels", BCP 14, RFC 2119, March 1997. 1577 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1578 (IPv6) Specification", RFC 2460, December 1998. 1580 [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control 1581 Message Protocol (ICMPv6) for the Internet Protocol 1582 Version 6 (IPv6) Specification", RFC 4443, March 2006. 1584 14.2. 
Informative References 1586 [FOLK] Shannon, C., Moore, D., and k. claffy, "Beyond Folklore: Observations on 1587 Fragmented Traffic", December 2002. 1589 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 1590 October 1987. 1592 [I-D.ietf-lisp] 1593 Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, 1594 "Locator/ID Separation Protocol (LISP)", 1595 draft-ietf-lisp-01 (work in progress), May 2009. 1597 [I-D.russert-rangers] 1598 Russert, S., Fleischman, E., and F. Templin, "RANGER 1599 Scenarios", draft-russert-rangers-00 (work in progress), 1600 May 2009. 1602 [I-D.templin-autoconf-dhcp] 1603 Templin, F., "Virtual Enterprise Traversal (VET)", 1604 draft-templin-autoconf-dhcp-38 (work in progress), 1605 April 2009. 1607 [I-D.templin-ranger] 1608 Templin, F., "Routing and Addressing in Next-Generation 1609 EnteRprises (RANGER)", draft-templin-ranger-07 (work in 1610 progress), February 2009. 1612 [MTUDWG] "IETF MTU Discovery Working Group mailing list, 1613 gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log, November 1614 1989 - February 1995.". 1616 [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP 1617 MTU discovery options", RFC 1063, July 1988. 1619 [RFC1146] Zweig, J. and C. Partridge, "TCP alternate checksum 1620 options", RFC 1146, March 1990. 1622 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 1623 November 1990. 1625 [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery 1626 for IP version 6", RFC 1981, August 1996. 1628 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, 1629 October 1996. 1631 [RFC2004] Perkins, C., "Minimal Encapsulation within IP", RFC 2004, 1632 October 1996. 1634 [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in 1635 IPv6 Specification", RFC 2473, December 1998. 1637 [RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A. 1638 Malis, "A Framework for IP Based Virtual Private 1639 Networks", RFC 2764, February 2000.
1641 [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", 1642 RFC 2923, September 2000. 1644 [RFC3366] Fairhurst, G. and L. Wood, "Advice to link designers on 1645 link Automatic Repeat reQuest (ARQ)", BCP 62, RFC 3366, 1646 August 2002. 1648 [RFC3692] Narten, T., "Assigning Experimental and Testing Numbers 1649 Considered Useful", BCP 82, RFC 3692, January 2004. 1651 [RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D., 1652 Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. 1653 Wood, "Advice for Internet Subnetwork Designers", BCP 89, 1654 RFC 3819, July 2004. 1656 [RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms 1657 for IPv6 Hosts and Routers", RFC 4213, October 2005. 1659 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 1660 Internet Protocol", RFC 4301, December 2005. 1662 [RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through 1663 Network Address Translations (NATs)", RFC 4380, 1664 February 2006. 1666 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1667 Network Tunneling", RFC 4459, April 2006. 1669 [RFC4727] Fenner, B., "Experimental Values In IPv4, IPv6, ICMPv4, 1670 ICMPv6, UDP, and TCP Headers", RFC 4727, November 2006. 1672 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 1673 Discovery", RFC 4821, March 2007. 1675 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 1676 Errors at High Data Rates", RFC 4963, July 2007. 1678 [TCP-IP] "Archive/Hypermail of Early TCP-IP Mail List, 1679 http://www-mice.cs.ucl.ac.uk/multimedia/misc/tcp_ip/, May 1680 1987 - May 1990.". 1682 Appendix A. Reliability 1684 Automatic Repeat-ReQuest (ARQ) mechanisms are used to ensure reliable 1685 delivery between the endpoints of links [RFC3366] (e.g., on-link 1686 neighbors in an IEEE 802.11 network) as well as between the endpoints 1687 of an end-to-end transport (e.g., the endpoints of a TCP connection). 
1688 Implementations of this specification can use the Bitmap feature of 1689 Reassembly Report control messages as a hint of segments for 1690 retransmission, however this may not be ideally suitable for all SEAL 1691 use cases since retransmission of lost segments may require 1692 considerable state maintenance at the ITE and may also result in 1693 considerable delay variance and packet reordering within the 1694 subnetwork. 1696 Alternate reliability mechanisms such as Forward Error Correction 1697 (FEC) may also be used for the purpose of improved reliability. Such 1698 mechanisms may entail the ITE performing proactive transmissions of 1699 redundant data, e.g., by sending multiple copies of the same data. 1700 Signaling from the ETE may also be considered as a means for the ETE 1701 to dynamically inform the ITE of changing FEC conditions. 1703 Future studies should examine the use of ARQ and FEC mechanisms for 1704 improved reliability in the face of loss due to congestion, signal 1705 intermittence, etc. 1707 Appendix B. Transport Mode 1709 SEAL can also be used in "transport-mode", e.g., when the inner layer 1710 includes upper-layer protocol data rather than an encapsulated IP 1711 packet. For instance, TCP peers can negotiate the use of SEAL for 1712 the carriage of protocol data encapsulated as TCP/SEAL/IPv4. In this 1713 sense, the "subnetwork" becomes the entire end-to-end path between 1714 the TCP peers and may potentially span the entire Internet. 1716 Sections 4 and 5 specify the operation of SEAL in "tunnel mode", 1717 i.e., when there are both an inner and outer IP layer with a SEAL 1718 encapsulation layer between. However, the SEAL protocol can also be 1719 used in a "transport mode" of operation within a subnetwork region in 1720 which the inner-layer corresponds to a transport layer protocol 1721 (e.g., UDP, TCP, etc.) instead of an inner IP layer. 
1723 For example, two TCP endpoints connected to the same subnetwork 1724 region can negotiate the use of transport-mode SEAL for a connection 1725 by inserting a 'SEAL_OPTION' TCP option during the connection 1726 establishment phase. If both TCPs agree on the use of SEAL, their 1727 protocol messages will be carried as TCP/SEAL/IPv4 and the connection 1728 will be serviced by the SEAL protocol using TCP (instead of an 1729 encapsulating tunnel endpoint) as the transport layer protocol. The 1730 SEAL protocol for transport mode otherwise observes the same 1731 specifications as for Sections 4 and 5. 1733 Appendix C. Historic Evolution of PMTUD 1735 (Taken from "Neighbor Affiliation Protocol for IPv6-over-(foo)-over- 1736 IPv4"; written 10/30/2002): 1738 The topic of Path MTU discovery (PMTUD) saw a flurry of discussion 1739 and numerous proposals in the late 1980's through early 1990. The 1740 initial problem was posed by Art Berggreen on May 22, 1987 in a 1741 message to the TCP-IP discussion group [TCP-IP]. The discussion that 1742 followed provided significant reference material for [FRAG]. An IETF 1743 Path MTU Discovery Working Group [MTUDWG] was formed in late 1989 1744 with charter to produce an RFC. Several variations on a very few 1745 basic proposals were entertained, including: 1747 1. Routers record the PMTUD estimate in ICMP-like path probe 1748 messages (proposed in [FRAG] and later [RFC1063]) 1750 2. The destination reports any fragmentation that occurs for packets 1751 received with the "RF" (Report Fragmentation) bit set (Steve 1752 Deering's 1989 adaptation of Charles Lynn's Nov. 1987 proposal) 1754 3. A hybrid combination of 1) and Charles Lynn's Nov. 1987 proposal (straw 1755 RFC draft by McCloghrie, Fox and Mogul on Jan 12, 1990) 1757 4. Combination of the Lynn proposal with TCP (Fred Bohle, Jan 30, 1758 1990) 1760 5.
Fragmentation avoidance by setting "IP_DF" flag on all packets 1761 and retransmitting if ICMPv4 "fragmentation needed" messages 1762 occur (Geof Cooper's 1987 proposal; later adapted into [RFC1191] 1763 by Mogul and Deering). 1765 Option 1) seemed attractive to the group at the time, since it was 1766 believed that routers would migrate more quickly than hosts. Option 1767 2) was a strong contender, but repeated attempts to secure an "RF" 1768 bit in the IPv4 header from the IESG failed and the proponents became 1769 discouraged. Option 3) was abandoned because it was perceived as too 1770 complicated, and 4) never received any apparent serious 1771 consideration. Proposal 5) was a late entry into the discussion from 1772 Steve Deering on Feb. 24th, 1990. The discussion group soon 1773 thereafter seemingly lost track of all other proposals and adopted 1774 5), which eventually evolved into [RFC1191] and later [RFC1981]. 1776 In retrospect, the "RF" bit postulated in 2) is not needed if a 1777 "contract" is first established between the peers, as in proposal 4) 1778 and a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU on 1779 Feb 19, 1990. These proposals saw little discussion or rebuttal, and 1780 were dismissed based on the following assertions: 1782 o routers upgrade their software faster than hosts 1784 o PCs could not reassemble fragmented packets 1786 o Proteon and Wellfleet routers did not reproduce the "RF" bit 1787 properly in fragmented packets 1789 o Ethernet-FDDI bridges would need to perform fragmentation (i.e., 1790 "translucent" not "transparent" bridging) 1792 o the 16-bit IP_ID field could wrap around and disrupt reassembly at 1793 high packet arrival rates 1795 The first four assertions, although perhaps valid at the time, have 1796 been overcome by historical events. The final assertion is addressed 1797 by the mechanisms specified in SEAL. 1799 Author's Address 1801 Fred L. Templin (editor) 1802 Boeing Research & Technology 1803 P.O.
Box 3707 1804 Seattle, WA 98124 1805 USA 1807 Email: fltemplin@acm.org