idnits 2.17.1

draft-templin-intarea-seal-24.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 29, 2010) is 4890 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC3971' is defined on line 1680, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4987' is defined on line 1792, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-intarea-ipv4-id-update-01

  == Outdated reference: A later version (-40) exists of
     draft-templin-intarea-vet-16

  == Outdated reference: A later version (-17) exists of
     draft-templin-iron-13

  -- Obsolete informational reference (is this intentional?): RFC 1063
     (Obsoleted by RFC 1191)

  -- Obsolete informational reference (is this intentional?): RFC 1981
     (Obsoleted by RFC 8201)

     Summary: 1 error (**), 0 flaws (~~), 6 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  Network Working Group                                    F. Templin, Ed.
3  Internet-Draft                              Boeing Research & Technology
4  Intended status: Standards Track                       November 29, 2010
5  Expires: June 2, 2011

7        The Subnetwork Encapsulation and Adaptation Layer (SEAL)
8                    draft-templin-intarea-seal-24.txt

10  Abstract

12  For the purpose of this document, a subnetwork is defined as a
13  virtual topology configured over a connected IP network routing
14  region and bounded by encapsulating border nodes. These virtual
15  topologies are manifested by tunnels that may span multiple IP and/or
16  sub-IP layer forwarding hops, and can introduce failure modes due to
17  packet duplication and/or links with diverse Maximum Transmission
18  Units (MTUs).
This document specifies a Subnetwork Encapsulation and
19  Adaptation Layer (SEAL) that accommodates such virtual topologies
20  over diverse underlying link technologies.

22  Status of this Memo

24  This Internet-Draft is submitted in full conformance with the
25  provisions of BCP 78 and BCP 79.

27  Internet-Drafts are working documents of the Internet Engineering
28  Task Force (IETF). Note that other groups may also distribute
29  working documents as Internet-Drafts. The list of current Internet-
30  Drafts is at http://datatracker.ietf.org/drafts/current/.

32  Internet-Drafts are draft documents valid for a maximum of six months
33  and may be updated, replaced, or obsoleted by other documents at any
34  time. It is inappropriate to use Internet-Drafts as reference
35  material or to cite them other than as "work in progress."

37  This Internet-Draft will expire on June 2, 2011.

39  Copyright Notice

41  Copyright (c) 2010 IETF Trust and the persons identified as the
42  document authors. All rights reserved.

44  This document is subject to BCP 78 and the IETF Trust's Legal
45  Provisions Relating to IETF Documents
46  (http://trustee.ietf.org/license-info) in effect on the date of
47  publication of this document. Please review these documents
48  carefully, as they describe your rights and restrictions with respect
49  to this document. Code Components extracted from this document must
50  include Simplified BSD License text as described in Section 4.e of
51  the Trust Legal Provisions and are provided without warranty as
52  described in the Simplified BSD License.

54  Table of Contents

56  1. Introduction
57    1.1. Motivation
58    1.2. Approach
59  2. Terminology and Requirements
60  3. Applicability Statement
61  4. SEAL Protocol Specification
62    4.1. VET Interface Model
63    4.2. SEAL Model of Operation
64    4.3. SEAL Header Format
65    4.4. ITE Specification
66      4.4.1. Tunnel Interface MTU
67      4.4.2. Tunnel Interface Soft State
68      4.4.3. Admitting Packets into the Tunnel
69      4.4.4. Mid-Layer Encapsulation
70      4.4.5. SEAL Segmentation
71      4.4.6. SEAL Encapsulation
72      4.4.7. Outer Encapsulation
73      4.4.8. Sending SEAL Protocol Packets
74      4.4.9. Probing Strategy
75      4.4.10. Processing ICMP Messages
76      4.4.11. Black Hole Detection
77    4.5. ETE Specification
78      4.5.1. Reassembly Buffer Requirements
79      4.5.2. Tunnel Interface Soft State
80      4.5.3. IP-Layer Reassembly
81      4.5.4. SEAL-Layer Reassembly
82      4.5.5. Decapsulation and Delivery to Upper Layers
83    4.6. The SEAL Control Message Protocol (SCMP)
84      4.6.1. Generating SCMP Messages
85      4.6.2. Processing SCMP Messages
86    4.7. Tunnel Endpoint Synchronization
87  5. Link Requirements
88  6. End System Requirements
89  7. Router Requirements
90  8. IANA Considerations
91  9. Security Considerations
92  10. Related Work
93  11. SEAL Advantages over Classical Methods
94  12. Acknowledgments
95  13. References
96    13.1. Normative References
97    13.2. Informative References
98  Appendix A. Reliability
99  Appendix B. Integrity
100  Appendix C. Transport Mode
101  Appendix D. Historic Evolution of PMTUD
102  Author's Address

104  1. Introduction

106  As Internet technology and communication has grown and matured, many
107  techniques have developed that use virtual topologies (including
108  tunnels of one form or another) over an actual network that supports
109  the Internet Protocol (IP) [RFC0791][RFC2460]. Those virtual
110  topologies have elements that appear as one hop in the virtual
111  topology, but are actually multiple IP or sub-IP layer hops. These
112  multiple hops often have quite diverse properties that are often not
113  even visible to the endpoints of the virtual hop. This introduces
114  failure modes that are not dealt with well in current approaches.

116  The use of IP encapsulation (also known as "tunneling") has long been
117  considered as the means for creating such virtual topologies.
118  However, the insertion of an outer IP header reduces the effective
119  path MTU visible to the inner network layer.
When IPv4 is used, this 120 reduced MTU can be accommodated through the use of IPv4 121 fragmentation, but unmitigated in-the-network fragmentation has been 122 found to be harmful through operational experience and studies 123 conducted over the course of many years [FRAG][FOLK][RFC4963]. 124 Additionally, classical path MTU discovery [RFC1191] has known 125 operational issues that are exacerbated by in-the-network tunnels 126 [RFC2923][RFC4459]. The following subsections present further 127 details on the motivation and approach for addressing these issues. 129 1.1. Motivation 131 Before discussing the approach, it is necessary to first understand 132 the problems. In both the Internet and private-use networks today, 133 IPv4 is ubiquitously deployed as the Layer 3 protocol. The two 134 primary functions of IPv4 are to provide for 1) addressing, and 2) a 135 fragmentation and reassembly capability used to accommodate links 136 with diverse MTUs. While it is well known that the IPv4 address 137 space is rapidly becoming depleted, there is a lesser-known but 138 growing consensus that other IPv4 protocol limitations have already 139 or may soon become problematic. 141 First, the IPv4 header Identification field is only 16 bits in 142 length, meaning that at most 2^16 unique packets with the same 143 (source, destination, protocol)-tuple may be active in the Internet 144 at a given time [I-D.ietf-intarea-ipv4-id-update]. Due to the 145 escalating deployment of high-speed links (e.g., 1Gbps Ethernet), 146 however, this number may soon become too small by several orders of 147 magnitude for high data rate packet sources such as tunnel endpoints 148 [RFC4963]. Furthermore, there are many well-known limitations 149 pertaining to IPv4 fragmentation and reassembly - even to the point 150 that it has been deemed "harmful" in both classic and modern-day 151 studies (see above). 
In particular, IPv4 fragmentation raises issues 152 ranging from minor annoyances (e.g., in-the-network router 153 fragmentation [RFC1981]) to the potential for major integrity issues 154 (e.g., mis-association of the fragments of multiple IP packets during 155 reassembly [RFC4963]). 157 As a result of these perceived limitations, a fragmentation-avoiding 158 technique for discovering the MTU of the forward path from a source 159 to a destination node was devised through the deliberations of the 160 Path MTU Discovery Working Group (PMTUDWG) during the late 1980's 161 through early 1990's (see Appendix D). In this method, the source 162 node provides explicit instructions to routers in the path to discard 163 the packet and return an ICMP error message if an MTU restriction is 164 encountered. However, this approach has several serious shortcomings 165 that lead to an overall "brittleness" [RFC2923]. 167 In particular, site border routers in the Internet are being 168 configured more and more to discard ICMP error messages coming from 169 the outside world. This is due in large part to the fact that 170 malicious spoofing of error messages in the Internet is trivial since 171 there is no way to authenticate the source of the messages [RFC5927]. 172 Furthermore, when a source node that requires ICMP error message 173 feedback when a packet is dropped due to an MTU restriction does not 174 receive the messages, a path MTU-related black hole occurs. This 175 means that the source will continue to send packets that are too 176 large and never receive an indication from the network that they are 177 being discarded. This behavior has been confirmed through documented 178 studies showing clear evidence of path MTU discovery failures in the 179 Internet today [TBIT][WAND][SIGCOMM]. 181 The issues with both IPv4 fragmentation and this "classical" method 182 of path MTU discovery are exacerbated further when IP tunneling is 183 used [RFC4459]. 
For example, ingress tunnel endpoints (ITEs) may be
184  required to forward encapsulated packets into the subnetwork on
185  behalf of hundreds, thousands, or even more original sources in the
186  end site. If the ITE allows IPv4 fragmentation on the encapsulated
187  packets, persistent fragmentation could lead to undetected data
188  corruption due to Identification field wrapping. If the ITE instead
189  uses classical IPv4 path MTU discovery, it may be inconvenienced by
190  excessive ICMP error messages coming from the subnetwork that may be
191  either suspect or contain insufficient information for translation
192  into error messages to be returned to the original sources.

194  Although recent work has led to the development of a robust end-to-
195  end MTU determination scheme [RFC4821], this approach requires
196  tunnels to present a consistent MTU, the same as ordinary links on
197  the end-to-end path. Moreover, in current practice existing
198  tunneling protocols mask the MTU issues by selecting a "lowest common
199  denominator" MTU that may be much smaller than necessary for most
200  paths and difficult to change at a later date. Due to these many
201  considerations, a new approach to accommodate tunnels over links with
202  diverse MTUs is necessary.

204  1.2. Approach

206  For the purpose of this document, a subnetwork is defined as a
207  virtual topology configured over a connected network routing region
208  and bounded by encapsulating border nodes. Example connected network
209  routing regions include Mobile Ad hoc Networks (MANETs), enterprise
210  networks and the global public Internet itself. Subnetwork border
211  nodes forward unicast and multicast packets over the virtual topology
212  across multiple IP and/or sub-IP layer forwarding hops that may
213  introduce packet duplication and/or traverse links with diverse
214  Maximum Transmission Units (MTUs).
216 This document introduces a Subnetwork Encapsulation and Adaptation 217 Layer (SEAL) for tunneling network layer protocols (e.g., IP, OSI, 218 etc.) over IP subnetworks that connect Ingress and Egress Tunnel 219 Endpoints (ITEs/ETEs) of border nodes. It provides a modular 220 specification designed to be tailored to specific associated 221 tunneling protocols. A transport-mode of operation is also possible, 222 and described in Appendix C. SEAL accommodates links with diverse 223 MTUs, protects against off-path denial-of-service attacks, and can be 224 configured to enable efficient duplicate packet detection through the 225 use of a minimal mid-layer encapsulation. 227 SEAL specifically treats tunnels that traverse the subnetwork as 228 ordinary links that must support network layer services. As for any 229 link, tunnels that use SEAL must provide suitable networking services 230 including best-effort datagram delivery, integrity and consistent 231 handling of packets of various sizes. As for any link whose media 232 cannot provide suitable services natively, tunnels that use SEAL 233 employ link-level adaptation functions to meet the legitimate 234 expectations of the network layer service. As this is essentially a 235 link level adaptation, SEAL is therefore permitted to alter packets 236 within the subnetwork as long as it restores them to their original 237 form when they exit the subnetwork. The mechanisms described within 238 this document are designed precisely for this purpose. 240 SEAL encapsulation introduces an extended Identification field for 241 per-packet and/or per-ETE identification as well as a mid-layer 242 segmentation and reassembly capability that allows simplified cutting 243 and pasting of packets. Moreover, SEAL engages both tunnel endpoints 244 in ensuring a functional path MTU on the path from the ITE to the 245 ETE. 
This is in contrast to "stateless" approaches which seek to 246 avoid MTU issues by selecting a lowest common denominator MTU value 247 that may be overly conservative for the vast majority of tunnel paths 248 and difficult to change even when larger MTUs become available. 250 The following sections provide the SEAL normative specifications, 251 while the appendices present non-normative additional considerations. 253 2. Terminology and Requirements 255 The following terms are defined within the scope of this document: 257 subnetwork 258 a virtual topology configured over a connected network routing 259 region and bounded by encapsulating border nodes. 261 Ingress Tunnel Endpoint 262 a virtual interface over which an encapsulating border node (host 263 or router) sends encapsulated packets into the subnetwork. 265 Egress Tunnel Endpoint 266 a virtual interface over which an encapsulating border node (host 267 or router) receives encapsulated packets from the subnetwork. 269 inner packet 270 an unencapsulated network layer protocol packet (e.g., IPv6 271 [RFC2460], IPv4 [RFC0791], OSI/CLNP [RFC1070], etc.) before any 272 mid-layer or outer encapsulations are added. Internet protocol 273 numbers that identify inner packets are found in the IANA Internet 274 Protocol registry [RFC3232]. 276 mid-layer packet 277 a packet resulting from adding mid-layer encapsulating headers to 278 an inner packet. 280 outer IP packet 281 a packet resulting from adding an outer IP header (and possibly 282 other outer headers) to a mid-layer packet. 284 packet-in-error 285 the leading portion of an invoking data packet encapsulated in the 286 body of an error control message (e.g., an ICMPv4 [RFC0792] error 287 message, an ICMPv6 [RFC4443] error message, etc.). 
289 Packet Too Big (PTB) 290 a control plane message indicating an MTU restriction, e.g., an 291 ICMPv6 "Packet Too Big" message [RFC4443], an ICMPv4 292 "Fragmentation Needed" message [RFC0792], an SCMP "Packet Too Big" 293 message (see: Section 4.5), etc. 295 IP, IPvX, IPvY 296 used to generically refer to either IP protocol version, i.e., 297 IPv4 or IPv6. 299 The following abbreviations correspond to terms used within this 300 document and elsewhere in common Internetworking nomenclature: 302 DF - the IPv4 header "Don't Fragment" flag [RFC0791] 304 ETE - Egress Tunnel Endpoint 306 HLEN - the sum of MHLEN and OHLEN 308 ITE - Ingress Tunnel Endpoint 310 LINK_ID - a short integer that identifies an ITE's underlying link 312 MHLEN - the length of any mid-layer headers and trailers 314 MRU - Maximum Reassembly Unit 316 MTU - Maximum Transmission Unit 318 NONCE - a short integer nonce value that identifies an ETE 320 OHLEN - the length of any outer encapsulating headers and trailers 322 S_IFT - SEAL Inner Fragmentation Threshold 324 S_MRU - SEAL Maximum Reassembly Unit 326 S_MSS - SEAL Maximum Segment Size 328 SCMP - the SEAL Control Message Protocol 330 SEAL - Subnetwork Encapsulation and Adaptation Layer 332 SEAL_ID - a SEAL packet and/or ETE identification value 334 SEAL_PORT - a TCP/UDP service port number used for SEAL 336 SEAL_PROTO - an IPv4 protocol number used for SEAL 338 TE - Tunnel Endpoint (i.e., either ingress or egress) 340 VET - Virtual Enterprise Traversal 342 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 343 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 344 document are to be interpreted as described in [RFC2119]. When used 345 in lower case (e.g., must, must not, etc.), these words MUST NOT be 346 interpreted as described in [RFC2119], but are rather interpreted as 347 they would be in common English. 349 3. 
Applicability Statement 351 SEAL was originally motivated by the specific case of subnetwork 352 abstraction for Mobile Ad hoc Networks (MANETs), however it soon 353 became apparent that the domain of applicability also extends to 354 subnetwork abstractions over enterprise networks, ISP networks, SOHO 355 networks, the global public Internet itself, and any other connected 356 network routing region. SEAL along with the Virtual Enterprise 357 Traversal (VET) [I-D.templin-intarea-vet] tunnel virtual interface 358 abstraction are the functional building blocks for a new 359 Internetworking architecture based on Routing and Addressing in 360 Networks with Global Enterprise Recursion (RANGER) 361 [RFC5720][I-D.russert-rangers] and the Internet Routing Overlay 362 Network (IRON) [I-D.templin-iron]. 364 SEAL provides a network sublayer for encapsulation of an inner 365 network layer packet within outer encapsulating headers. For 366 example, for IPvX in IPvY encapsulation (e.g., as IPv4/SEAL/IPv6), 367 the SEAL header appears as a subnetwork encapsulation as seen by the 368 inner IP layer. SEAL can also be used as a sublayer within a UDP 369 data payload (e.g., as IPv4/UDP/SEAL/IPv6 similar to Teredo 370 [RFC4380]), where UDP encapsulation is typically used for Network 371 Address Translator (NAT) traversal as well as operation over 372 subnetworks that give preferential treatment to the "core" Internet 373 protocols (i.e., TCP and UDP). The SEAL header is processed the same 374 as for IPv6 extension headers, i.e., it is not part of the outer IP 375 header but rather allows for the creation of an arbitrarily 376 extensible chain of headers in the same way that IPv6 does. 378 SEAL supports a segmentation and reassembly capability for adapting 379 the network layer to the underlying subnetwork characteristics, where 380 the Egress Tunnel Endpoint (ETE) determines how much or how little 381 reassembly it is willing to support. 
In the limiting case, the ETE 382 can avoid reassembly altogether and act as a passive observer that 383 simply informs the Ingress Tunnel Endpoint (ITE) of any MTU 384 limitations and otherwise discards all packets that arrive as 385 multiple fragments. This mode is useful for determining an 386 appropriate MTU for tunnels between performance-critical routers 387 connected to high data rate subnetworks such as the Internet DFZ, for 388 unidirectional tunnels in which the ETE is stateless, and for other 389 uses in which reassembly would present too great of a burden for the 390 routers or end systems. 392 When the ETE supports reassembly, the tunnel can be used to transport 393 packets that are too large to traverse the path without 394 fragmentation. In this mode, the ITE determines the tunnel MTU based 395 on the largest packet the ETE is capable of reassembling rather than 396 on the MTU of the smallest link in the path. Therefore, tunnel 397 endpoints that use SEAL can transport packets that are much larger 398 than the underlying subnetwork links themselves can carry in a single 399 piece. 401 SEAL tunnels may be configured over paths that include not only 402 ordinary physical links, but also virtual links that may include 403 other tunnels. An example application would be linking two 404 geographically remote supercomputer centers with large MTU links by 405 configuring a SEAL tunnel across the Internet. A second example 406 would be support for sub-IP segmentation over low-end links, i.e., 407 especially over wireless transmission media such as IEEE 802.15.4, 408 broadcast radio links in Mobile Ad-hoc Networks (MANETs), Very High 409 Frequency (VHF) civil aviation data links, etc. 411 Many other use case examples are anticipated, and will be identified 412 as further experience is gained. 414 4. SEAL Protocol Specification 416 The following sections specify the operation of the SEAL protocol. 418 4.1. 
VET Interface Model 420 SEAL is an encapsulation sublayer used within VET non-broadcast, 421 multiple access (NBMA) virtual interfaces. Each VET interface 422 connects an ITE to one or more ETE "neighbors" via tunneling across 423 an underlying enterprise network, or "subnetwork". The tunnel 424 neighbor relationship between the ITE and each ETE may be either 425 unidirectional or bidirectional. 427 A unidirectional tunnel neighbor relationship requires no prior 428 coordination between the ITE and ETE; it allows the ITE to send both 429 data and control messages forward to the ETE, but only allows the ETE 430 to send back control messages. A bidirectional tunnel neighbor 431 relationship requires prior coordination between the TEs (see: 432 Section 4.7), and is one over which both TEs can exchange both data 433 and control messages. 435 Implications of the VET unidirectional and bidirectional models for 436 SEAL will be discussed in the following sections. 438 4.2. SEAL Model of Operation 440 SEAL supports a multi-level segmentation and reassembly capability 441 for the transmission of unicast and multicast packets across an 442 underlying IP subnetwork with heterogeneous links. First, the ITE 443 can use IPv4 fragmentation to fragment inner IPv4 packets before SEAL 444 encapsulation if necessary. Secondly, the SEAL layer itself provides 445 a simple cutting-and-pasting capability for mid-layer packets that 446 can be used to avoid IP fragmentation on the outer packet. Finally, 447 ordinary IP fragmentation is permitted on the outer packet after SEAL 448 encapsulation and is used to detect and tune out any in-the-network 449 fragmentation. 451 SEAL-enabled ITEs encapsulate each inner packet in any mid-layer 452 headers and trailers, segment the resulting mid-layer packet into 453 multiple segments if necessary, then append a SEAL header and any 454 outer encapsulations to each segment. 
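The segmentation step just described can be illustrated with a minimal Python sketch. It is not part of the specification: the names `Segment`, `segment_mid_layer_packet`, `reassemble`, and the `max_seg` parameter are hypothetical, and how an ITE actually chooses the segment size is not covered here. The F and M flag settings and the segment numbering follow the header field definitions in Section 4.3; the cap of 256 segments is only an inference from the 8-bit width of the NEXTHDR/SEG field.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    first: bool     # F bit: set only on segment 0
    more: bool      # M bit: set on every segment except the last
    seg: int        # segment index; carried in NEXTHDR/SEG only when F=0
    payload: bytes


def segment_mid_layer_packet(packet: bytes, max_seg: int) -> List[Segment]:
    """Cut a mid-layer packet into consecutive segments of at most max_seg bytes."""
    if max_seg <= 0:
        raise ValueError("max_seg must be positive")
    # An empty packet still yields one (empty) segment.
    chunks = [packet[i:i + max_seg] for i in range(0, len(packet), max_seg)] or [b""]
    if len(chunks) > 256:
        # Assumption: segment numbers must fit the 8-bit NEXTHDR/SEG field.
        raise ValueError("packet needs more segments than an 8-bit field can number")
    last = len(chunks) - 1
    return [Segment(i == 0, i < last, i, c) for i, c in enumerate(chunks)]


def reassemble(segments: List[Segment]) -> bytes:
    """Paste segments back together in segment-number order."""
    return b"".join(s.payload for s in sorted(segments, key=lambda s: s.seg))
```

In this sketch a packet that fits in one segment is emitted with F=1 and M=0, which corresponds to the single-segment case illustrated next.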
As an example, for IPv6 within
455  IPv4 encapsulation a single-segment inner IPv6 packet encapsulated in
456  any mid-layer headers and trailers, followed by the SEAL header,
457  followed by any outer headers and trailers, followed by an outer IPv4
458  header would appear as shown in Figure 1:

460                                            +--------------------+
461                                            ~  outer IPv4 header ~
462                                            +--------------------+
463   I                                        ~  other outer hdrs  ~
464   n                                        +--------------------+
465   n                                        ~    SEAL Header     ~
466   e       +--------------------+           +--------------------+
467   r       ~ mid-layer headers  ~           ~ mid-layer headers  ~
468           +--------------------+           +--------------------+
469   I  -->  |                    |    -->    |                    |
470   P  -->  ~     inner IPv6     ~    -->    ~     inner IPv6     ~
471   v  -->  ~       Packet       ~    -->    ~       Packet       ~
472   6  -->  |                    |    -->    |                    |
473           +--------------------+           +--------------------+
474   P       ~ mid-layer trailers ~           ~ mid-layer trailers ~
475   a       +--------------------+           +--------------------+
476   c                                        ~   outer trailers   ~
477   k       Mid-layer packet                 +--------------------+
478   e       after mid-layer encaps.
479   t                                        Outer IPv4 packet
480                                            after SEAL and outer encaps.

482            Figure 1: SEAL Encapsulation - Single Segment

484  As a second example, for IPv4 within IPv6 encapsulation an inner IPv4
485  packet requiring three SEAL segments would appear as three separate
486  outer IPv6 packets, where the mid-layer headers are carried only in
487  segment 0 and the mid-layer trailers are carried in segment 2 as
488  shown in Figure 2:
489  +------------------+                        +------------------+
490  ~  outer IPv6 hdr  ~                        ~  outer IPv6 hdr  ~
491  +------------------+  +------------------+  +------------------+
492  ~ other outer hdrs ~  ~  outer IPv6 hdr  ~  ~ other outer hdrs ~
493  +------------------+  +------------------+  +------------------+
494  ~ SEAL hdr (SEG=0) ~  ~ other outer hdrs ~  ~ SEAL hdr (SEG=2) ~
495  +------------------+  +------------------+  +------------------+
496  ~  mid-layer hdrs  ~  ~ SEAL hdr (SEG=1) ~  |    inner IPv4    |
497  +------------------+  +------------------+  ~      Packet      ~
498  |    inner IPv4    |  |    inner IPv4    |  |   (Segment 2)    |
499  ~      Packet      ~  ~      Packet      ~  +------------------+
500  |   (Segment 0)    |  |   (Segment 1)    |  ~ mid-layer trails ~
501  +------------------+  +------------------+  +------------------+
502  ~  outer trailers  ~  ~  outer trailers  ~  ~  outer trailers  ~
503  +------------------+  +------------------+  +------------------+

505  Segment 0 (includes   Segment 1 (no mid-    Segment 2 (includes
506  mid-layer hdrs)       layer encaps)         mid-layer trails)

508           Figure 2: SEAL Encapsulation - Multiple Segments

510  The ITE inserts the SEAL header according to the specific tunneling
511  protocol. Examples include the following:

513  o  For simple encapsulation of an inner network layer packet within
514     an outer IPvX header (e.g., [RFC1070][RFC2003][RFC2473][RFC4213],
515     etc.), the ITE inserts the SEAL header between the inner packet
516     and outer IPvX headers as: IPvX/SEAL/{inner packet}.

518  o  For encapsulations over transports such as UDP (e.g., [RFC4380]),
519     the ITE inserts the SEAL header between the outer transport layer
520     header and the mid-layer packet, e.g., as IPvX/UDP/SEAL/{mid-layer
521     packet}. Here, the UDP header is seen as an "other outer header".

523  The SEAL header includes a SEAL_ID that the ITE maintains as either a
524  monotonically-incrementing per-packet identifier or as a constant
525  per-ETE identifier. When the ITE maintains the SEAL_ID as a packet
526  identifier, routers within the subnetwork can use it for duplicate
527  packet detection and both TEs can use it for SEAL segmentation/
528  reassembly. The SEAL header also includes a LINK_ID field that
529  identifies the ITE's underlying link, and a NONCE field that provides
530  a per-ETE identifier extension.

532  The following sections specify the SEAL header format and SEAL-
533  related operations of the ITE and ETE.

535  4.3. SEAL Header Format

537  The SEAL header is formatted as follows:

539    0                   1                   2                   3
540    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
541   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
542   |VER|C|A|I|R|F|M|  NEXTHDR/SEG  |    LINK_ID    |     NONCE     |
543   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
544   |                            SEAL_ID                            |
545   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

547                     Figure 3: SEAL Header Format

549  where the header fields are defined as:

551  VER (2)
552     a 2-bit version field. This document specifies Version 0 of the
553     SEAL protocol, i.e., the VER field encodes the value 0.

555  C (1)
556     the "Control/Data" bit. Set to 1 by the ITE in SEAL Control
557     Message Protocol (SCMP) control messages, and set to 0 in ordinary
558     data packets.

560  A (1)
561     the "Acknowledgement Requested" bit. Set to 1 by the ITE in data
562     packets for which it wishes to receive an explicit acknowledgement
563     from the ETE.

565  I (1)
566     the "Identifier" bit.
Set to 1 if the SEAL_ID contains a
567     monotonically-incrementing packet identifier; set to 0 if the
568     SEAL_ID contains a constant ETE identifier.

570  R (1)
571     the "Redirects Permitted" bit. Set to 1 if the ITE is willing to
572     accept SCMP redirects (see: Section 4.6); set to 0 otherwise.

574  F (1)
575     the "First Segment" bit. Set to 1 if this SEAL protocol packet
576     contains the first segment (i.e., Segment #0) of a mid-layer
577     packet.

579  M (1)
580     the "More Segments" bit. Set to 1 if this SEAL protocol packet
581     contains a non-final segment of a multi-segment mid-layer packet.

583  NEXTHDR/SEG (8) an 8-bit field. When 'F'=1, encodes the next header
584     Internet Protocol number the same as for the IPv4 protocol and
585     IPv6 next header fields. When 'F'=0, encodes a segment number of
586     a multi-segment mid-layer packet. (The segment number 0 is
587     reserved.)

589  LINK_ID (8)
590     an 8-bit link identifier. An integer value between 1 and 255 used
591     by the ITE to identify the underlying link selected for tunneling
592     the current packet. The ITE may also use the value 0 to indicate
593     "underlying link unspecified", e.g., when the ETE does not keep
594     track of tunnel state.

596  NONCE (8)
597     an 8-bit nonce field. Set to a random value by the ITE when the
598     tunnel to the ETE is established, and used as a per-ETE
599     identification adjunct to the SEAL_ID.

601  SEAL_ID (32)
602     a 32-bit Identification field. Used as either a per-packet or
603     per-ETE identifier.

605  Setting of the various bits and fields of the SEAL header is
606  specified in the following sections.

608  4.4. ITE Specification

610  4.4.1. Tunnel Interface MTU

612  The tunnel interface must present a fixed MTU to the inner network
613  layer as the size for admission of inner packets into the interface.
614 Since VET NBMA tunnel virtual interfaces may support a large set of 615 ETEs that accept widely varying maximum packet sizes, however, a 616 number of factors should be taken into consideration when selecting a 617 tunnel interface MTU. 619 Due to the ubiquitous deployment of standard Ethernet and similar 620 networking gear, the nominal Internet cell size has become 1500 621 bytes; this is the de facto size that end systems have come to expect 622 will either be delivered by the network without loss due to an MTU 623 restriction on the path or a suitable ICMP Packet Too Big (PTB) 624 message returned. When the 1500 byte packets sent by end systems 625 incur additional encapsulation at an ITE, however, they may be 626 dropped silently since the network may not always deliver the 627 necessary PTBs [RFC2923]. 629 The ITE should therefore set a tunnel interface MTU of at least 1500 630 bytes plus extra room to accommodate any additional encapsulations 631 that may occur on the path from the original source. The ITE can 632 also set smaller MTU values; however, care must be taken not to set 633 so small a value that original sources would experience an MTU 634 underflow. In particular, IPv6 sources must see a minimum path MTU 635 of 1280 bytes, and IPv4 sources should see a minimum path MTU of 576 636 bytes. 638 The ITE can alternatively set an indefinite MTU on the tunnel 639 interface such that all inner packets are admitted into the interface 640 without regard to size. For ITEs that host applications that use the 641 tunnel interface directly, this option must be carefully coordinated 642 with protocol stack upper layers since some upper layer protocols 643 (e.g., TCP) derive their packet sizing parameters from the MTU of the 644 outgoing interface and as such may select too large an initial size. 
645 This is not a problem for upper layers that use conservative initial 646 maximum segment size estimates and/or when the tunnel interface can 647 reduce the upper layer's maximum segment size, e.g., by reducing the 648 size advertised in the MSS option of outgoing TCP messages. 650 The inner network layer protocol consults the tunnel interface MTU 651 when admitting a packet into the interface. For non-SEAL inner IPv4 652 packets with the IPv4 Don't Fragment (DF) bit set to 0, if the packet 653 is larger than the tunnel interface MTU the inner IPv4 layer uses 654 IPv4 fragmentation to break the packet into fragments no larger than 655 the tunnel interface MTU. The ITE then admits each fragment into the 656 interface as an independent packet. 658 For all other inner packets, the inner network layer admits the 659 packet if it is no larger than the tunnel interface MTU; otherwise, 660 it drops the packet and sends a PTB error message to the source with 661 the MTU value set to the tunnel interface MTU. The message must 662 contain as much of the invoking packet as possible without the entire 663 message exceeding the network layer minimum MTU (e.g., 576 bytes for 664 IPv4, 1280 bytes for IPv6, etc.). For SEAL packets, however, the 665 inner layer must send a SEAL PTB message instead of a PTB of the 666 inner network layer (see: Section 4.4.3). 668 For this reason, when the tunnel interface sets a finite MTU the 669 inner network layer must be made aware of the SEAL protocol; this may 670 not be practical for some implementations. When the tunnel interface 671 sets an indefinite MTU, however, the inner network layer 672 unconditionally admits all packets into the interface without 673 fragmentation. Once the packet has been admitted into the interface, 674 it transitions from the inner network layer and becomes subject to 675 SEAL layer processing. 
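The admission rules of this section can be sketched as follows. This is a minimal illustration and not part of the specification: the names InnerPacket and admit are hypothetical, an indefinite MTU is modeled as None, and the fragment sizing assumes a 20-byte inner IPv4 header with no options.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

IPV4_HDR = 20  # assumption: inner IPv4 header carries no options


@dataclass
class InnerPacket:
    length: int            # total inner packet length in bytes
    is_ipv4: bool = False
    df: bool = True        # IPv4 Don't Fragment bit (ignored for non-IPv4)
    is_seal: bool = False  # True if the inner packet is itself a SEAL packet


def admit(pkt: InnerPacket, mtu: Optional[int]) -> Tuple[str, List[int]]:
    """Return (action, sizes) for a packet offered to the tunnel interface.

    mtu=None models an indefinite MTU: every packet is admitted unchanged.
    """
    if mtu is None or pkt.length <= mtu:
        return "admit", [pkt.length]
    if pkt.is_ipv4 and not pkt.df and not pkt.is_seal:
        # Non-SEAL IPv4 with DF=0: fragment to fit the interface MTU.
        per_frag = (mtu - IPV4_HDR) // 8 * 8  # payload in 8-byte units
        if per_frag <= 0:
            raise ValueError("MTU too small to carry an IPv4 fragment")
        payload = pkt.length - IPV4_HDR
        sizes = []
        while payload > 0:
            chunk = min(per_frag, payload)
            sizes.append(IPV4_HDR + chunk)
            payload -= chunk
        return "fragment", sizes
    # All other packets: drop and report the interface MTU to the source.
    # SEAL packets get a SEAL PTB rather than an inner network layer PTB.
    return ("drop_seal_ptb" if pkt.is_seal else "drop_ptb"), [mtu]
```

With mtu=None the function never fragments or drops, which mirrors why an indefinite MTU relieves the inner network layer of any SEAL awareness.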
677 In light of the above considerations, it is RECOMMENDED that the ITE 678 configure an indefinite MTU on the tunnel interface such that the 679 inner network layer unconditionally admits all inner packets into the 680 interface and any necessary tunnel adaptations are performed by the 681 SEAL layer within the tunnel interface as described in the following 682 sections. 684 4.4.2. Tunnel Interface Soft State 686 The ITE maintains per-ETE soft state within the tunnel interface, 687 e.g., in a neighbor cache. (The ITE can instead maintain per- 688 tunnel-interface rather than per-ETE packet identification and sizing 689 variables if it is willing to use lowest-common-denominator values 690 that are acceptable for all ETEs.) The soft state includes the 691 following: 693 o a Mid-layer Header Length (MHLEN); set to the length of any mid- 694 layer encapsulation headers and trailers that must be added before 695 SEAL segmentation. 697 o an Outer Header Length (OHLEN); set to the length of the outer IP, 698 SEAL and other outer encapsulation headers and trailers. 700 o a total Header Length (HLEN); set to MHLEN plus OHLEN. 702 o a SEAL Maximum Segment Size (S_MSS). The ITE initializes S_MSS to 703 the minimum MTU of the underlying interfaces if the underlying 704 interface MTUs can be determined (otherwise, the ITE initializes 705 S_MSS to "infinity"). The ITE decreases or increases S_MSS based 706 on any SCMP "Packet Too Big (PTB)" messages received (see Section 707 4.6). 709 o a SEAL Maximum Reassembly Unit (S_MRU). If the ITE is not 710 configured to use SEAL segmentation, it initializes S_MRU to the 711 constant value 0 and ignores any S_MRU values reported by the ETE. 712 Otherwise, the ITE initializes S_MRU to "infinity" and decreases 713 or increases S_MRU based on any SCMP PTB messages received from 714 the ETE (see Section 4.6). When (S_MRU>(S_MSS*256)), the ITE uses 715 (S_MSS*256) as the effective S_MRU value.
717 o a SEAL Inner Fragmentation Threshold (S_IFT); used to determine a 718 maximum fragment size for fragmentable IPv4 packets. Required 719 only for tunnels that support encapsulation with IPv4 as the inner 720 network layer protocol. The ITE should use a "safe" estimate for 721 S_IFT that would be highly unlikely to trigger additional 722 fragmentation on the path to the ETE. In particular, it is 723 RECOMMENDED that the ITE set S_IFT to 512 bytes unless it can determine 724 a more accurate safe value, e.g., via probing. 726 o a set of 8-bit LINK_IDs that identify the ITE's underlying links 727 and are used to fill the SEAL header field of the same name for 728 packets sent to this ETE. The ITE selects a separate randomly- 729 initialized LINK_ID for each underlying link, and the ETE uses the 730 LINK_ID (in combination with the SEAL_ID and NONCE) to identify 731 the ITE's underlying link of origin. 733 o an 8-bit NONCE that encodes a randomly-initialized constant value 734 and is used to fill the SEAL header field of the same name for 735 packets sent to this ETE. 737 o a 32-bit SEAL_ID that is either a randomly-initialized constant ETE 738 identifier or a monotonically-increasing packet identifier and is 739 used to fill the SEAL header field of the same name for packets 740 sent to this ETE. 742 Note that S_MSS and S_MRU include the length of the outer and mid- 743 layer encapsulating headers and trailers (i.e., HLEN), since the ETE 744 must retain the headers and trailers during reassembly. Note also 745 that the ITE maintains S_MSS and S_MRU as 32-bit values such that 746 inner packets larger than 64KB (e.g., IPv6 jumbograms [RFC2675]) can 747 be accommodated when appropriate for a given subnetwork. 749 4.4.3. Admitting Packets into the Tunnel 751 Once an inner packet/fragment has been admitted into the tunnel 752 interface, it transitions from the inner network layer and becomes 753 subject to SEAL layer processing.
The ITE then examines each packet 754 to determine whether it is too large for SEAL encapsulation, then 755 prepares the packet for admission into the tunnel according to 756 whether it is "fragmentable" (discussed in the next paragraph) or 757 "unfragmentable" (discussed in the following paragraph). 759 If the packet is a non-SEAL IPv4 packet with DF=0 in the IPv4 header 760 (*), and the packet is larger than S_IFT, the ITE uses fragmentation 761 to break the packet into IPv4 fragments no larger than S_IFT bytes 762 then submits each fragment for encapsulation separately. 764 For all other packets, if the packet is larger than (MAX(S_MRU, 765 S_MSS) - HLEN), the ITE drops it and sends a PTB message to the 766 source (**) with an MTU value of (MAX(S_MRU, S_MSS) - HLEN); 767 otherwise, it submits the packet for encapsulation. The ITE must 768 include the length of the uncompressed headers and trailers when 769 calculating HLEN even if the tunnel is using header compression. The 770 ITE is also permitted to admit inner packets into the tunnel that can 771 be accommodated in a single SEAL segment (i.e., no larger than S_MSS) 772 even if they are larger than the ETE would be willing to reassemble 773 if fragmented (i.e., larger than S_MRU) - see: Section 4.5.1. 775 (*) In order to support nested encapsulations, inner SEAL-protocol 776 IPv4 packets with DF=0 must be treated as unfragmentable and subject 777 to drop due to an MTU restriction as for all other packets. 779 (**) When the ITE needs to drop a packet and send a PTB message, it 780 sends an SCMP PTB message if the packet itself is a SEAL encapsulated 781 packet (see: Section 4.6.1.1). Otherwise, it sends a PTB 782 corresponding to the inner network layer protocol packet. 784 4.4.4. Mid-Layer Encapsulation 786 After inner IP fragmentation (if necessary), the ITE next 787 encapsulates each inner packet/fragment in the MHLEN bytes of mid- 788 layer headers and trailers. 
The ITE then submits the mid-layer 789 packet for SEAL segmentation and encapsulation. 791 4.4.5. SEAL Segmentation 793 If the ITE is configured to use SEAL segmentation, it checks the 794 length of the resulting packet after mid-layer encapsulation to 795 determine whether segmentation is needed. If the length of the 796 resulting mid-layer packet plus OHLEN is larger than S_MSS but no 797 larger than S_MRU the ITE performs SEAL segmentation by breaking the 798 mid-layer packet into N segments (N <= 256) that are no larger than 799 (S_MSS - OHLEN) bytes each. Each segment, except the final one, MUST 800 be of equal length. The first byte of each segment MUST begin 801 immediately after the final byte of the previous segment, i.e., the 802 segments MUST NOT overlap. The ITE SHOULD generate the smallest 803 number of segments possible, e.g., it SHOULD NOT generate 6 smaller 804 segments when the packet could be accommodated with 4 larger 805 segments. 807 This SEAL segmentation process ignores the fact that the mid-layer 808 packet may be unfragmentable outside of the subnetwork. The process 809 is a mid-layer (not an IP layer) operation employed by the ITE to 810 adapt the mid-layer packet to the subnetwork path characteristics, 811 and the ETE will restore the packet to its original form during 812 reassembly. Therefore, the fact that the packet may have been 813 segmented within the subnetwork is not observable outside of the 814 subnetwork. 816 4.4.6. SEAL Encapsulation 818 Following SEAL segmentation, the ITE next encapsulates each segment 819 in a SEAL header formatted as specified in Section 4.3. For the 820 first segment, the ITE sets F=1, then sets NEXTHDR to the Internet 821 Protocol number of the encapsulated inner packet, and finally sets 822 M=1 if there are more segments or sets M=0 otherwise. 
For each non- 823 initial segment of an N-segment mid-layer packet (N <= 256), the ITE 824 sets (F=0; M=1; SEG=1) in the SEAL header of the first non-initial 825 segment, sets (F=0; M=1; SEG=2) in the next non-initial segment, 826 etc., and sets (F=0; M=0; SEG=N-1) in the final segment. (Note that 827 the value SEG=0 is not used, since the initial segment encodes a 828 NEXTHDR value and not a SEG value.) 830 For each segment, the ITE then sets C=0, sets R=1 if it is willing to 831 accept SCMP redirects (see Section 4.6) and sets A=1 if explicit 832 probing is desired (see Section 4.4.9). The ITE then sets the 833 LINK_ID field to an integer between 1 and 255 that identifies the 834 underlying link over which this packet will be tunneled. (The ITE 835 may instead set LINK_ID to 0 if the ETE is not tracking state, e.g., 836 if the tunnel neighbor relationship is unidirectional.) The ITE next 837 sets the NONCE field to a randomly-initialized constant nonce value 838 for this ETE. 840 The ITE finally sets the I flag and SEAL_ID values as follows. The 841 ITE maintains a randomly-initialized SEAL_ID value as per-ETE soft 842 state (e.g., in the neighbor cache). If the SEAL_ID is to be used as 843 a packet identifier, the ITE monotonically increments the value for 844 each successive SEAL protocol packet it sends to the ETE. If the 845 SEAL_ID is to be used as an ETE identifier, the ITE instead maintains 846 SEAL_ID as a constant value. 848 For each successive SEAL segment, the ITE writes the current SEAL_ID 849 value into the SEAL header field of the same name. It then sets I=1 850 if the SEAL_ID represents a packet identifier and I=0 if the SEAL_ID 851 represents an ETE identifier. The ITE must be consistent in its 852 setting of the I flag. For example, it must not set I=1 in some 853 packets and I=0 in others since this may result in unpredictable 854 behavior. 856 4.4.7. 
Outer Encapsulation 858 Following SEAL encapsulation, the ITE next encapsulates each SEAL 859 segment in the requisite outer headers and trailers according to the 860 specific encapsulation format (e.g., [RFC1070], [RFC2003], [RFC2473], 861 [RFC4213], etc.), except that it writes 'SEAL_PROTO' in the protocol 862 field of the outer IP header (when simple IP encapsulation is used) 863 or writes 'SEAL_PORT' in the outer destination service port field 864 (e.g., when IP/UDP encapsulation is used). 866 When IPv4 is used as the outer encapsulation layer, the ITE finally 867 sets the DF flag in the IPv4 header of each segment. If the path to 868 the ETE correctly implements IP fragmentation (see: Section 4.4.9), 869 the ITE sets DF=0; otherwise, it sets DF=1. 871 When IPv6 is used as the outer encapsulation layer, the "DF" flag is 872 absent but the packet will not be fragmented within the subnetwork 873 since IPv6 deprecates in-the-network fragmentation. 875 4.4.8. Sending SEAL Protocol Packets 877 Following outer encapsulation, the ITE sends each outer packet that 878 encapsulates a segment of the same mid-layer packet over the same 879 underlying link in canonical order, i.e., segment 0 first, followed 880 by segment 1, etc., and finally segment N-1. 882 4.4.9. Probing Strategy 884 When IPv4 is used as the outer encapsulation layer, the ITE should 885 perform a qualification exchange over each underlying link to 886 determine whether each subnetwork path to the ETE correctly 887 implements IPv4 fragmentation. The qualification exchange can be 888 performed either as an initial probe or in-band with real data 889 packets, and should be repeated periodically since the subnetwork 890 paths may change due to dynamic routing. 
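Looking back at Sections 4.4.5 and 4.4.6, the segmentation and SEAL header rules can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the SEAL_ID is assumed to be a packet identifier (I=1) that advances by one per segment, and only the 8-byte SEAL header of Figure 3 is prepended (outer headers and trailers are left to the caller, although their length is still counted through OHLEN).

```python
import struct
from typing import List


def pack_seal_header(f: int, m: int, nexthdr_seg: int, link_id: int,
                     nonce: int, seal_id: int, *, c: int = 0, a: int = 0,
                     i: int = 1, r: int = 0, ver: int = 0) -> bytes:
    """Pack the 8-byte SEAL header: VER|C|A|I|R|F|M, NEXTHDR/SEG, LINK_ID,
    NONCE, then the 32-bit SEAL_ID, all in network byte order."""
    flags = (ver << 6) | (c << 5) | (a << 4) | (i << 3) | (r << 2) | (f << 1) | m
    return struct.pack("!BBBBI", flags, nexthdr_seg, link_id, nonce,
                       seal_id & 0xFFFFFFFF)


def seal_segment(mid_packet: bytes, s_mss: int, ohlen: int, nexthdr: int,
                 link_id: int, nonce: int, seal_id: int) -> List[bytes]:
    """Split a mid-layer packet into the fewest equal-length segments of at
    most (S_MSS - OHLEN) bytes and prepend a SEAL header to each one.
    The caller is assumed to have already checked the packet against S_MRU."""
    max_seg = s_mss - ohlen
    if max_seg <= 0:
        raise ValueError("S_MSS must exceed OHLEN")
    n = max(1, -(-len(mid_packet) // max_seg))  # ceiling division
    if n > 256:
        raise ValueError("more than 256 segments would be needed")
    out = []
    for k in range(n):
        chunk = mid_packet[k * max_seg:(k + 1) * max_seg]
        first, more = int(k == 0), int(k < n - 1)
        # First segment carries NEXTHDR; later segments carry SEG=k.
        field = nexthdr if first else k
        out.append(pack_seal_header(first, more, field, link_id, nonce,
                                    seal_id + k) + chunk)
    return out
```

The returned list is already in canonical order (segment 0 first), which is the order Section 4.4.8 requires for transmission.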
892 To perform this qualification, the ITE prepares a probe packet that 893 is no larger than 576 bytes (e.g., a NULL packet with A=1 and 894 NEXTHDR="No Next Header" [RFC2460] in the SEAL header), then splits 895 the packet into two outer IPv4 fragments and sends both fragments to 896 the ETE over the same underlying link. If the ETE returns an SCMP 897 PTB message with Code=1 (see Section 4.6.1.1), then the subnetwork 898 path correctly implements IPv4 fragmentation and subsequent data 899 packets can be sent with DF=0 in the outer header to enable the 900 preferred method of probing. If the ETE returns an SCMP PTB message 901 with Code=2, however, the ITE is obliged to set DF=1 for future 902 packets sent over that underlying link since a middlebox in the 903 network is reassembling the IPv4 fragments before they are delivered 904 to the ETE. 906 In addition to any control plane probing, all SEAL encapsulated data 907 packets sent by the ITE are considered implicit probes. SEAL 908 encapsulated packets that use IPv4 as the outer layer of 909 encapsulation with DF=0 will elicit SCMP PTB messages from the ETE if 910 any IPv4 fragmentation occurs in the path. SEAL encapsulated packets 911 that use either IPv6 or IPv4 with DF=1 as the outer layer of 912 encapsulation may be dropped by a router on the path to the ETE which 913 will also return an ICMP PTB message to the ITE. If the message 914 includes enough information (see Section 4.4.10), the ITE can then 915 use the (LINK_ID, NONCE, SEAL_ID)-tuple within the packet-in-error to 916 determine whether the PTB message corresponds to one of its recent 917 packet transmissions. 919 The ITE should also send explicit probes, periodically, to verify 920 that the ETE is still reachable. The ITE sets A=1 in the SEAL header 921 of a segment to be used as an explicit probe, where the probe can be 922 either an ordinary data packet segment or a NULL packet (see above). 
923 The probe will elicit an SCMP PTB message from the ETE as an 924 acknowledgement (see Section 4.6.1). 926 4.4.10. Processing ICMP Messages 928 When the ITE sends outer IP packets, it may receive ICMP error 929 messages [RFC0792][RFC4443] from either the ETE or routers within the 930 subnetwork. The ICMP messages include an outer IP header, followed 931 by an ICMP header, followed by a portion of the outer IP packet that 932 generated the error (also known as the "packet-in-error"). The ITE 933 can use the (LINK_ID, NONCE, SEAL_ID)-tuple encoded in the SEAL 934 header within the packet-in-error to confirm that the ICMP message 935 came from either the ETE or an on-path router, and can use any 936 additional information to determine whether to accept or discard the 937 message. 939 The ITE should specifically process raw ICMPv4 Protocol Unreachable 940 messages and ICMPv6 Parameter Problem messages with Code 941 "Unrecognized Next Header type encountered" as a hint that the ETE 942 does not implement the SEAL protocol; specific actions that the ITE 943 may take in this case are out of scope. 945 4.4.11. Black Hole Detection 947 In some subnetwork paths, ICMP error messages may be lost due to 948 filtering or may not contain enough information due to a router in 949 the path not observing the recommendations of [RFC1812]. The ITE can 950 use explicit probing as described in Section 4.4.9 to determine 951 whether the path to the ETE is silently dropping packets (also known 952 as a "black hole"). For example, when the ITE is obliged to set DF=1 953 in the outer headers of data packets it should send explicit probe 954 packets, periodically, in order to detect path MTU increases or 955 decreases. 957 4.5. ETE Specification 959 4.5.1. Reassembly Buffer Requirements 961 The ETE SHOULD support the minimum IP-layer reassembly requirements 962 specified for IPv4 (i.e., 576 bytes [RFC1812]) and IPv6 (i.e., 1500 963 bytes [RFC2460]). 
The ETE SHOULD also support SEAL-layer reassembly 964 for inner packets of at least 1280 bytes in length and MAY support 965 reassembly for larger inner packets. The ETE records the SEAL-layer 966 reassembly buffer size in a soft-state variable "S_MRU" (see: Section 967 4.5.2). 969 The ETE may instead omit the reassembly function altogether and set 970 S_MRU=0, but this may cause tunnel MTU underruns in some environments 971 resulting in an unusable link. When reassembly is supported, the ETE 972 must retain the outer IP, SEAL and other outer headers and trailers 973 during both IP-layer and SEAL-layer reassembly for the purpose of 974 associating the fragments/segments of the same packet, and must also 975 configure a SEAL-layer reassembly buffer that is no smaller than the 976 IP-layer reassembly buffer. Hence, the ETE: 978 o SHOULD configure an outer IP-layer reassembly buffer of at least 979 the minimum specified for the outer IP protocol version, 981 o SHOULD configure a SEAL-layer reassembly buffer S_MRU size of at 982 least (1280 + HLEN) bytes, and 984 o MUST be capable of discarding inner packets that require IP-layer 985 and/or SEAL-layer reassembly and that are larger than (S_MRU - 986 HLEN). 988 The ETE is permitted to accept inner packets that did not undergo IP- 989 layer and/or SEAL-layer reassembly even if they are larger than 990 (S_MRU - HLEN) bytes. Hence, S_MRU is a maximum *reassembly* size, 991 and may be less than the largest packet size the ETE is able to 992 receive when no reassembly is required. 994 4.5.2. Tunnel Interface Soft State 996 The ETE maintains a single per-interface S_MRU value to be applied 997 for all unidirectional tunnel neighbors, and can also maintain per- 998 ITE S_MRU values for any bidirectional tunnel neighbors (see: Section 999 4.7). For each bidirectional ITE neighbor, the ETE also maintains 1000 per-ITE soft state to track the NONCE, SEAL_ID and LINK_ID values 1001 used by the ITE.
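The S_MRU sizing rules of Section 4.5.1 reduce to two small checks, sketched below. The function names are hypothetical, and the acceptance policy shown (always discard an oversized reassembled packet) is one assumption consistent with the "MUST be capable of discarding" requirement, not the only permitted behavior.

```python
MIN_INNER_REASSEMBLY = 1280  # bytes of inner packet the ETE SHOULD reassemble


def recommended_s_mru(hlen: int) -> int:
    """Smallest S_MRU that satisfies the SHOULD: (1280 + HLEN) bytes."""
    return MIN_INNER_REASSEMBLY + hlen


def ete_accepts(inner_len: int, needed_reassembly: bool,
                s_mru: int, hlen: int) -> bool:
    """S_MRU bounds only packets that required IP-layer or SEAL-layer
    reassembly; packets that arrived whole are not limited by it."""
    if not needed_reassembly:
        return True
    return inner_len <= s_mru - hlen
```

Note that with S_MRU=0 (reassembly omitted) every packet that needed reassembly is rejected, while whole packets are still accepted.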
1003 For each bidirectional tunnel neighbor, the ETE also tracks the outer 1004 IP source addresses (and also port numbers when outer UDP 1005 encapsulation is used) of packets received from the ITE and 1006 associates the most recent values received with the corresponding 1007 LINK_ID. In this way, the LINK_ID provides a stable handle for the 1008 tunnel near end to use for return traffic to the tunnel far end even 1009 if the outer IP source address and port numbers in packets received 1010 from the tunnel far end change. 1012 4.5.3. IP-Layer Reassembly 1014 The ETE submits unfragmented SEAL protocol IP packets for SEAL-layer 1015 reassembly as specified in Section 4.5.4. The ETE instead performs 1016 standard IP-layer reassembly for multi-fragment SEAL protocol IP 1017 packets as follows. 1019 The ETE should maintain conservative IP-layer reassembly cache high- 1020 and low-water marks. When the size of the reassembly cache exceeds 1021 this high-water mark, the ETE should actively discard incomplete 1022 reassemblies (e.g., using an Active Queue Management (AQM) strategy) 1023 until the size falls below the low-water mark. The ETE should also 1024 actively discard any pending reassemblies that clearly have no 1025 opportunity for completion, e.g., when a considerable number of new 1026 fragments have been received before a fragment that completes a 1027 pending reassembly has arrived. Following successful IP-layer 1028 reassembly, the ETE submits the reassembled packet for SEAL-layer 1029 reassembly as specified in Section 4.5.4. 1031 When the ETE processes the IP first fragment (i.e., one with MF=1 and 1032 Offset=0 in the IP header) of a fragmented SEAL packet, it sends an 1033 SCMP PTB message back to the ITE (see Section 4.6.1.1). 
When the ETE 1034 processes an IP fragment that would cause the reassembled outer 1035 packet to be larger than the IP-layer reassembly buffer following 1036 reassembly, it discontinues the reassembly and discards any further 1037 fragments of the same packet. 1039 4.5.4. SEAL-Layer Reassembly 1041 Following IP reassembly (if necessary), the ETE examines each mid- 1042 layer data packet (i.e., each packet with C=0 in the SEAL header) 1043 to determine whether an SCMP error message is required. If the mid- 1044 layer data packet has an incorrect value in the SEAL header, the ETE 1045 discards the packet and returns an SCMP "Parameter Problem" message 1046 (see Section 4.6.1). Next, if the SEAL header has A=1 and the packet 1047 did not arrive as multiple outer IP fragments, the ETE sends an SCMP 1048 PTB message with Code=2 back to the ITE (see Section 4.6.1.1). The 1049 ETE next submits single-segment mid-layer packets for decapsulation 1050 and delivery to upper layers (see Section 4.5.5). The ETE instead 1051 performs SEAL-layer reassembly for multi-segment mid-layer packets 1052 with I=1 in the SEAL header as follows. 1054 The ETE adds each segment of a multi-segment mid-layer packet with 1055 I=1 in the SEAL header to a SEAL-layer pending-reassembly queue 1056 according to the (LINK_ID, NONCE, SEAL_ID)-tuple found in the SEAL 1057 header. The ETE performs SEAL-layer reassembly through simple in- 1058 order concatenation of the encapsulated segments of the same mid- 1059 layer packet from N consecutive SEAL segments. SEAL-layer reassembly 1060 requires the ETE to maintain a cache of recently received segments 1061 for a hold time that would allow for nominal inter-segment delays. 1062 When a SEAL reassembly times out, the ETE discards the incomplete 1063 reassembly and returns an SCMP "Time Exceeded" message to the ITE 1064 (see Section 4.6.1).
As for IP-layer reassembly, the ETE should also 1065 maintain a conservative reassembly cache high- and low-water mark and 1066 should actively discard any pending reassemblies that clearly have no 1067 opportunity for completion, e.g., when a considerable number of new 1068 SEAL packets have been received before a packet that completes a 1069 pending reassembly has arrived. 1071 If the ETE receives a SEAL packet for which a segment with the same 1072 (LINK_ID, NONCE, SEAL_ID)-tuple is already in the queue, it must 1073 determine whether to accept the new segment and release the old, or 1074 drop the new segment. If accepting the new segment would cause an 1075 inconsistency with other segments already in the queue (e.g., 1076 differing segment lengths), the ETE drops the segment that is least 1077 likely to complete the reassembly. When the ETE has already received 1078 the SEAL first segment (i.e., one with F=1 and M=1 in the SEAL 1079 header) of a SEAL protocol packet that arrived as multiple SEAL 1080 segments, and accepting the current segment would cause the size of 1081 the reassembled packet to exceed S_MRU, the ETE schedules the 1082 reassembly resources for garbage collection and sends an SCMP PTB 1083 message with Code=3 back to the ITE (see Section 4.6.1.1). 1085 After all segments are gathered, the ETE reassembles the packet by 1086 concatenating the segments encapsulated in the N consecutive SEAL 1087 packets beginning with the initial segment (i.e., SEG=0) and followed 1088 by any non-initial segments 1 through N-1. 
That is, for an N-segment 1089 mid-layer packet, reassembly entails the concatenation of the SEAL- 1090 encapsulated packet segments with (F=1, M=1, SEAL_ID=j) in the first 1091 SEAL header, followed by (F=0, M=1, SEG=1, SEAL_ID=(j+1)) in the next 1092 SEAL header, followed by (F=0, M=1, SEG=2, SEAL_ID=(j+2)), etc., up 1093 to (F=0, M=0, SEG=(N-1), SEAL_ID=(j + N-1)) in the final SEAL header, 1094 where modulo arithmetic based on the length of the SEAL_ID field is 1095 used. Following successful SEAL-layer reassembly, the ETE submits 1096 the reassembled mid-layer packet for decapsulation and delivery to 1097 upper layers as specified in Section 4.5.5. 1099 The ETE must not perform SEAL-layer reassembly for multi-segment mid- 1100 layer packets with I=0 in the SEAL header. The ETE instead silently 1101 drops all segments with I=0 and either F=0 or (F=1; M=1) in the SEAL 1102 header and sends an SCMP Parameter Problem message back to the ITE. 1104 4.5.5. Decapsulation and Delivery to Upper Layers 1106 Following any necessary IP- and SEAL-layer reassembly, the ETE 1107 discards the outer headers and trailers and performs any mid-layer 1108 transformations on the mid-layer packet. The ETE next discards the 1109 mid-layer headers and trailers, and delivers the inner packet to the 1110 upper-layer protocol indicated either in the SEAL NEXTHDR field or 1111 the next header field of the mid-layer packet (i.e., if the packet 1112 included mid-layer encapsulations). The ETE instead silently 1113 discards the inner packet if it was a NULL packet (see Section 1114 4.4.9). 1116 4.6. The SEAL Control Message Protocol (SCMP) 1118 SEAL uses a companion SEAL Control Message Protocol (SCMP) based on 1119 the same message format as the Internet Control Message Protocol for 1120 IPv6 (ICMPv6) [RFC4443]. 
Each SCMP message is embedded within an 1121 SCMP packet which begins with the same outer header format as would 1122 be used for outer encapsulation of a SEAL data packet (see: Section 1123 4.4.7). The following sections specify the generation and processing 1124 of SCMP messages: 1126 4.6.1. Generating SCMP Messages 1128 SCMP messages may be generated by either ITEs or ETEs (i.e., by any 1129 TE) using the same message Type and Code values specified for 1130 ordinary ICMPv6 messages in [RFC4443]. SCMP is also used to carry 1131 other ICMPv6 message types and their associated options as specified 1132 in other documents (e.g., [RFC4191][RFC4861], etc.). The general 1133 format for SCMP messages is shown in Figure 4:
1135    0                   1                   2                   3
1136    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1137   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1138   |     Type      |     Code      |           Checksum            |
1139   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1140   |                                                               |
1141   ~                         Message Body                          ~
1142   |                                                               |
1143   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1144   |                 As much of invoking SEAL data                 |
1145   ~              packet as possible without the SCMP              ~
1146   |                packet exceeding 576 bytes (*)                 |
1147   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1149 (*) also known as the "packet-in-error" 1151 Figure 4: SCMP Message Format 1153 TEs generate solicitation messages (e.g., an SCMP echo request, an 1154 SCMP router/neighbor solicitation, a SEAL data packet with A=1, etc.) 1155 for the purpose of triggering an SCMP response. TEs generate 1156 solicited SCMP messages (e.g., an SCMP echo reply, an SCMP router/ 1157 neighbor advertisement, an SCMP PTB message, etc.) in response to 1158 explicit solicitations, and also generate SCMP error messages in 1159 response to errored SEAL data packets. As for ICMP, TEs must not 1160 generate SCMP error messages in response to other SCMP messages.
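The message layout of Figure 4 can be sketched as follows. This is an illustrative sketch only: build_scmp and its outer_overhead parameter (the combined length of the outer IP, other outer, and SEAL headers plus any trailers) are hypothetical names, and the checksum is the plain 16-bit Internet checksum over the SCMP message with no pseudo-header, as this section specifies for SCMP.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def build_scmp(msg_type: int, code: int, body: bytes,
               packet_in_error: bytes = b"", outer_overhead: int = 0,
               limit: int = 576) -> bytes:
    """Build Type | Code | Checksum | Message Body | packet-in-error,
    truncating the packet-in-error so the encapsulated SCMP packet
    does not exceed the 576-byte limit."""
    room = limit - outer_overhead - 4 - len(body)
    pie = packet_in_error[:max(0, room)]
    unsummed = struct.pack("!BBH", msg_type, code, 0) + body + pie
    csum = internet_checksum(unsummed)
    return struct.pack("!BBH", msg_type, code, csum) + body + pie
```

For example, a PTB message would use the ICMPv6 Packet Too Big Type value (2) with a 4-byte MTU field as its Message Body; a receiver can validate the result by checking that internet_checksum over the whole message is zero.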
1162 As for ordinary ICMPv6 messages, the SCMP message begins with a 4 1163 byte header that includes 8-bit Type and Code fields followed by a 1164 16-bit Checksum field followed by a variable-length Message Body. 1165 The TE sets the Type and Code fields to the same values that would 1166 appear in the corresponding ICMPv6 message and also formats the 1167 Message Body the same as for the corresponding ICMPv6 message. 1169 The Message Body is followed by the leading portion of the invoking 1170 SEAL data packet (i.e., the "packet-in-error") IFF the packet-in- 1171 error would also be included in the corresponding ICMPv6 message. If 1172 the SCMP message will include a packet-in-error, the TE includes as 1173 much of the leading portion of the invoking SEAL data packet as 1174 possible beginning with the outer IP header and extending to a length 1175 that would not cause the entire SCMP packet following outer 1176 encapsulation to exceed 576 bytes (see: Figure 5). 1178 The TE then calculates the SCMP message Checksum the same as 1179 specified for ICMPv6 messages except that it does not prepend a 1180 pseudo-header of the outer IP header since the (LINK_ID, NONCE, 1181 SEAL_ID)-tuple already gives sufficient assurance against mis- 1182 delivery. (The Checksum calculation procedure is therefore identical 1183 to that used for ICMPv4 [RFC0792].) 
The TE then encapsulates the 1184 SCMP message in the outer headers as shown in Figure 5:
1186                                    +--------------------+
1187                                    ~ outer IPv4 header  ~
1188                                    +--------------------+
1189                                    ~  other outer hdrs  ~
1190                                    +--------------------+
1191                                    ~    SEAL Header     ~
1192   +--------------------+           +--------------------+
1193   ~ SCMP message header~    -->    ~ SCMP message header~
1194   +--------------------+    -->    +--------------------+
1195   ~  SCMP message body ~    -->    ~  SCMP message body ~
1196   +--------------------+    -->    +--------------------+
1197   ~   packet-in-error  ~    -->    ~   packet-in-error  ~
1198   +--------------------+           +--------------------+
1199                                    ~   outer trailers   ~
1200        SCMP Message                +--------------------+
1201    before encapsulation
1202                                         SCMP Packet
1203                                     after encapsulation
1205 Figure 5: SCMP Message Encapsulation 1207 When a TE generates an SCMP message in response to an SCMP 1208 solicitation or an ordinary SEAL data packet (i.e., a "solicitation 1209 packet"), it sets the outer IP destination and source addresses of 1210 the SCMP packet to the solicitation's source and destination 1211 addresses (respectively). (If the destination address in the 1212 solicitation was multicast, the TE instead sets the outer IP source 1213 address of the SCMP packet to an address assigned to the underlying 1214 IP interface.) The TE then sets the (LINK_ID, NONCE, SEAL_ID)-tuple 1215 and I flag in the SEAL header of the SCMP packet to the same values 1216 that appeared in the solicitation. 1218 When a TE generates an unsolicited SCMP message, it sets the outer IP 1219 destination and source addresses of the SCMP packet the same as it 1220 would for ordinary SEAL data packets. The TE then sets the (LINK_ID, 1221 NONCE, SEAL_ID)-tuple and I flag in the SEAL header of the SCMP 1222 packet to the same values that it would use to send an ordinary SEAL 1223 data packet. 1225 For all SCMP messages, the TE then sets the other flag bits in the 1226 SEAL header to C=1, A=0, R=0, F=1, and M=0.
It next sets the 1227 NEXTHDR/SEG field to 0 and sends the SCMP packet to the tunnel 1228 neighbor. 1230 4.6.1.1. Generating SCMP Packet Too Big (PTB) Messages 1232 An ETE generates an SCMP PTB message under one of the following 1233 cases: 1235 o Case 1: when it receives the IP first fragment (i.e., one with 1236 MF=1 and Offset=0 in the outer IP header) of a SEAL protocol 1237 packet that arrived as multiple IP fragments, or: 1239 o Case 2: when it receives a SEAL protocol data packet with A=1 in 1240 the SEAL header that did not arrive as multiple IP fragments 1241 (i.e., one that does not also match Case 1), or: 1243 o Case 3: when it has already received the SEAL first segment (i.e., 1244 one with F=1 and M=1 in the SEAL header) of a SEAL protocol packet 1245 that arrived as multiple SEAL segments, and accepting the current 1246 segment would cause the size of the reassembled packet to exceed 1247 S_MRU. 1249 The ETE prepares an SCMP PTB message the same as for the 1250 corresponding ICMPv6 PTB message, except that it writes the S_MRU 1251 value for this ITE in the MTU field (i.e., even if the S_MRU value is 1252 0). For cases 1 and 2 above, the packet-in-error field includes the 1253 leading portion of the IP packet or fragment that triggered the 1254 condition. For case 3 above, the packet-in-error field includes the 1255 leading portion of the SEAL first segment, beginning with the 1256 encapsulating outer IP header. 1258 Finally, the ETE writes the value 1, 2 or 3 in the Code field of the 1259 PTB message according to whether the reason for generating the 1260 message was due to the corresponding case number from the list of 1261 cases above. 1263 NOTE CAREFULLY that, unlike cases 1 and 3 above, case 2 is not an 1264 error condition and does not necessarily signify packet loss. 1265 Instead, it is a control plane acknowledgement of a data plane probe. 
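The three cases above can be sketched as a small decision function. This is an illustrative sketch only and not part of the specification: the SealArrival record, its field names, and the ptb_code function are hypothetical, and the order in which Case 2 and Case 3 are tested is an assumption that this document does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SealArrival:
    """Hypothetical summary of one arriving SEAL protocol packet or fragment."""
    ip_mf: bool = False               # MF bit in the outer IP header
    ip_offset: int = 0                # fragment Offset in the outer IP header
    seal_a: bool = False              # A bit in the SEAL header
    first_segment_seen: bool = False  # SEAL first segment (F=1, M=1) already received
    reassembled_size: int = 0         # size the reassembly would reach with this segment


def ptb_code(pkt: SealArrival, s_mru: int) -> Optional[int]:
    """Return the SCMP PTB Code (1, 2 or 3), or None if no PTB is generated."""
    # Case 1: IP first fragment of a packet that arrived as multiple fragments.
    if pkt.ip_mf and pkt.ip_offset == 0:
        return 1
    # Case 2: A=1 and the packet did not arrive as multiple IP fragments.
    if pkt.seal_a and not pkt.ip_mf and pkt.ip_offset == 0:
        return 2
    # Case 3: accepting this segment would push the reassembly past S_MRU.
    if pkt.first_segment_seen and pkt.reassembled_size > s_mru:
        return 3
    return None
```

Because the function returns at most one Code per arrival, a single packet never yields both a Case 1 and a Case 2 message in this sketch.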
1266 NOTE ALSO that the ETE MUST NOT generate both a Case 1 and a Case 2 1267 SCMP PTB message on behalf of the same SEAL segment. 1269 4.6.1.2. Generating SCMP Neighbor Discovery Messages 1271 An ITE generates an SCMP "Neighbor Solicitation" (SNS) or "Router 1272 Solicitation" (SRS) message when it needs to solicit a response from 1273 an ETE. An ETE generates a solicited SCMP "Neighbor Advertisement" 1274 (SNA) or "Router Advertisement" (SRA) message when it receives an 1275 SNS/SRS message. Any TE may also generate unsolicited SNA/SRA 1276 messages that are not triggered by a specific solicitation event. 1278 The TE generates SNS, SNA, SRS and SRA messages the same as described 1279 for the corresponding IPv6 Neighbor Discovery (ND) messages (see: 1280 [RFC4861]). 1282 4.6.1.3. Generating SCMP Redirect Messages 1284 An ETE generates an SCMP "Redirect" message when it receives a SEAL 1285 data packet with R=1 in the SEAL header and needs to inform the ITE 1286 of a better next hop. The ETE generates SCMP Redirect messages the 1287 same as described for IPv6 ND Redirects in [RFC4861], except that it 1288 includes Route Information Options (RIOs) [RFC4191] to inform the ITE 1289 of a better next hop for an entire IP prefix instead of only a single 1290 destination. The SCMP Redirect message therefore supports both 1291 network and host redirection instead of only host redirection. 1293 4.6.1.4. Generating Other SCMP Messages 1295 An ETE generates an SCMP "Destination Unreachable - Communication 1296 with Destination Administratively Prohibited" message when its 1297 association with the ITE is bidirectional and it receives a SEAL 1298 packet with a (LINK_ID, NONCE, SEAL_ID)-tuple that does not 1299 correspond to this ITE (see: Section 4.7). 
1301 An ETE generates an SCMP "Destination Unreachable" message with an 1302 appropriate code under the same circumstances that an IPv6 system 1303 would generate an ICMPv6 Destination Unreachable message using the 1304 same code. The SCMP Destination Unreachable message is formatted the 1305 same as for ICMPv6 Destination Unreachable messages. 1307 An ETE generates an SCMP "Parameter Problem" message when it receives 1308 a SEAL packet with an incorrect value in the SEAL header, and 1309 generates an SCMP "Time Exceeded" message when it garbage collects an 1310 incomplete SEAL data packet reassembly. The message formats used are 1311 the same as for the corresponding ICMPv6 messages. 1313 Generation of all other SCMP message types is outside the scope of 1314 this document. 1316 4.6.2. Processing SCMP Messages 1318 An ITE processes any solicited and error SCMP message it receives as 1319 long as it can verify that the corresponding SCMP packet was sent 1320 from an on-path ETE. The ITE can verify that the SCMP packet came 1321 from an on-path ETE by checking that the (LINK_ID, NONCE, SEAL_ID)- 1322 tuple in the SEAL header of the packet corresponds to one of its 1323 recently-sent SEAL data packets or SCMP solicitation packets. 1325 For each solicited and error SCMP message it receives, the ITE first 1326 verifies that the (LINK_ID, NONCE, SEAL_ID)-tuple is acceptable, then 1327 verifies that the Checksum in the SCMP message header is correct. If 1328 the (LINK_ID, NONCE, SEAL_ID)-tuple and/or Checksum are incorrect, the 1329 ITE discards the message; otherwise, it processes the message the 1330 same as for ordinary ICMPv6 messages. 1332 Any TE may also receive unsolicited SCMP messages (e.g., SNS, SRS, 1333 SNA, SRA, etc.) from the tunnel neighbor. The TE sends SCMP response 1334 messages in response to solicitations, but does not otherwise process 1335 the unsolicited SCMP messages as an indication of tunnel neighbor 1336 liveness. 
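The two-step acceptance check for solicited and error SCMP messages (tuple first, then Checksum) can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, the (LINK_ID, NONCE, SEAL_ID)-tuple is modeled as a plain Python tuple, and the ITE's record of recently-sent packets is modeled as a set. The checksum is the ordinary 16-bit Internet checksum computed over the SCMP message alone, with no pseudo-header, as Section 4.6.1 describes.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum (no pseudo-header)."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_scmp(msg_type: int, code: int, body: bytes) -> bytes:
    """Build Type | Code | Checksum | Message Body (hypothetical helper)."""
    header = bytes([msg_type, code, 0, 0])   # Checksum field is zero while summing
    csum = internet_checksum(header + body)
    return bytes([msg_type, code]) + csum.to_bytes(2, "big") + body


def ite_accepts(scmp: bytes, seal_tuple: tuple, recent_tuples: set) -> bool:
    """Check the tuple first, then the Checksum; discard on any mismatch."""
    if seal_tuple not in recent_tuples:
        return False
    # A valid message sums to zero when the Checksum field is included.
    return internet_checksum(scmp) == 0
```

A message that passes both checks would then be handled the same as the corresponding ICMPv6 message.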
1338 Finally, TEs process solicited and error SCMP messages as an 1339 indication that the tunnel neighbor is responsive, i.e., in the same 1340 manner implied for IPv6 Neighbor Unreachability Detection "hints of 1341 forward progress" (see: [RFC4861]). 1343 4.6.2.1. Processing SCMP PTB Messages 1345 An ITE may receive an SCMP PTB message after it sends a SEAL data 1346 packet to an ETE (see: Section 4.6.1). The packet-in-error within 1347 the PTB message consists of the encapsulating IP/*/SEAL headers 1348 followed by the inner packet in the form in which the ITE received it 1349 prior to SEAL encapsulation. 1351 If the PTB message has Code=2 in the SCMP header, the ITE processes 1352 the message as a response to an explicit probe request and discards 1353 the message. If the PTB has Code=1 or Code=3 in the SCMP header, 1354 however, the ITE processes the message as an indication of an MTU 1355 limitation. 1357 If the PTB has Code=1, the ITE first verifies that the outer IP 1358 header in the packet-in-error encodes an IP first fragment, then 1359 examines the outer IP header length field to determine a new S_MSS 1360 value as follows: 1362 o If the length is no less than 1280, the ITE records the length as 1363 the new S_MSS value. 1365 o If the length is less than the current S_MSS value and also less 1366 than 1280, the ITE can discern that IP fragmentation is occurring 1367 but it cannot determine the true MTU of the restricting link due 1368 to the possibility that a router on the path is generating runt 1369 first fragments. 1371 In this latter case, the ITE may need to search for a reduced S_MSS 1372 value through an iterative searching strategy that parallels the IPv4 1373 Path MTU Discovery "plateau table" procedure in a similar fashion as 1374 described in Section 5 of [RFC1191]. 
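One possible shape for the Code=1 S_MSS update is sketched below. This is an illustration only, under explicit assumptions: the function names are hypothetical, the plateau values are the ones listed in Section 7 of [RFC1191] and are not mandated by this document, and stepping down to the next plateau below the current S_MSS is just one searching strategy an implementation might choose. The case where the length is below 1280 but not below the current S_MSS is not covered by the text; the sketch leaves S_MSS unchanged there.

```python
# Plateau values taken from RFC 1191 Section 7, largest first (assumption:
# an implementation is free to use a different table).
PLATEAUS = [65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68]


def next_lower_plateau(size: int) -> int:
    """Return the largest plateau strictly below size (or the smallest plateau)."""
    for p in PLATEAUS:
        if p < size:
            return p
    return PLATEAUS[-1]


def update_s_mss(outer_len: int, s_mss: int) -> int:
    """Derive a new S_MSS from the outer IP length in a Code=1 packet-in-error."""
    if outer_len >= 1280:
        # The first fragment is large enough to be taken at face value.
        return outer_len
    if outer_len < s_mss:
        # Possible runt first fragment: the true MTU is unknown, so step down.
        return next_lower_plateau(s_mss)
    return s_mss
```

For example, with a current S_MSS of 1500, an outer length of 1400 yields 1400, while an outer length of 256 steps S_MSS down to 1492.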
This searching strategy may 1375 entail multiple iterations in which the ITE sends additional SEAL 1376 data packets using a reduced S_MSS and receives additional SCMP PTB 1377 messages, but the process should quickly converge. During this 1378 process, it is essential that the ITE reduce S_MSS based on the first 1379 SCMP PTB message received under the current S_MSS size, and refrain 1380 from further reducing S_MSS until SCMP PTB messages pertaining to 1381 packets sent under the new S_MSS are received. 1383 For both Code=1 and Code=3 PTB messages, the ITE next records the 1384 value in the MTU field of the SCMP PTB message as the new S_MRU value 1385 for this ETE and examines the inner packet within the packet-in- 1386 error. If the inner packet was unfragmentable (see: Section 4.4.3) 1387 and larger than (MAX(S_MRU, S_MSS) - HLEN), the ITE then sends a 1388 transcribed PTB message appropriate for the inner packet to the 1389 original source with MTU set to (MAX(S_MRU, S_MSS) - HLEN). (In the 1390 case of nested SEAL encapsulations, the transcribed PTB message will 1391 itself be an SCMP PTB message). If the inner packet is fragmentable, 1392 however, the ITE instead reduces its inner fragmentation THRESH 1393 estimate to a size no larger than S_MSS for this ETE (see: Section 1394 4.4.3) and does not send a transcribed PTB. In that case, some 1395 fragmentable packets may be silently discarded but future 1396 fragmentable packets will subsequently undergo inner fragmentation 1397 based on this new THRESH estimate. 1399 The ITE may alternatively ignore the S_MSS and S_MRU values, thus 1400 disabling SEAL-layer segmentation. In that case, the ITE sends all 1401 SEAL-encapsulated packets as single segments and implements stateless 1402 MTU discovery. 
In that case, if the ITE receives an SCMP PTB message 1403 from the ETE with Code=1 and with a too-small length value in the 1404 outer IP header, it can send a translated PTB message back to the 1405 source listing a slightly smaller MTU size than the length value in 1406 the inner IP header. For example, if the ITE receives an SCMP PTB 1407 message with Code=1, outer IP length 256 and inner IP length 1500, it 1408 can send a PTB message listing an MTU of 1400 back to the source. If 1409 the ITE subsequently receives an SCMP PTB message with Code=1, outer 1410 IP length 256 and inner IP length 1400, it can send a PTB message 1411 listing an MTU of 1300 back to the source, etc. 1413 Actual plateau table values for this "step-down" MTU determination 1414 procedure are up to the implementation, which may consult Section 7 1415 of [RFC1191] for non-normative example guidance. 1417 4.6.2.2. Processing SCMP Neighbor Discovery Messages 1419 An ETE may receive SNS/SRS messages from an ITE as the initial leg in 1420 a neighbor discovery exchange. An ITE may also receive both 1421 solicited and unsolicited SNA/SRA messages from an ETE. 1423 The TE processes SNS/SRS and SNA/SRA messages the same as described 1424 for the corresponding IPv6 Neighbor Discovery (ND) messages (see: 1425 [RFC4861]). 1427 4.6.2.3. Processing SCMP Redirect Messages 1429 An ITE may receive SCMP redirect messages after sending a SEAL data 1430 packet with R=1 in the SEAL header to an ETE. The ITE processes any 1431 RIO options in the SCMP redirect message and updates its Forwarding 1432 Information Base (FIB) accordingly. 1434 4.6.2.4. Processing Other SCMP Messages 1436 An ITE may receive an SCMP "Destination Unreachable - Communication 1437 with Destination Administratively Prohibited" message after it sends 1438 a SEAL data packet. The ITE processes the message as an indication 1439 that it needs to (re)synchronize with the ETE (see: Section 4.7). 
1441 An ITE may receive an SCMP "Destination Unreachable" message with an 1442 appropriate code under the same circumstances that an IPv6 node would 1443 receive an ICMPv6 Destination Unreachable message. The ITE processes 1444 the message the same as for the corresponding ICMPv6 Destination 1445 Unreachable messages. 1447 An ITE may receive an SCMP "Parameter Problem" message when the ETE 1448 receives a SEAL packet with an incorrect value in the SEAL header. 1449 The ITE should examine the incorrect SEAL header field setting to 1450 determine whether a different setting should be used in subsequent 1451 packets. 1453 An ITE may receive an SCMP "Time Exceeded" message when the ETE 1454 garbage collects an incomplete SEAL data packet reassembly. The ITE 1455 should consider the message as an indication of congestion. 1457 Processing of all other SCMP message types is outside the scope of 1458 this document. 1460 4.7. Tunnel Endpoint Synchronization 1462 By default, the SEAL ITE retains per-ETE soft state, but the ETE does 1463 not retain per-ITE soft state. In that case, the tunnel neighbor 1464 relationship between the ITE and ETE is said to be "unidirectional", 1465 and the ETE unconditionally accepts any packets coming from the ITE. 1466 When peer TEs need to establish a closer coordination with one 1467 another, however, they can establish a bidirectional tunnel neighbor 1468 relationship to establish both ITE and ETE soft state within both 1469 TEs. 1471 In order to establish a bidirectional tunnel neighbor relationship, 1472 the initiating TE (call it "A") initiates a short transaction with 1473 the responding TE (call it "B") carried by a reliable transport 1474 protocol such as TCP. The protocol details of the transaction are 1475 out of scope for this document, and indeed need not be standardized 1476 as long as both TEs observe the same specifications. 1478 In the transaction, "A" and "B" first authenticate themselves to each 1479 other. 
"A" then selects randomly-generated NONCE(A) and SEAL_ID(A) 1480 values and registers them with "B", while "B" in turn selects 1481 randomly-generated NONCE(B) and SEAL_ID(B) values and registers them 1482 with "A". Both TEs then further select one or more randomly- 1483 generated LINK_IDs (e.g., LINK_ID(A1), LINK_ID(A2), etc.), where each 1484 LINK_ID represents a different underlying link over which the ITE 1485 function of "A" will send tunneled packets to the ETE function of "B" 1486 (and vice-versa). Both TEs then use each such (LINK_ID(i), NONCE, 1487 SEAL_ID)-tuple to establish the appropriate bidirectional tunnel 1488 neighbor soft state (see Sections 4.4.2 and 4.5.2). 1490 Following this bidirectional tunnel neighbor establishment, the 1491 reliable transport transaction between the TEs concludes since the 1492 status of the underlying links is opaque to the transport protocol 1493 and the transport protocol therefore has no means for selecting 1494 alternate underlying links should the path through the primary 1495 underlying link fail. The soft state is then kept alive by the 1496 continued flow of SEAL data packets and/or SCMP messages between the 1497 TEs rather than by higher-layer keepalives of the transport protocol. 1499 Outbound and inbound traffic engineering between bidirectional tunnel 1500 neighbors is therefore coordinated by SCMP from within the tunnel 1501 interface and can remain continuous even if the paths through one or 1502 more of the underlying links have failed. When one TE detects that 1503 most/all underlying link paths to the other TE have failed, however, 1504 it schedules the bidirectional state for garbage collection. 1506 This bidirectional tunnel neighbor establishment is most commonly 1507 initiated by a client TE in establishing a "connection" with a 1508 serving TE, e.g., when a customer router within a home network 1509 establishes a connection with a serving router in a provider network. 1511 5. 
Link Requirements 1513 Subnetwork designers are expected to follow the recommendations in 1514 Section 2 of [RFC3819] when configuring link MTUs. 1516 6. End System Requirements 1518 SEAL provides robust mechanisms for returning PTB messages; however, 1519 end systems that send unfragmentable IP packets larger than 1500 1520 bytes are strongly encouraged to implement their own end-to-end MTU 1521 assurance, e.g., using Packetization Layer Path MTU Discovery per 1522 [RFC4821]. 1524 7. Router Requirements 1526 IPv4 routers within the subnetwork are strongly encouraged to 1527 implement IPv4 fragmentation such that the first fragment is the 1528 largest and approximately the size of the underlying link MTU, i.e., 1529 they should avoid generating runt first fragments. 1531 IPv6 routers within the subnetwork are required to generate the 1532 necessary PTB messages when they drop outer IPv6 packets due to an 1533 MTU restriction. 1535 8. IANA Considerations 1537 The IANA is instructed to allocate an IP protocol number for 1538 'SEAL_PROTO' in the 'protocol-numbers' registry. 1540 The IANA is instructed to allocate a Well-Known Port number for 1541 'SEAL_PORT' in the 'port-numbers' registry. 1543 The IANA is instructed to establish a "SEAL Protocol" registry to 1544 record SEAL Version values. This registry should be initialized to 1545 include the initial SEAL Version number, i.e., Version 0. 1547 9. Security Considerations 1549 Unlike IPv4 fragmentation, overlapping fragment attacks are not 1550 possible due to the requirement that SEAL segments be non- 1551 overlapping. This condition is naturally enforced due to the fact 1552 that each consecutive SEAL segment begins at offset 0 with respect to 1553 the previous SEAL segment. 1555 An amplification/reflection attack is possible when an attacker sends 1556 IP first fragments with spoofed source addresses to an ETE, resulting 1557 in a stream of SCMP messages returned to a victim ITE. 
The (LINK_ID, 1558 NONCE, SEAL_ID)-tuple in the encapsulated segment of the spoofed IP 1559 first fragment provides mitigation for the ITE to detect and discard 1560 spurious SCMP messages. 1562 The SEAL header is sent in-the-clear (outside of any IPsec/ESP 1563 encapsulations) the same as for the outer IP and other outer headers. 1564 In this respect, the threat model is no different than for IPv6 1565 extension headers. As for IPv6 extension headers, the SEAL header is 1566 protected only by L2 integrity checks and is not covered under any L3 1567 integrity checks. 1569 SCMP messages carry the (LINK_ID, NONCE, SEAL_ID)-tuple of the 1570 packet-in-error. Therefore, when an ITE receives an SCMP message it 1571 can unambiguously associate it with the SEAL data packet that 1572 triggered the error. When the TEs are synchronized, the ETE can also 1573 detect off-path spoofing attacks. 1575 Security issues that apply to tunneling in general are discussed in 1576 [I-D.ietf-v6ops-tunnel-security-concerns]. 1578 10. Related Work 1580 Section 3.1.7 of [RFC2764] provides a high-level sketch for 1581 supporting large tunnel MTUs via a tunnel-level segmentation and 1582 reassembly capability to avoid IP level fragmentation, which is in 1583 part the same approach used by SEAL. SEAL could therefore be 1584 considered as a fully functioned manifestation of the method 1585 postulated by that informational reference. 1587 Section 3 of [RFC4459] describes inner and outer fragmentation at the 1588 tunnel endpoints as alternatives for accommodating the tunnel MTU; 1589 however, the SEAL protocol specifies a mid-layer segmentation and 1590 reassembly capability that is distinct from both inner and outer 1591 fragmentation. 1593 Section 4 of [RFC2460] specifies a method for inserting and 1594 processing extension headers between the base IPv6 header and 1595 transport layer protocol data. The SEAL header is inserted and 1596 processed in exactly the same manner. 
1598 The concepts of path MTU determination through the report of 1599 fragmentation and extending the IP Identification field were first 1600 proposed in deliberations of the TCP-IP mailing list and the Path MTU 1601 Discovery Working Group (MTUDWG) during the late 1980's and early 1602 1990's. SEAL supports a report fragmentation capability using bits 1603 in an extension header (the original proposal used a spare bit in the 1604 IP header) and supports ID extension through a 16-bit field in an 1605 extension header (the original proposal used a new IP option). A 1606 historical analysis of the evolution of these concepts, as well as 1607 the development of the eventual path MTU discovery mechanism for IP, 1608 appears in Appendix D of this document. 1610 11. SEAL Advantages over Classical Methods 1612 The SEAL approach offers a number of distinct advantages over the 1613 classical path MTU discovery methods [RFC1191] [RFC1981]: 1615 1. Classical path MTU discovery always results in packet loss when 1616 an MTU restriction is encountered. Using SEAL, IP fragmentation 1617 provides a short-term interim mechanism for ensuring that packets 1618 are delivered while SEAL adjusts its packet sizing parameters. 1620 2. Classical path MTU discovery may require several iterations of dropping 1621 packets and returning PTB messages until an acceptable path MTU 1622 value is determined. Under normal circumstances, SEAL determines 1623 the correct packet sizing parameters in a single iteration. 1625 3. Using SEAL, ordinary packets serve as implicit probes without 1626 exposing data to unnecessary loss. SEAL also provides an 1627 explicit probing mode not available in the classical methods. 1629 4. Using SEAL, ETEs encapsulate SCMP error messages in outer and 1630 mid-layer headers such that packet-filtering network middleboxes 1631 will not filter them the same as for "raw" ICMP messages that may 1632 be generated by an attacker. 1634 5. 
The SEAL approach ensures that the tunnel either delivers or 1635 deterministically drops packets according to their size, which is 1636 a required characteristic of any IP link. 1638 6. Most importantly, all SEAL packets have an Identification field 1639 that is sufficiently long to be used for duplicate packet 1640 detection purposes and to associate ICMP error messages with 1641 actual packets sent without requiring per-packet state; hence, 1642 SEAL avoids certain denial-of-service attack vectors open to the 1643 classical methods. 1645 12. Acknowledgments 1647 The following individuals are acknowledged for helpful comments and 1648 suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Oliver 1649 Bonaventure, Teco Boot, Bob Braden, Brian Carpenter, Steve Casner, 1650 Ian Chakeres, Noel Chiappa, Remi Denis-Courmont, Remi Despres, Ralph 1651 Droms, Aurnaud Ebalard, Gorry Fairhurst, Washam Fan, Dino Farinacci, 1652 Joel Halpern, Sam Hartman, John Heffner, Thomas Henderson, Bob 1653 Hinden, Christian Huitema, Eliot Lear, Darrel Lewis, Joe Macker, Matt 1654 Mathis, Erik Nordmark, Dan Romascanu, Dave Thaler, Joe Touch, Mark 1655 Townsley, Ole Troan, Margaret Wasserman, Magnus Westerlund, Robin 1656 Whittle, James Woodyatt, and members of the Boeing Research & 1657 Technology NST DC&NT group. 1659 Path MTU determination through the report of fragmentation was first 1660 proposed by Charles Lynn on the TCP-IP mailing list in 1987. 1661 Extending the IP identification field was first proposed by Steve 1662 Deering on the MTUDWG mailing list in 1989. 1664 13. References 1666 13.1. Normative References 1668 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1669 September 1981. 1671 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, 1672 RFC 792, September 1981. 1674 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1675 Requirement Levels", BCP 14, RFC 2119, March 1997. 1677 [RFC2460] Deering, S. and R. 
Hinden, "Internet Protocol, Version 6 1678 (IPv6) Specification", RFC 2460, December 1998. 1680 [RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure 1681 Neighbor Discovery (SEND)", RFC 3971, March 2005. 1683 [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control 1684 Message Protocol (ICMPv6) for the Internet Protocol 1685 Version 6 (IPv6) Specification", RFC 4443, March 2006. 1687 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 1688 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 1689 September 2007. 1691 13.2. Informative References 1693 [FOLK] Shannon, C., Moore, D., and k. claffy, "Beyond Folklore: 1694 Observations on Fragmented Traffic", December 2002. 1696 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 1697 October 1987. 1699 [I-D.ietf-intarea-ipv4-id-update] 1700 Touch, J., "Updated Specification of the IPv4 ID Field", 1701 draft-ietf-intarea-ipv4-id-update-01 (work in progress), 1702 October 2010. 1704 [I-D.ietf-v6ops-tunnel-security-concerns] 1705 Krishnan, S., Thaler, D., and J. Hoagland, "Security 1706 Concerns With IP Tunneling", 1707 draft-ietf-v6ops-tunnel-security-concerns-04 (work in 1708 progress), October 2010. 1710 [I-D.russert-rangers] 1711 Russert, S., Fleischman, E., and F. Templin, "RANGER 1712 Scenarios", draft-russert-rangers-05 (work in progress), 1713 July 2010. 1715 [I-D.templin-intarea-vet] 1716 Templin, F., "Virtual Enterprise Traversal (VET)", 1717 draft-templin-intarea-vet-16 (work in progress), 1718 July 2010. 1720 [I-D.templin-iron] 1721 Templin, F., "The Internet Routing Overlay Network 1722 (IRON)", draft-templin-iron-13 (work in progress), 1723 October 2010. 1725 [MTUDWG] "IETF MTU Discovery Working Group mailing list, 1726 gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log, November 1727 1989 - February 1995.". 1729 [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP 1730 MTU discovery options", RFC 1063, July 1988. 
1732 [RFC1070] Hagens, R., Hall, N., and M. Rose, "Use of the Internet as 1733 a subnetwork for experimentation with the OSI network 1734 layer", RFC 1070, February 1989. 1736 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 1737 November 1990. 1739 [RFC1812] Baker, F., "Requirements for IP Version 4 Routers", 1740 RFC 1812, June 1995. 1742 [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery 1743 for IP version 6", RFC 1981, August 1996. 1745 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, 1746 October 1996. 1748 [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in 1749 IPv6 Specification", RFC 2473, December 1998. 1751 [RFC2675] Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms", 1752 RFC 2675, August 1999. 1754 [RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A. 1755 Malis, "A Framework for IP Based Virtual Private 1756 Networks", RFC 2764, February 2000. 1758 [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", 1759 RFC 2923, September 2000. 1761 [RFC3232] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by 1762 an On-line Database", RFC 3232, January 2002. 1764 [RFC3366] Fairhurst, G. and L. Wood, "Advice to link designers on 1765 link Automatic Repeat reQuest (ARQ)", BCP 62, RFC 3366, 1766 August 2002. 1768 [RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D., 1769 Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. 1770 Wood, "Advice for Internet Subnetwork Designers", BCP 89, 1771 RFC 3819, July 2004. 1773 [RFC4191] Draves, R. and D. Thaler, "Default Router Preferences and 1774 More-Specific Routes", RFC 4191, November 2005. 1776 [RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms 1777 for IPv6 Hosts and Routers", RFC 4213, October 2005. 1779 [RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through 1780 Network Address Translations (NATs)", RFC 4380, 1781 February 2006. 
1783 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1784 Network Tunneling", RFC 4459, April 2006. 1786 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 1787 Discovery", RFC 4821, March 2007. 1789 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 1790 Errors at High Data Rates", RFC 4963, July 2007. 1792 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 1793 Mitigations", RFC 4987, August 2007. 1795 [RFC5445] Watson, M., "Basic Forward Error Correction (FEC) 1796 Schemes", RFC 5445, March 2009. 1798 [RFC5720] Templin, F., "Routing and Addressing in Networks with 1799 Global Enterprise Recursion (RANGER)", RFC 5720, 1800 February 2010. 1802 [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. 1804 [SIGCOMM] Luckie, M. and B. Stasiewicz, "Measuring Path MTU 1805 Discovery Behavior", November 2010. 1807 [TBIT] Medina, A., Allman, M., and S. Floyd, "Measuring 1808 Interactions Between Transport Protocols and Middleboxes", 1809 October 2004. 1811 [TCP-IP] "Archive/Hypermail of Early TCP-IP Mail List, 1812 http://www-mice.cs.ucl.ac.uk/multimedia/misc/tcp_ip/, May 1813 1987 - May 1990.". 1815 [WAND] Luckie, M., Cho, K., and B. Owens, "Inferring and 1816 Debugging Path MTU Discovery Failures", October 2005. 1818 Appendix A. Reliability 1820 Although a SEAL tunnel may span an arbitrarily-large subnetwork 1821 expanse, the IP layer sees the tunnel as a simple link that supports 1822 the IP service model. Since SEAL supports segmentation at a layer 1823 below IP, SEAL therefore presents a case in which the link unit of 1824 loss (i.e., a SEAL segment) is smaller than the end-to-end 1825 retransmission unit (e.g., a TCP segment). 1827 Links with high bit error rates (BERs) (e.g., IEEE 802.11) use 1828 Automatic Repeat-ReQuest (ARQ) mechanisms [RFC3366] to increase 1829 packet delivery ratios, while links with much lower BERs typically 1830 omit such mechanisms. 
Since SEAL tunnels may traverse arbitrarily- 1831 long paths over links of various types that are already either 1832 performing or omitting ARQ as appropriate, it would therefore often 1833 be inefficient to also require the tunnel to perform ARQ. 1835 When the SEAL ITE has knowledge that the tunnel will traverse a 1836 subnetwork with non-negligible loss due to, e.g., interference, link 1837 errors, congestion, etc., it can solicit Segment Reports from the ETE 1838 periodically to discover missing segments for retransmission within a 1839 single round-trip time. However, retransmission of missing segments 1840 may require the ITE to maintain considerable state and may also 1841 result in considerable delay variance and packet reordering. 1843 SEAL may also use alternate reliability mechanisms such as Forward 1844 Error Correction (FEC). A simple FEC mechanism may merely entail 1845 gratuitous retransmissions of duplicate data, however more efficient 1846 alternatives are also possible. Basic FEC schemes are discussed in 1848 [RFC5445]. 1850 The use of ARQ and FEC mechanisms for improved reliability are for 1851 further study. 1853 Appendix B. Integrity 1855 Each link in the path over which a SEAL tunnel is configured is 1856 responsible for link layer integrity verification for packets that 1857 traverse the link. As such, when a multi-segment SEAL packet with N 1858 segments is reassembled, its segments will have been inspected by N 1859 independent link layer integrity check streams instead of a single 1860 stream that a single segment SEAL packet of the same size would have 1861 received. Intuitively, a reassembled packet subjected to N 1862 independent integrity check streams of shorter-length segments would 1863 seem to have integrity assurance that is no worse than a single- 1864 segment packet subjected to only a single integrity check steam, 1865 since the integrity check strength diminishes in inverse proportion 1866 with segment length. 
In any case, the link-layer integrity assurance 1867 for a multi-segment SEAL packet is no different than for a multi- 1868 fragment IPv6 packet. 1870 Fragmentation and reassembly schemes must also consider packet- 1871 splicing errors, e.g., when two segments from the same packet are 1872 concatenated incorrectly, when a segment from packet X is reassembled 1873 with segments from packet Y, etc. The primary sources of such errors 1874 include implementation bugs and wrapping IP ID fields. In terms of 1875 implementation bugs, the SEAL segmentation and reassembly algorithm 1876 is much simpler than IP fragmentation, resulting in simplified 1877 implementations. In terms of wrapping ID fields, when IPv4 is used 1878 as the outer IP protocol, the 16-bit IP ID field can wrap with only 1879 64K packets with the same (src, dst, protocol)-tuple alive in the 1880 system at a given time [RFC4963], increasing the likelihood of 1881 reassembly mis-associations. However, SEAL ensures that any outer 1882 IPv4 fragmentation and reassembly will be short-lived and tuned out 1883 as soon as the ITE receives a Reassembly Report, and SEAL segmentation 1884 and reassembly uses a much longer ID field. Therefore, reassembly 1885 mis-associations of either IP fragments or SEAL segments should be 1886 exceedingly rare. 1888 Appendix C. Transport Mode 1890 SEAL can also be used in "transport-mode", e.g., when the inner layer 1891 comprises upper-layer protocol data rather than an encapsulated IP 1892 packet. For instance, TCP peers can negotiate the use of SEAL for 1893 the carriage of protocol data encapsulated as IPv4/SEAL/TCP. In this 1894 sense, the "subnetwork" becomes the entire end-to-end path between 1895 the TCP peers and may potentially span the entire Internet. 1897 Section 4 specifies the operation of SEAL in "tunnel mode", i.e., when 1898 there are both an inner and outer IP layer with a SEAL encapsulation 1899 layer between. 
However, the SEAL protocol can also be used in a "transport mode" of
operation within a subnetwork region in which the inner layer
corresponds to a transport layer protocol (e.g., UDP or TCP) instead
of an inner IP layer.

For example, two TCP endpoints connected to the same subnetwork
region can negotiate the use of transport-mode SEAL for a connection
by inserting a 'SEAL_OPTION' TCP option during the connection
establishment phase. If both TCPs agree on the use of SEAL, their
protocol messages will be carried as TCP/SEAL/IPv4 and the connection
will be serviced by the SEAL protocol using TCP (instead of an
encapsulating tunnel endpoint) as the transport layer protocol. The
SEAL protocol for transport mode otherwise observes the same
specifications as for Section 4.

Appendix D. Historic Evolution of PMTUD

The topic of Path MTU discovery (PMTUD) saw a flurry of discussion
and numerous proposals in the late 1980s through early 1990. The
initial problem was posed by Art Berggreen on May 22, 1987 in a
message to the TCP-IP discussion group [TCP-IP]. The discussion that
followed provided significant reference material for [FRAG]. An IETF
Path MTU Discovery Working Group [MTUDWG] was formed in late 1989
with a charter to produce an RFC. Several variations on a very few
basic proposals were entertained, including:

1. Routers record the PMTU estimate in ICMP-like path probe
   messages (proposed in [FRAG] and later [RFC1063])

2. The destination reports any fragmentation that occurs for packets
   received with the "RF" (Report Fragmentation) bit set (Steve
   Deering's 1989 adaptation of Charles Lynn's Nov. 1987 proposal)

3. A hybrid combination of 1) and Charles Lynn's Nov. 1987 proposal
   (straw RFC draft by McCloghrie, Fox and Mogul on Jan 12, 1990)

4. Combination of the Lynn proposal with TCP (Fred Bohle, Jan 30,
   1990)

5. Fragmentation avoidance by setting the "IP_DF" flag on all packets
   and retransmitting if ICMPv4 "fragmentation needed" messages
   occur (Geof Cooper's 1987 proposal; later adapted into [RFC1191]
   by Mogul and Deering).

Option 1) seemed attractive to the group at the time, since it was
believed that routers would migrate more quickly than hosts. Option
2) was a strong contender, but repeated attempts to secure an "RF"
bit in the IPv4 header from the IESG failed and the proponents became
discouraged. Option 3) was abandoned because it was perceived as too
complicated, and option 4) never received any apparent serious
consideration. Proposal 5) was a late entry into the discussion from
Steve Deering on Feb. 24th, 1990. The discussion group soon
thereafter seemingly lost track of all other proposals and adopted
5), which eventually evolved into [RFC1191] and later [RFC1981].

In retrospect, the "RF" bit postulated in 2) is not needed if a
"contract" is first established between the peers, as in proposal 4)
and a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU on
Feb 19, 1990. These proposals saw little discussion or rebuttal, and
were dismissed based on the following assertions:

o routers upgrade their software faster than hosts

o PCs could not reassemble fragmented packets

o Proteon and Wellfleet routers did not reproduce the "RF" bit
  properly in fragmented packets

o Ethernet-FDDI bridges would need to perform fragmentation (i.e.,
  "translucent" not "transparent" bridging)

o the 16-bit IP_ID field could wrap around and disrupt reassembly at
  high packet arrival rates

The first four assertions, although perhaps valid at the time, have
been overcome by historical events. The final assertion is addressed
by the mechanisms specified in SEAL.

Author's Address

Fred L. Templin (editor)
Boeing Research & Technology
P.O. Box 3707
Seattle, WA 98124
USA

Email: fltemplin@acm.org