idnits 2.17.1 draft-templin-intarea-seal-34.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 24, 2011) is 4566 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC3971' is defined on line 1129, but no explicit reference was found in the text == Unused Reference: 'RFC4861' is defined on line 1136, but no explicit reference was found in the text == Unused Reference: 'I-D.templin-aero' is defined on line 1163, but no explicit reference was found in the text == Unused Reference: 'RFC2675' is defined on line 1204, but no explicit reference was found in the text == Unused Reference: 'RFC4191' is defined on line 1226, but no explicit reference was found in the text == Unused Reference: 'RFC4987' is defined on line 1251, but no explicit reference was found in the text == Unused Reference: 'RFC5445' is defined on line 1257, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) == Outdated reference: A later version (-07) exists of draft-ietf-intarea-ipv4-id-update-04 == Outdated reference: A later version (-06) exists of draft-ietf-savi-framework-05 == Outdated reference: A later version (-12) exists of draft-templin-aero-04 == Outdated reference: A later version (-40) exists of draft-templin-intarea-vet-27 == Outdated reference: A later version (-16) exists of draft-templin-ironbis-06 -- Obsolete informational reference (is this intentional?): RFC 1063 (Obsoleted by RFC 1191) -- Obsolete informational reference (is this intentional?): RFC 1981 (Obsoleted by RFC 8201) -- Obsolete informational reference (is this intentional?): RFC 5246 (Obsoleted by RFC 8446) Summary: 1 error (**), 0 flaws (~~), 14 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group F. Templin, Ed. 3 Internet-Draft Boeing Research & Technology 4 Intended status: Standards Track October 24, 2011 5 Expires: April 26, 2012 7 The Subnetwork Encapsulation and Adaptation Layer (SEAL) 8 draft-templin-intarea-seal-34.txt 10 Abstract 12 For the purpose of this document, a subnetwork is defined as a 13 virtual topology configured over a connected IP network routing 14 region and bounded by encapsulating border nodes. These virtual 15 topologies are manifested by tunnels that may span multiple IP and/or 16 sub-IP layer forwarding hops, and can introduce failure modes due to 17 packet duplication and/or links with diverse Maximum Transmission 18 Units (MTUs). This document specifies a Subnetwork Encapsulation and 19 Adaptation Layer (SEAL) that accommodates such virtual topologies 20 over diverse underlying link technologies. 22 Status of this Memo 24 This Internet-Draft is submitted in full conformance with the 25 provisions of BCP 78 and BCP 79. 27 Internet-Drafts are working documents of the Internet Engineering 28 Task Force (IETF). Note that other groups may also distribute 29 working documents as Internet-Drafts. The list of current Internet- 30 Drafts is at http://datatracker.ietf.org/drafts/current/. 32 Internet-Drafts are draft documents valid for a maximum of six months 33 and may be updated, replaced, or obsoleted by other documents at any 34 time. It is inappropriate to use Internet-Drafts as reference 35 material or to cite them other than as "work in progress." 37 This Internet-Draft will expire on April 26, 2012. 39 Copyright Notice 41 Copyright (c) 2011 IETF Trust and the persons identified as the 42 document authors. All rights reserved. 
44 This document is subject to BCP 78 and the IETF Trust's Legal 45 Provisions Relating to IETF Documents 46 (http://trustee.ietf.org/license-info) in effect on the date of 47 publication of this document. Please review these documents 48 carefully, as they describe your rights and restrictions with respect 49 to this document. Code Components extracted from this document must 50 include Simplified BSD License text as described in Section 4.e of 51 the Trust Legal Provisions and are provided without warranty as 52 described in the Simplified BSD License.
54 Table of Contents
56    1. Introduction
57      1.1. Motivation
58      1.2. Approach
59    2. Terminology and Requirements
60    3. Applicability Statement
61    4. SEAL Specification
62      4.1. VET Interface Model
63      4.2. SEAL Model of Operation
64      4.3. SEAL Header Format
65      4.4. ITE Specification
66        4.4.1. Tunnel Interface Soft State
67        4.4.2. Tunnel Interface MTU
68        4.4.3. Submitting Packets for Encapsulation
69        4.4.4. SEAL Encapsulation
70        4.4.5. Outer Encapsulation
71        4.4.6. Probing Strategy
72        4.4.7. Processing ICMP Messages
73      4.5. ETE Specification
74        4.5.1. Tunnel Interface Soft State
75        4.5.2. Reassembly Buffer Requirements
76        4.5.3. IP-Layer Reassembly
77        4.5.4. Decapsulation and Re-Encapsulation
78      4.6. The SEAL Control Message Protocol (SCMP)
79        4.6.1. Generating SCMP Error Messages
80        4.6.2. Processing SCMP Error Messages
81    5. Link Requirements
82    6. End System Requirements
83    7. Router Requirements
84    8. IANA Considerations
85    9. Security Considerations
86    10. Related Work
87    11. Acknowledgments
88    12. References
89      12.1. Normative References
90      12.2. Informative References
91    Appendix A. Reliability
92    Appendix B. Integrity
93    Appendix C. Transport Mode
94    Appendix D. Historic Evolution of PMTUD
95    Author's Address
97 1. Introduction 99 As Internet technology and communication have grown and matured, many 100 techniques have developed that use virtual topologies (including 101 tunnels of one form or another) over an actual network that supports 102 the Internet Protocol (IP) [RFC0791][RFC2460]. Those virtual 103 topologies have elements that appear as one hop in the virtual 104 topology, but are actually multiple IP or sub-IP layer hops. These 105 multiple hops often have quite diverse properties that are often not 106 even visible to the endpoints of the virtual hop.
This introduces 107 failure modes that are not dealt with well in current approaches. 109 The use of IP encapsulation (also known as "tunneling") has long been 110 considered as the means for creating such virtual topologies. 111 However, the insertion of an outer IP header reduces the effective 112 path MTU visible to the inner network layer. When IPv4 is used, this 113 reduced MTU can be accommodated through the use of IPv4 114 fragmentation, but unmitigated in-the-network fragmentation has been 115 found to be harmful through operational experience and studies 116 conducted over the course of many years [FRAG][FOLK][RFC4963]. 117 Additionally, classical path MTU discovery [RFC1191] has known 118 operational issues that are exacerbated by in-the-network tunnels 119 [RFC2923][RFC4459]. The following subsections present further 120 details on the motivation and approach for addressing these issues. 122 1.1. Motivation 124 Before discussing the approach, it is necessary to first understand 125 the problems. In both the Internet and private-use networks today, 126 IPv4 is ubiquitously deployed as the Layer 3 protocol. The two 127 primary functions of IPv4 are to provide for 1) addressing, and 2) a 128 fragmentation and reassembly capability used to accommodate links 129 with diverse MTUs. While it is well known that the IPv4 address 130 space is rapidly becoming depleted, there is a lesser-known but 131 growing consensus that other IPv4 protocol limitations have already 132 or may soon become problematic. 134 First, the IPv4 header Identification field is only 16 bits in 135 length, meaning that at most 2^16 unique packets with the same 136 (source, destination, protocol)-tuple may be active in the Internet 137 at a given time [I-D.ietf-intarea-ipv4-id-update]. 
Due to the 138 escalating deployment of high-speed links, however, this number may 139 soon become too small by several orders of magnitude for high data 140 rate packet sources such as tunnel endpoints [RFC4963]. Furthermore, 141 there are many well-known limitations pertaining to IPv4 142 fragmentation and reassembly - even to the point that it has been 143 deemed "harmful" in both classic and modern-day studies (see above). 144 In particular, IPv4 fragmentation raises issues ranging from minor 145 annoyances (e.g., in-the-network router fragmentation [RFC1981]) to 146 the potential for major integrity issues (e.g., mis-association of 147 the fragments of multiple IP packets during reassembly [RFC4963]). 149 As a result of these perceived limitations, a fragmentation-avoiding 150 technique for discovering the MTU of the forward path from a source 151 to a destination node was devised through the deliberations of the 152 Path MTU Discovery Working Group (PMTUDWG) during the late 1980's 153 through early 1990's (see Appendix D). In this method, the source 154 node provides explicit instructions to routers in the path to discard 155 the packet and return an ICMP error message if an MTU restriction is 156 encountered. However, this approach has several serious shortcomings 157 that lead to an overall "brittleness" [RFC2923]. 159 In particular, site border routers in the Internet are being 160 configured more and more to discard ICMP error messages coming from 161 the outside world. This is due in large part to the fact that 162 malicious spoofing of error messages in the Internet is trivial since 163 there is no way to authenticate the source of the messages [RFC5927]. 164 Furthermore, when a source node that requires ICMP error message 165 feedback when a packet is dropped due to an MTU restriction does not 166 receive the messages, a path MTU-related black hole occurs. 
This 167 means that the source will continue to send packets that are too 168 large and never receive an indication from the network that they are 169 being discarded. This behavior has been confirmed through documented 170 studies showing clear evidence of path MTU discovery failures in the 171 Internet today [TBIT][WAND][SIGCOMM]. 173 The issues with both IPv4 fragmentation and this "classical" method 174 of path MTU discovery are exacerbated further when IP tunneling is 175 used [RFC4459]. For example, an ingress tunnel endpoint (ITE) may be 176 required to forward encapsulated packets into the subnetwork on 177 behalf of hundreds, thousands, or even more original sources within 178 the end site that it serves. If the ITE allows IPv4 fragmentation on 179 the encapsulated packets, persistent fragmentation could lead to 180 undetected data corruption due to Identification field wrapping. If 181 the ITE instead uses classical IPv4 path MTU discovery, it may be 182 inconvenienced by excessive ICMP error messages coming from the 183 subnetwork that may be either suspect or contain insufficient 184 information for translation into error messages to be returned to the 185 original sources. 187 Although recent works have led to the development of a robust end-to- 188 end MTU determination scheme [RFC4821], they do not excuse tunnels 189 from delivering path MTU discovery feedback when packets are lost due 190 to size restrictions. Moreover, in current practice existing 191 tunneling protocols mask the MTU issues by selecting a "lowest common 192 denominator" MTU that may be much smaller than necessary for most 193 paths and difficult to change at a later date. Therefore, a new 194 approach to accommodate tunnels over links with diverse MTUs is 195 necessary. 197 1.2. Approach 199 For the purpose of this document, a subnetwork is defined as a 200 virtual topology configured over a connected network routing region 201 and bounded by encapsulating border nodes. 
Example connected network 202 routing regions include Mobile Ad hoc Networks (MANETs), enterprise 203 networks and the global public Internet itself. Subnetwork border 204 nodes forward unicast and multicast packets over the virtual topology 205 across multiple IP and/or sub-IP layer forwarding hops that may 206 introduce packet duplication and/or traverse links with diverse 207 Maximum Transmission Units (MTUs). 209 This document introduces a Subnetwork Encapsulation and Adaptation 210 Layer (SEAL) for tunneling network layer protocols (e.g., IP, OSI, 211 etc.) over IP subnetworks that connect Ingress and Egress Tunnel 212 Endpoints (ITEs/ETEs) of border nodes. It provides a modular 213 specification designed to be tailored to specific associated 214 tunneling protocols. A transport-mode of operation is also possible, 215 and described in Appendix C. 217 SEAL provides a minimal mid-layer encapsulation that accommodates 218 links with diverse MTUs and allows routers in the subnetwork to 219 perform efficient duplicate packet detection. The encapsulation 220 further ensures packet header integrity, data origin authentication 221 and anti-replay [I-D.ietf-savi-framework][RFC4302]. 223 SEAL treats tunnels that traverse the subnetwork as ordinary links 224 that must support network layer services. Moreover, SEAL provides 225 dynamic mechanisms to ensure a maximal per-destination path MTU over 226 the tunnel. This is in contrast to static approaches which avoid MTU 227 issues by selecting a lowest common denominator MTU value that may be 228 overly conservative for the vast majority of tunnel paths and 229 difficult to change even when larger MTUs become available. 231 The following sections provide the SEAL normative specifications, 232 while the appendices present non-normative additional considerations. 234 2. 
Terminology and Requirements 236 The following terms are defined within the scope of this document: 238 subnetwork 239 a virtual topology configured over a connected network routing 240 region and bounded by encapsulating border nodes. 242 Ingress Tunnel Endpoint 243 a virtual interface over which an encapsulating border node (host 244 or router) sends encapsulated packets into the subnetwork. 246 Egress Tunnel Endpoint 247 a virtual interface over which an encapsulating border node (host 248 or router) receives encapsulated packets from the subnetwork. 250 inner packet 251 an unencapsulated network layer protocol packet (e.g., IPv6 252 [RFC2460], IPv4 [RFC0791], OSI/CLNP [RFC1070], etc.) before any 253 outer encapsulations are added. Internet protocol numbers that 254 identify inner packets are found in the IANA Internet Protocol 255 registry [RFC3232]. 257 outer IP packet 258 a packet resulting from adding an outer IP header (and possibly 259 other outer headers) to a SEAL-encapsulated inner packet. 261 packet-in-error 262 the leading portion of an invoking data packet encapsulated in the 263 body of an error control message (e.g., an ICMPv4 [RFC0792] error 264 message, an ICMPv6 [RFC4443] error message, etc.). 266 Packet Too Big (PTB) 267 a control plane message indicating an MTU restriction (e.g., an 268 ICMPv6 "Packet Too Big" message [RFC4443], an ICMPv4 269 "Fragmentation Needed" message [RFC0792], etc.). 271 IP 272 used to generically refer to either IP protocol version, i.e., 273 IPv4 or IPv6. 
275 The following abbreviations correspond to terms used within this 276 document and/or elsewhere in common Internetworking nomenclature: 278 DF - the IPv4 header "Don't Fragment" flag [RFC0791] 280 ETE - Egress Tunnel Endpoint 282 HLEN - the length of the SEAL header plus outer headers and 283 trailers 284 ITE - Ingress Tunnel Endpoint 286 MTU - Maximum Transmission Unit 288 SCMP - the SEAL Control Message Protocol 290 SDU - SCMP Destination Unreachable message 292 SPP - SCMP Parameter Problem message 294 SPTB - SCMP Packet Too Big message 296 SEAL - Subnetwork Encapsulation and Adaptation Layer 298 SEAL_PORT - a transport-layer service port number used for SEAL 300 SEAL_PROTO - an IPv4 protocol number used for SEAL 302 TE - Tunnel Endpoint (i.e., either ingress or egress) 304 VET - Virtual Enterprise Traversal 306 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 307 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 308 document are to be interpreted as described in [RFC2119]. When used 309 in lower case (e.g., must, must not, etc.), these words MUST NOT be 310 interpreted as described in [RFC2119], but are rather interpreted as 311 they would be in common English. 313 3. Applicability Statement 315 SEAL was originally motivated by the specific case of subnetwork 316 abstraction for Mobile Ad hoc Networks (MANETs), however it soon 317 became apparent that the domain of applicability also extends to 318 subnetwork abstractions over enterprise networks, ISP networks, SOHO 319 networks, the global public Internet itself, and any other connected 320 network routing region. 
SEAL along with the Virtual Enterprise 321 Traversal (VET) [I-D.templin-intarea-vet] tunnel virtual interface 322 abstraction are the functional building blocks for a new 323 Internetworking architecture based on Routing and Addressing in 324 Networks with Global Enterprise Recursion (RANGER) [RFC5720][RFC6139] 325 and the Internet Routing Overlay Network (IRON) 326 [I-D.templin-ironbis]. 328 SEAL provides a network sublayer for encapsulation of an inner 329 network layer packet within outer encapsulating headers. SEAL can 330 also be used as a sublayer within a transport layer protocol data 331 payload, where transport layer encapsulation is typically used for 332 Network Address Translator (NAT) traversal as well as operation over 333 subnetworks that give preferential treatment to certain "core" 334 Internet protocols (e.g., TCP, UDP, etc.). The SEAL header is 335 processed the same as for IPv6 extension headers, i.e., it is not 336 part of the outer IP header but rather allows for the creation of an 337 arbitrarily extensible chain of headers in the same way that IPv6 338 does. 340 To accommodate MTU diversity, the Egress Tunnel Endpoint (ETE) acts 341 as a passive observer that simply informs the Ingress Tunnel Endpoint 342 (ITE) of any packet size limitations. This allows the ITE to return 343 appropriate path MTU discovery feedback to the previous hop on the 344 path toward the original source even if the network path between the 345 ITE and ETE filters ICMP messages. 347 SEAL further ensures packet header integrity, data origin 348 authentication and anti-replay 349 [I-D.ietf-savi-framework][RFC4301][RFC4302]. The SEAL encapsulation 350 in many respects is simply a lightweight version of the IP Security 351 (IPsec) Authentication Payload (AUTH), however its purpose is to 352 provide minimal authenticating services along multiple hops of a 353 bridged segment within a path while leaving data integrity services 354 as an end-to-end consideration. 356 4. 
SEAL Specification 358 The following sections specify the operation of SEAL: 360 4.1. VET Interface Model 362 SEAL is an encapsulation sublayer used within VET non-broadcast, 363 multiple access (NBMA) tunnel virtual interfaces. Each VET interface 364 connects an ITE to one or more ETE "neighbors" via tunneling across 365 an underlying subnetwork. The tunnel neighbor relationship between 366 the ITE and each ETE may be either unidirectional or bidirectional. 368 A unidirectional tunnel neighbor relationship allows the near end ITE 369 to send data packets forward to the far end ETE, while the ETE only 370 returns control messages when necessary. A bidirectional tunnel 371 neighbor relationship is one over which both TEs can exchange both 372 data and control messages. 374 Implications of the VET unidirectional and bidirectional models for 375 SEAL are discussed in [I-D.templin-intarea-vet]. 377 4.2. SEAL Model of Operation 379 SEAL-enabled ITEs encapsulate each inner packet in a SEAL header and 380 any outer encapsulations as shown in Figure 1:
382                                +--------------------+
383                                ~   outer IP header  ~
384                                +--------------------+
385                                ~  other outer hdrs  ~
386                                +--------------------+
387                                ~    SEAL Header     ~
388    +--------------------+      +--------------------+
389    |                    |  --> |                    |
390    ~        Inner       ~  --> ~        Inner       ~
391    ~       Packet       ~  --> ~       Packet       ~
392    |                    |  --> |                    |
393    +--------------------+      +--------------------+
394                                ~  outer trailers    ~
395                                +--------------------+
397 Figure 1: SEAL Encapsulation
399 The ITE inserts the SEAL header according to the specific tunneling 400 protocol. For simple encapsulation of an inner network layer packet 401 within an outer IP header (e.g., 402 [RFC1070][RFC2003][RFC2473][RFC4213], etc.), the ITE inserts the SEAL 403 header between the inner packet and outer IP headers as: IP/SEAL/ 404 {inner packet}.
406 For encapsulations over transports such as UDP (e.g., in the same 407 manner as for [RFC4380]), the ITE inserts the SEAL header between the 408 outer transport layer header and the inner packet, e.g., as IP/UDP/ 409 SEAL/{inner packet}. (Here, the UDP header is seen as an "other 410 outer header" as depicted in Figure 1.) 412 The following sections specify the SEAL header format and SEAL- 413 related operations of the ITE and ETE. 415 4.3. SEAL Header Format 417 The SEAL header is formatted as follows:
419     0                   1                   2                   3
420     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
421    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
422    |VER|C|A|R| RSV |    NEXTHDR    |    PREFLEN    |    LINK_ID    |
423    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
424    |                            PKT_ID                             |
425    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
426    |                           Checksum                            |
427    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
429 Figure 2: SEAL Header Format
431 where the header fields are defined as: 433 VER (2) 434 a 2-bit version field. This document specifies Version 0 of the 435 SEAL protocol, i.e., the VER field encodes the value 0. 437 C (1) 438 the "Control/Data" bit. Set to 1 by the ITE in SEAL Control 439 Message Protocol (SCMP) control messages, and set to 0 in ordinary 440 data packets. 442 A (1) 443 the "Acknowledgement Requested" bit. Set to 1 by the ITE in SEAL 444 data packets for which it wishes to receive an explicit 445 acknowledgement from the ETE. 447 R (1) 448 the "Redirect" bit. For data packets, set to 1 by the ITE to 449 inform the ETE that the source is accepting redirects (see: 450 [I-D.templin-intarea-vet]). 452 RSV (3) 453 a 3-bit Reserved field. Must be set to 0 for this version of the 454 SEAL specification. 456 NEXTHDR (8) an 8-bit field that encodes the next header Internet 457 Protocol number the same as for the IPv4 protocol and IPv6 next 458 header fields.
460 PREFLEN (8) an 8-bit field that encodes the length of the prefix to 461 be applied to the source address of inner packets. 463 LINK_ID (8) 464 an 8-bit link identification value, set to a unique value by the 465 ITE for each underlying link over which it will send encapsulated 466 packets to ETEs. 468 PKT_ID (32) 469 a 32-bit per-packet identification field. Set to a monotonically- 470 incrementing 32-bit value for each SEAL packet transmitted. 472 Checksum (32) 473 a 32-bit Checksum that covers the first 128 bytes of the packet 474 beginning with the SEAL header. The value 128 is chosen so that at 475 least the SEAL header as well as the inner packet network and 476 transport layer headers are covered by the checksum. 478 Setting of the various bits and fields of the SEAL header is 479 specified in the following sections. 481 4.4. ITE Specification 483 4.4.1. Tunnel Interface Soft State 485 The ITE maintains a per-ETE checksum calculation algorithm and secret 486 key to verify the Checksum in the SEAL header. 488 4.4.2. Tunnel Interface MTU 490 The tunnel interface must present a constant MTU value to the inner 491 network layer as the size for admission of inner packets into the 492 interface. Since VET NBMA tunnel virtual interfaces may support a 493 large set of ETEs that accept widely varying maximum packet sizes, 494 however, a number of factors should be taken into consideration when 495 selecting a tunnel interface MTU. 497 Due to the ubiquitous deployment of standard Ethernet and similar 498 networking gear, the nominal Internet cell size has become 1500 499 bytes; this is the de facto size that end systems have come to expect 500 will either be delivered by the network without loss due to an MTU 501 restriction on the path or a suitable ICMP Packet Too Big (PTB) 502 message returned.
When large packets sent by end systems incur 503 additional encapsulation at an ITE, however, they may be dropped 504 silently within the tunnel since the network may not always deliver 505 the necessary PTBs [RFC2923]. 507 The ITE should therefore set a tunnel interface MTU of at least 1500 508 bytes plus extra room to accommodate any additional encapsulations 509 that may occur on the path from the original source. The ITE can 510 also set smaller MTU values; however, care must be taken not to set 511 so small a value that original sources would experience an MTU 512 underflow. In particular, IPv6 sources must see a minimum path MTU 513 of 1280 bytes, and IPv4 sources should see a minimum path MTU of 576 514 bytes. 516 The ITE can alternatively set an indefinite MTU on the tunnel 517 interface such that all inner packets are admitted into the interface 518 without regard to size. For ITEs that host applications that use the 519 tunnel interface directly, this option must be carefully coordinated 520 with protocol stack upper layers since some upper layer protocols 521 (e.g., TCP) derive their packet sizing parameters from the MTU of the 522 outgoing interface and as such may select too large an initial size. 523 This is not a problem for upper layers that use conservative initial 524 maximum segment size estimates and/or when the tunnel interface can 525 reduce the upper layer's maximum segment size, e.g., by reducing the 526 size advertised in the MSS option of outgoing TCP messages. 528 The inner network layer protocol consults the tunnel interface MTU 529 when admitting a packet into the interface. For non-SEAL inner IPv4 530 packets with the IPv4 Don't Fragment (DF) bit set to 0, if the packet 531 is larger than the tunnel interface MTU the inner IPv4 layer uses 532 IPv4 fragmentation to break the packet into fragments no larger than 533 the tunnel interface MTU. The ITE then admits each fragment into the 534 interface as an independent packet. 
536 For all other inner packets, the inner network layer admits the 537 packet if it is no larger than the tunnel interface MTU; otherwise, 538 it drops the packet and sends a PTB error message to the source with 539 the MTU value set to the tunnel interface MTU. The message contains 540 as much of the invoking packet as possible without the entire message 541 exceeding the network layer minimum MTU (e.g., 576 bytes for IPv4, 542 1280 bytes for IPv6, etc.). 544 In light of the above considerations, the ITE SHOULD configure an 545 indefinite MTU on tunnel *router* interfaces. The ITE MAY instead 546 set a finite MTU on tunnel *host* interfaces. 548 4.4.3. Submitting Packets for Encapsulation 550 The ITE maintains HLEN as the sum of the lengths of the SEAL header 551 and any outer headers and trailers. The ITE must include the length 552 of the uncompressed outer headers and trailers when calculating HLEN 553 even if the tunnel is using header compression. The ITE then 554 prepares each inner packet/fragment admitted into the tunnel 555 interface for encapsulation according to its length. 557 For IPv4 inner packets with DF=0 in the IPv4 header, the ITE 558 fragments the packet into IPv4 fragments of a length that (when added 559 to HLEN) is unlikely to incur additional fragmentation on the path to 560 the ETE. (It is crucial that the ITE be conservative in its 561 selection of an inner fragment size, since the ETE will discard any 562 packet that arrives as multiple IPv4 fragments after reassembly.) The 563 ITE then submits each fragment for SEAL encapsulation as specified in 564 Section 4.4.4. 566 For all other inner packets, the ITE checks whether the length of the 567 packet plus HLEN is larger than the MTU of the outgoing interface. 568 If the packet is not too large, the ITE submits it for SEAL 569 encapsulation as specified in Section 4.4.4. Otherwise, the ITE 570 sends a PTB message toward the source address of the inner packet.
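The size checks of Section 4.4.3 can be illustrated with a minimal Python sketch. It is not part of the specification. The names submit_for_encapsulation, ipv4_fragment_lengths and safe_frag_len are hypothetical, the inner IPv4 header is assumed to carry no options (20 bytes), and the "conservative" inner fragment size is left to the caller because the text does not fix a value.

```python
IPV4_HEADER_LEN = 20  # assumption: inner IPv4 header without options


def ipv4_fragment_lengths(total_len, max_frag_len):
    """Return the total lengths of the IPv4 fragments of one packet.

    Every fragment except the last carries a payload that is a
    multiple of 8 bytes, as IPv4 fragment offsets require.
    """
    payload = total_len - IPV4_HEADER_LEN
    per_frag = (max_frag_len - IPV4_HEADER_LEN) // 8 * 8
    if per_frag <= 0:
        raise ValueError("max_frag_len too small to carry any payload")
    lengths = []
    while payload > per_frag:
        lengths.append(IPV4_HEADER_LEN + per_frag)
        payload -= per_frag
    lengths.append(IPV4_HEADER_LEN + payload)
    return lengths


def submit_for_encapsulation(inner_len, is_ipv4, df, hlen, link_mtu,
                             safe_frag_len):
    """Decide what the ITE does with one admitted inner packet.

    Returns ("encapsulate", [lengths]) or ("ptb", []).
    """
    if is_ipv4 and not df:
        # DF=0: fragment conservatively, then encapsulate each fragment.
        return ("encapsulate",
                ipv4_fragment_lengths(inner_len, safe_frag_len))
    if inner_len + hlen <= link_mtu:
        return ("encapsulate", [inner_len])
    # Too large once HLEN is added: drop and signal Packet Too Big.
    return ("ptb", [])
```

For instance, with a hypothetical HLEN of 40 bytes and a 1500-byte outgoing link, a 1500-byte IPv6 inner packet yields a PTB, while a 1500-byte IPv4 packet with DF=0 and a 576-byte fragment target is split into fragments of 572, 572 and 396 bytes.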
572 To send the PTB message, the ITE first checks its forwarding tables 573 to discover the previous hop toward the source address of the inner 574 packet. If the previous hop is reached via the same tunnel 575 interface, the ITE sends an SCMP PTB (SPTB) message to the previous 576 hop (see: Section 4.6.1). Otherwise, the ITE sends a PTB message 577 appropriate to the inner protocol version back to the source. (In 578 both cases, the ITE sets the MTU field in the (S)PTB message to the 579 MTU of the underlying interface minus HLEN.) The ITE then discards 580 the packet. 582 4.4.4. SEAL Encapsulation 584 The ITE next encapsulates the inner packet in a SEAL header formatted 585 as specified in Section 4.3. The ITE sets NEXTHDR to the Internet 586 Protocol number corresponding to the encapsulated inner packet. For 587 example, the ITE sets NEXTHDR to the value '4' for encapsulated IPv4 588 packets [RFC2003], the value '41' for encapsulated IPv6 packets 589 [RFC2473][RFC4213], the value '80' for encapsulated OSI packets 590 [RFC1070], etc. 592 The ITE then sets PREFLEN to the length of the prefix to be applied 593 to the inner source address. The ITE's claimed PREFLEN is subject to 594 verification by the ETE; hence, the ITE must not advertise a length 595 that it is not authorized to use. Next, the ITE sets R=1 if 596 redirects are permitted (see: [I-D.templin-intarea-vet]). (Note that 597 if this process is entered via re-encapsulation (see: Section 4.5.4), 598 PREFLEN and R are instead copied from the SEAL header of the re- 599 encapsulated packet. This implies that the PREFLEN and R values are 600 propagated across a chain of ITE/ETEs that must all be authorized to 601 represent the prefix.) 603 The ITE next sets C=0 and sets A=1 if an explicit acknowledgement is 604 required from the ETE (see: Section 4.4.6). 
The ITE then sets 605 LINK_ID to the value assigned to the underlying link and sets PKT_ID 606 to a monotonically-increasing integer value, beginning with the value 607 0 in the first packet transmitted. 609 The ITE finally sets the Checksum field to 0, calculates the Checksum 610 over the first 128 bytes of the packet beginning with the SEAL header 611 and leading portion of the inner packet, then writes the value in the 612 Checksum field. (If there are fewer than 128 bytes, the Checksum is 613 calculated up to the end of the inner packet.) The Checksum is 614 calculated using an algorithm agreed on by the ITE and ETE. The 615 algorithm uses a shared secret key so that the ETE can verify that 616 the Checksum was generated by the ITE. 618 4.4.5. Outer Encapsulation 620 Following SEAL encapsulation, the ITE next encapsulates the packet in 621 the requisite outer headers and trailers according to the specific 622 encapsulation format (e.g., [RFC1070], [RFC2003], [RFC2473], 623 [RFC4213], etc.), except that it writes 'SEAL_PROTO' in the protocol 624 field of the outer IP header (when simple IP encapsulation is used) 625 or writes 'SEAL_PORT' in the outer destination transport service port 626 field (e.g., when IP/UDP encapsulation is used). 628 When UDP encapsulation is used, the ITE sets the UDP header fields as 629 specified in Section 5.5.4 of [I-D.templin-intarea-vet]. The ITE 630 then performs outer IP header encapsulation as specified in Section 631 5.5.5 of [I-D.templin-intarea-vet]. If this process is entered via 632 re-encapsulation (see: Section 4.5.4), the ITE instead follows the 633 outer IP/UDP re-encapsulation procedures specified in Section 5.5.6 634 of [I-D.templin-intarea-vet]. 636 When IPv4 is used as the outer encapsulation layer, the ITE finally 637 sets the DF flag in the IPv4 header of each segment. If the path to 638 the ETE correctly implements IP fragmentation (see: Section 4.4.6), 639 the ITE sets DF=0; otherwise, it sets DF=1.
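Returning to the Checksum procedure of Section 4.4.4, the steps (zero the field, cover at most the first 128 bytes starting at the SEAL header, use a keyed algorithm) can be sketched as follows. This is a non-normative illustration under stated assumptions: the specification leaves the algorithm to agreement between the ITE and ETE, so truncated HMAC-SHA-256 is used here only as a stand-in, and the offset and width of the Checksum field (csum_off, csum_len) are hypothetical parameters rather than values taken from Section 4.3.

```python
import hashlib
import hmac

COVERAGE = 128  # number of leading bytes covered, per Section 4.4.4


def seal_checksum(packet: bytes, key: bytes,
                  csum_off: int, csum_len: int) -> bytes:
    """Keyed checksum over the leading bytes of `packet`.

    `packet` starts at the SEAL header and continues into the inner
    packet. ASSUMPTION: HMAC-SHA-256 truncated to csum_len bytes stands
    in for whatever algorithm the ITE and ETE have agreed on.
    """
    if len(packet) < csum_off + csum_len:
        raise ValueError("packet too short to hold the Checksum field")
    covered = bytearray(packet[:COVERAGE])  # shorter packets: cover to the end
    covered[csum_off:csum_off + csum_len] = bytes(csum_len)  # field set to 0
    return hmac.new(key, bytes(covered), hashlib.sha256).digest()[:csum_len]


def write_checksum(packet: bytes, key: bytes,
                   csum_off: int, csum_len: int) -> bytes:
    """ITE side: return a copy of `packet` with the Checksum filled in."""
    value = seal_checksum(packet, key, csum_off, csum_len)
    return packet[:csum_off] + value + packet[csum_off + csum_len:]


def verify_checksum(packet: bytes, key: bytes,
                    csum_off: int, csum_len: int) -> bool:
    """ETE side: recompute and compare in constant time."""
    expected = seal_checksum(packet, key, csum_off, csum_len)
    return hmac.compare_digest(expected, packet[csum_off:csum_off + csum_len])
```

Because only the leading 128 bytes are covered, a change beyond that point is not detected by this check, which matches the header-only integrity scope stated in the Security Considerations.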
641 When IPv6 is used as the outer encapsulation layer, the "DF" flag is 642 absent but implicitly set to 1. The packet therefore will not be 643 fragmented within the subnetwork, since IPv6 deprecates in-the- 644 network fragmentation. 646 Following outer encapsulation, the ITE sends each outer packet via 647 the underlying link corresponding to LINK_ID. 649 4.4.6. Probing Strategy 651 When IPv4 is used as the outer encapsulation layer, the ITE can 652 perform a qualification exchange over an underlying link to determine 653 whether the subnetwork path to the ETE correctly implements IP 654 fragmentation. This procedure could be employed, e.g., to determine 655 whether there are any middleboxes on the path that violate the 656 [RFC1812], Section 5.2.6 requirement that: "A router MUST NOT 657 reassemble any datagram before forwarding it". 659 To perform this qualification, the ITE prepares a SEAL Neighbor 660 Solicitation (SNS) message as specified in [I-D.templin-intarea-vet], 661 then splits the packet into two outer IP fragments and sends both 662 fragments to the ETE over the same underlying link. If the ETE 663 returns an SPTB message with non-zero MTU (see Section 4.6.1.1), then 664 the subnetwork path correctly implements IP fragmentation. If the 665 ETE instead returns a SEAL Neighbor Advertisement (SNA) message, 666 however, then a middlebox in the subnetwork is reassembling the IP 667 fragments before they are delivered to the ETE (i.e., in violation of 668 [RFC1812]). 670 In addition to any control plane probing, all SEAL encapsulated data 671 packets sent by the ITE are considered implicit probes. SEAL data 672 packets that use IPv4 as the outer layer of encapsulation with DF=0 673 will elicit SPTB messages from the ETE if any IPv4 fragmentation 674 occurs in the path.
SEAL data packets that use either IPv6 or IPv4 675 with DF=1 as the outer layer of encapsulation may be dropped by a 676 router on the path to the ETE which will return a PTB message of the 677 appropriate outer IP protocol to the ITE. 679 If the PTB message includes enough information (see Section 4.4.7), 680 the ITE can then use the identifying information in the SEAL header 681 along with the addresses within the packet-in-error to determine 682 whether the message corresponds to one of its recent SEAL data packet 683 transmissions. If the previous hop toward the inner source address 684 within the packet-in-error is reached via the same tunnel interface 685 the SEAL data packet was sent on, the ITE translates the PTB into an 686 SPTB message and forwards it to the previous hop. Otherwise, the ITE 687 translates the message into a PTB appropriate for the inner header 688 and forwards it to the inner source address. 690 The ITE should also send explicit probes, periodically, to verify 691 that the ETE is still reachable. The ITE sets A=1 in the SEAL header 692 of a packet to be used as an explicit probe. The probe will elicit 693 an SPTB message from the ETE as an acknowledgement (see Section 694 4.6.1.1). The ITE can also send an SNS message to elicit an SNA 695 response from the ETE when there are no convenient data packets to 696 use as explicit probes. 698 4.4.7. Processing ICMP Messages 700 When the ITE sends SEAL data packets, it may receive raw ICMP error 701 messages [RFC0792][RFC4443] from either the ETE or from routers 702 within the subnetwork. The ICMP messages include an outer IP header, 703 followed by an ICMP header, followed by a portion of the SEAL data 704 packet that generated the error (also known as the "packet-in-error") 705 beginning with the outer IP header. 
707 The ITE can use the identifying information in the SEAL header along 708 with the source and destination addresses within the packet-in-error 709 to confirm that the ICMP message came from either the ETE or an on- 710 path router, and can use any additional information to determine 711 whether to accept or discard the message. 713 The ITE should specifically process raw ICMPv4 Protocol Unreachable 714 messages and ICMPv6 Parameter Problem messages with Code 715 "Unrecognized Next Header type encountered" as a hint that the ETE 716 does not implement the SEAL protocol. The ITE can also process other 717 raw ICMPv4 messages as a hint that the path to the ETE may be 718 failing. Specific actions that the ITE may take in these cases are 719 out of scope. 721 4.5. ETE Specification 723 4.5.1. Tunnel Interface Soft State 725 The ETE maintains a per-ITE checksum calculation algorithm and secret 726 key to verify the Checksum in the SEAL header. 728 4.5.2. Reassembly Buffer Requirements 730 The ETE observes the minimum reassembly buffer sizes specified for 731 IPv4 [RFC0791] and IPv6 [RFC2460]. 733 4.5.3. IP-Layer Reassembly 735 If the SEAL data packet did not undergo outer IP fragmentation, the 736 ETE submits it for decapsulation as specified in Section 4.5.4. 737 Otherwise, the ETE submits each IP fragment for reassembly. 739 The ETE should maintain conservative IP-layer reassembly cache high- 740 and low-water marks. When the size of the reassembly cache exceeds 741 this high-water mark, the ETE should actively discard incomplete 742 reassemblies (e.g., using an Active Queue Management (AQM) strategy) 743 until the size falls below the low-water mark. The ETE should also 744 actively discard any pending reassemblies that clearly have no 745 opportunity for completion, e.g., when a considerable number of new 746 fragments have arrived before a fragment that completes a pending 747 reassembly arrives. 
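The high- and low-water mark behavior described above can be sketched as follows. This is a non-normative illustration only: the class name ReassemblyCache is hypothetical, the cache size is measured in bytes of buffered fragment data, and discarding the oldest incomplete reassembly first is an assumed policy (the text requires only that some active discard strategy be used).

```python
from collections import OrderedDict


class ReassemblyCache:
    """Incomplete reassemblies bounded by high- and low-water marks."""

    def __init__(self, high_water: int, low_water: int):
        if low_water >= high_water:
            raise ValueError("low-water mark must be below high-water mark")
        self.high = high_water
        self.low = low_water
        self.size = 0                 # total bytes of buffered fragments
        self.pending = OrderedDict()  # reassembly key -> fragments, oldest first

    def add_fragment(self, key, fragment: bytes) -> None:
        """Buffer one fragment; prune if the high-water mark is exceeded."""
        self.pending.setdefault(key, []).append(fragment)
        self.size += len(fragment)
        if self.size > self.high:
            self._prune()

    def _prune(self) -> None:
        # ASSUMED policy: discard the oldest incomplete reassemblies
        # until the cache size falls below the low-water mark.
        while self.pending and self.size >= self.low:
            _, fragments = self.pending.popitem(last=False)
            self.size -= sum(len(f) for f in fragments)
```

A key here would typically be derived from the outer (source, destination, protocol, identification) values of the fragments, which is likewise an assumption of the sketch.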
749 The ETE gathers the outer IP fragments of a fragmented SEAL packet 750 until it has received enough initial fragments to include the first 751 128 bytes of the SEAL packet beyond the outer headers beginning with 752 the SEAL header (or up to the end of the packet if the packet itself 753 includes less than 128 bytes). Using this leading portion of the 754 (partially) reassembled SEAL packet, the ETE then verifies the SEAL 755 header Checksum. If the Checksum is correct, the ETE sends an SPTB 756 message back to the ITE (see Section 4.6.1.1). 758 Whether or not the Checksum was correct, the ETE then discards all IP 759 fragments of the fragmented SEAL packet (i.e., it does not submit the 760 reassembled packet for decapsulation). 762 4.5.4. Decapsulation and Re-Encapsulation 764 The ETE next checks the SEAL header of the (unfragmented) SEAL 765 packet. If the PKT_ID is not within the window of acceptable next 766 PKT_ID values from this ITE, or if the SEAL header includes an 767 incorrect Checksum value, the ETE silently drops the packet. 768 Otherwise, if the packet has an incorrect value in other SEAL header 769 fields the ETE discards the packet and returns an SCMP "Parameter 770 Problem" (SPP) message (see Section 4.6.1.2). Finally, if the SEAL 771 header has A=1 the ETE sends an SPTB message with MTU=0 back to the 772 ITE (see Section 4.6.1.1). 774 Next, the ETE processes the inner packet according to the header type 775 indicated in the SEAL NEXTHDR field. If the next hop toward the 776 destination address of the inner packet will be via a different 777 interface than the SEAL packet arrived on, the ETE discards the outer 778 headers and delivers the inner packet either to the local host or to 779 the next hop interface if the packet is not destined to the local 780 host. 
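The order of the SEAL header checks that open Section 4.5.4 can be sketched as follows. This is a non-normative illustration under stated assumptions: the function names are hypothetical, PKT_ID is assumed to be a 32-bit field, and the window is assumed to be a contiguous range starting at the next expected PKT_ID and compared modulo the field size. The specification text above does not define the window in this detail.

```python
PKT_ID_BITS = 32  # ASSUMED width of the PKT_ID field


def in_window(pkt_id: int, next_expected: int, window: int) -> bool:
    """True if pkt_id is in [next_expected, next_expected + window), modulo wrap."""
    return (pkt_id - next_expected) % (1 << PKT_ID_BITS) < window


def ete_header_actions(pkt_id: int, next_expected: int, window: int,
                       checksum_ok: bool, fields_ok: bool, a_flag: bool) -> list:
    """Return the ETE's actions for one unfragmented SEAL packet, in order."""
    # Unacceptable PKT_ID or bad Checksum: drop without any response.
    if not in_window(pkt_id, next_expected, window) or not checksum_ok:
        return ["silently drop"]
    # Checksum is good but another header field is wrong: report it.
    if not fields_ok:
        return ["discard", "send SPP"]
    actions = []
    # A=1 requests an explicit acknowledgement.
    if a_flag:
        actions.append("send SPTB with MTU=0")
    actions.append("process inner packet per NEXTHDR")
    return actions
```

Checking the PKT_ID window and Checksum before anything else means that packets failing authentication never trigger a response from the ETE.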
782 If the next hop is on the same interface the SEAL packet arrived on, 783 however, the ETE submits the inner packet for SEAL re-encapsulation 784 beginning with the specification in Section 4.4.3 above. 786 4.6. The SEAL Control Message Protocol (SCMP) 788 SEAL provides a companion SEAL Control Message Protocol (SCMP) that 789 uses the same message types and formats as for the Internet Control 790 Message Protocol for IPv6 (ICMPv6) [RFC4443]. When the TE prepares 791 an SCMP message, it sets the Type and Code fields to the same values 792 that would appear in the corresponding ICMPv6 message, then 793 calculates the SCMP message header checksum. The TE then formats the 794 Message Body the same as for the corresponding ICMPv6 message. The 795 TE then encapsulates the SCMP message in the SEAL header as well as 796 the outer headers and trailers as shown in Figure 3: 798 +--------------------+ 799 ~ outer IP header ~ 800 +--------------------+ 801 ~ other outer hdrs ~ 802 +--------------------+ 803 ~ SEAL Header ~ 804 +--------------------+ +--------------------+ 805 ~ SCMP message header~ --> ~ SCMP message header~ 806 +--------------------+ --> +--------------------+ 807 ~ SCMP message body ~ --> ~ SCMP message body ~ 808 +--------------------+ --> +--------------------+ 809 ~ outer trailers ~ 810 SCMP Message +--------------------+ 811 before encapsulation 812 SCMP Packet 813 after encapsulation 815 Figure 3: SCMP Message Encapsulation 817 The following sections specify the generation, processing and 818 relaying of SCMP messages. 820 4.6.1. 
Generating SCMP Error Messages 822 ETEs generate SCMP error messages in response to receiving certain 823 SEAL data packets using the format shown in Figure 4: 825 0 1 2 3 826 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 827 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 828 | Type | Code | Checksum | 829 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 830 | Type-Specific Data | 831 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 832 | As much of invoking SEAL data packet as | 833 ~ possible (beginning immediately after the SEAL header) ~ 834 | without the SCMP packet exceeding 576 bytes (*) | 836 (*) also known as the "packet-in-error" 838 Figure 4: SCMP Error Message Format 840 The error message includes the 4 byte SCMP message header, followed 841 by a 4 byte Type-Specific Data field, followed by the leading portion 842 of the invoking SEAL data packet (beginning immediately after the 843 SEAL header) as the "packet-in-error". The packet-in-error includes 844 as much of the leading portion of the invoking SEAL data packet as 845 possible extending to a length that would not cause the entire SCMP 846 packet following outer encapsulation to exceed 576 bytes. 848 When the ETE processes a SEAL data packet for which the SEAL header 849 Checksum is correct but an error must be returned, it prepares an 850 SCMP error message as shown in Figure 4. The ETE sets the Type and 851 Code fields in the SCMP header according to the appropriate error 852 message type, fills out the Type-Specific Data field and includes the 853 packet-in-error. The ETE then calculates the SCMP message checksum 854 the same as specified for ICMPv6, except that the checksum begins 855 with the SCMP message header, i.e., it does not cover a pseudo-header 856 of the outer header. The ETE writes the checksum value in the SCMP message 857 Checksum field.
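The SCMP error message layout of Figure 4 and its checksum can be sketched as follows. This is a non-normative illustration: the function names are hypothetical, and the parameter `overhead` (the combined length of the SEAL header and outer headers and trailers that will be added later) is an assumption used to keep the encapsulated SCMP packet within 576 bytes. The checksum is the 16-bit one's complement sum used by ICMPv6 [RFC4443], computed from the SCMP message header onward with no pseudo-header.

```python
import struct

SCMP_MAX_PACKET = 576  # size limit for the encapsulated SCMP packet
SPTB_TYPE = 2          # ICMPv6 "Packet Too Big" Type value [RFC4443]


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_scmp_error(msg_type: int, code: int, type_specific: int,
                     invoking: bytes, overhead: int) -> bytes:
    """Build an SCMP error message (before SEAL and outer encapsulation).

    `invoking` is the invoking SEAL data packet beginning immediately
    after its SEAL header; it is truncated to form the packet-in-error.
    """
    room = SCMP_MAX_PACKET - overhead - 8  # 8 = header + Type-Specific Data
    if room < 0:
        raise ValueError("overhead leaves no room for the SCMP message")
    # Checksum field is zero while the checksum is being computed.
    msg = struct.pack("!BBHI", msg_type, code, 0, type_specific) + invoking[:room]
    csum = internet_checksum(msg)
    return msg[:2] + struct.pack("!H", csum) + msg[4:]
```

A receiver can validate the message by running internet_checksum over the received SCMP message, Checksum field included; a result of 0 indicates a correct checksum.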
859 The ETE next encapsulates the SCMP message in the requisite SEAL and 860 outer headers as shown in Figure 3. During encapsulation, the ETE 861 sets the outer destination address/port numbers of the SCMP packet to 862 the outer source address/port numbers of the original SEAL data 863 packet and sets the outer source address/port numbers to its own 864 outer address/port numbers. 866 The ETE then sets (C=1; A=0; R=0; NEXTHDR=0) in the SEAL header, then 867 sets PREFLEN to 0 unless otherwise specified. If the neighbor 868 relationship between the ETE and the source ITE is unidirectional, 869 the ETE then writes random values in the LINK_ID and PKT_ID fields of 870 the SEAL header. If the neighbor relationship is bidirectional, the 871 ETE instead writes values appropriate to the bidirectional neighbor 872 state in the LINK_ID and PKT_ID fields. 874 The ETE then calculates and sets the SEAL header Checksum field the 875 same as specified for SEAL data packet encapsulation in Section 4.4.4. 876 Next, the ETE encapsulates the SCMP message in the requisite outer 877 headers the same as for SEAL data packets in Section 4.4.5. When 878 IPv4 is used as the outer layer of encapsulation, the ETE sets 879 DF=1 in the outer header unless the SCMP message is an SNS message 880 used for the path fragmentation qualification procedure described in 881 Section 4.4.6. The ETE then sends the resulting SCMP packet to the 882 ITE. 884 NB: A simplified implementation of this method entails creating a 885 copy of the original data packet, inserting the SCMP message header 886 and Type-Specific Data fields between the SEAL header and inner 887 headers, truncating the resulting message to 576 bytes if necessary, 888 then preparing the SEAL and outer header fields as described above. 890 The following sections describe additional considerations for various 891 SCMP error messages: 893 4.6.1.1.
Generating SCMP Packet Too Big (SPTB) Messages 895 An ETE generates an SCMP "Packet Too Big" (SPTB) message when it 896 receives the leading 128 bytes of a SEAL protocol packet that arrived 897 as multiple outer IP fragments. The ETE prepares the SPTB message 898 the same as for the corresponding ICMPv6 PTB message, and writes the 899 length of the outer IP first fragment (i.e., the fragment with MF=1 900 and Offset=0) in the MTU field of the message. 902 The ETE also generates an SPTB message when it receives an 903 unfragmented SEAL protocol data packet with A=1 in the SEAL header. 904 The ETE prepares the SPTB message the same as above, except that it 905 writes the value 0 in the MTU field. The message is therefore a 906 control plane acknowledgement of a data plane probe, and does not 907 signify a packet size restriction. 909 4.6.1.2. Generating Other SCMP Error Messages 911 An ETE generates an SCMP "Destination Unreachable" (SDU) message 912 under the same circumstances that an IPv6 system would generate an 913 ICMPv6 Destination Unreachable message. 915 An ETE generates an SCMP "Parameter Problem" (SPP) message when it 916 receives a SEAL packet with an incorrect value in the SEAL header. 918 TEs generate other SCMP message types using methods and procedures 919 specified in other documents. For example, SCMP message types used 920 for tunnel neighbor coordinations are specified in VET 921 [I-D.templin-intarea-vet]. 923 4.6.2. Processing SCMP Error Messages 925 For each SCMP error message it receives, the TE first verifies that 926 the outer addresses of the SCMP packet, the SEAL header Checksum, and 927 the SCMP message header checksum are correct. If the identifying 928 addresses and/or checksums are incorrect, the TE discards the 929 message; otherwise, it processes the message as follows: 931 4.6.2.1. 
Processing SCMP PTB Messages 933 After an ITE sends a SEAL data packet to an ETE, it may receive an 934 SPTB message with a packet-in-error containing the leading portion of 935 the inner packet (see: Section 4.6.1.1). If the SPTB message has 936 MTU=0, the ITE processes the message as confirmation that the ETE is 937 responsive and discards the message. If the SPTB message is the 938 response to a fragmented SNS message used for path qualification (see 939 Section 4.4.6), the ITE processes the message as a confirmation that 940 the path supports IP fragmentation. Otherwise, the ITE processes the 941 message as an indication of a packet size limitation. 943 If the MTU value is no less than 1280, the value is likely to 944 represent the true MTU of the restricting link on the path to the 945 ETE. If the MTU value is less than 1280, however, the ITE cannot 946 determine the true MTU due to the possibility that a router on the 947 path is generating runt first fragments. Instead, the ITE can 948 consult a plateau table (e.g., as described in [RFC1191]) to rewrite 949 the MTU value to a reduced size. For example, if the ITE receives an 950 SPTB message with MTU=256 and inner header length 1500, it can 951 rewrite the MTU to 1400. If the ITE subsequently receives an SPTB 952 message with MTU=256 and inner header length 1400, it can rewrite the 953 MTU to 1300, etc. 955 The ITE then checks its forwarding tables to determine the previous 956 hop on the reverse path toward the source address of the inner packet 957 in the packet-in-error. If the previous hop is reached over a 958 different interface than the SPTB message arrived on, and the inner 959 packet is not an IPv4 packet with DF=0, the ITE transcribes the 960 message into a format appropriate for the inner packet and sends the 961 resulting transcribed message to the original source. 
If the inner 962 packet is an IPv4 packet with DF=0, however, the ITE instead discards 963 the SPTB message and caches the MTU value as the fragmentation size 964 to use for fragmentation of future inner IPv4 packets destined to the 965 inner destination address (see Section 4.4.3). 967 If the previous hop is reached over the same tunnel interface that 968 the SPTB message arrived on, the ITE instead relays the message to 969 the previous hop. In order to relay the message, the ITE rewrites 970 the SEAL header fields with values corresponding to the previous hop. 971 Next, the ITE replaces the SPTB's outer headers with headers of the 972 appropriate protocol version and fills in the header fields as 973 specified in Sections 5.5.4-5.5.6 of [I-D.templin-intarea-vet], where 974 the destination address/port correspond to the previous hop and the 975 source address/port correspond to the ITE. The ITE then sends the 976 message to the previous hop the same as if it were issuing a new SPTB 977 message. 979 4.6.2.2. Processing Other SCMP Error Messages 981 An ITE may receive an SDU message with an appropriate code under the 982 same circumstances that an IPv6 node would receive an ICMPv6 983 Destination Unreachable message. The ITE relays the message toward 984 the source address of the inner packet within the packet-in-error the 985 same as specified for SPTB messages in Section 4.6.2.1. 987 An ITE may receive an SPP message when the ETE receives a SEAL packet 988 with an incorrect value in the SEAL header. The ITE should examine 989 the incorrect SEAL header field setting to determine whether a 990 different setting should be used in subsequent packets, but does not 991 relay the message further. 993 TEs process other SCMP message types using methods and procedures 994 specified in other documents. For example, SCMP message types used 995 for tunnel neighbor coordinations are specified in VET 996 [I-D.templin-intarea-vet]. 998 5. 
Link Requirements 1000 Subnetwork designers are expected to follow the recommendations in 1001 Section 2 of [RFC3819] when configuring link MTUs. 1003 6. End System Requirements 1005 SEAL ensures that tunnels return the necessary path MTU discovery 1006 control messages. However, end systems are strongly encouraged to 1007 also implement their own end-to-end MTU assurance, e.g., using 1008 Packetization Layer Path MTU Discovery per [RFC4821]. 1010 7. Router Requirements 1012 IPv4 routers within the subnetwork are strongly encouraged to 1013 implement IPv4 fragmentation such that the first fragment is the 1014 largest and approximately the size of the underlying link MTU, i.e., 1015 they should not generate runt first fragments. 1017 IPv6 routers within the subnetwork are required to generate the 1018 necessary PTB messages when they drop outer IPv6 packets due to an 1019 MTU restriction. 1021 8. IANA Considerations 1023 The IANA is instructed to allocate an IP protocol number for 1024 'SEAL_PROTO' in the 'protocol-numbers' registry. 1026 The IANA is instructed to allocate a Well-Known Port number for 1027 'SEAL_PORT' in the 'port-numbers' registry. 1029 The IANA is instructed to establish a "SEAL Protocol" registry to 1030 record SEAL Version values. This registry should be initialized to 1031 include the initial SEAL Version number, i.e., Version 0. 1033 9. Security Considerations 1035 SEAL provides a segment-by-segment data origin authentication and 1036 anti-replay service across the multiple segments of a re- 1037 encapsulating tunnel. It further provides a segment-by-segment 1038 integrity check of the headers of encapsulated packets, but does not 1039 verify the integrity of the rest of the packet beyond the headers. 1040 SEAL therefore considers full message integrity checking as an end- 1041 to-end consideration, and is therefore compatible with end-to-end 1042 securing mechanisms such as TLS/SSL [RFC5246]. 
1044 An amplification/reflection attack is possible when an attacker sends 1045 IP first fragments with spoofed source addresses to an ETE in an 1046 attempt to generate a stream of SCMP messages returned to a victim 1047 ITE. The SEAL header Checksum as well as the inner headers of the 1048 packet-in-error provide mitigation for the ETE to detect and discard 1049 SEAL segments with spoofed source addresses. 1051 The SEAL header is sent in-the-clear the same as for the outer IP and 1052 other outer headers. In this respect, the threat model is no 1053 different than for IPv6 extension headers. Unlike IPv6 extension 1054 headers, however, the SEAL header is protected by an integrity check 1055 that also covers the inner packet headers. 1057 Security issues that apply to tunneling in general are discussed in 1058 [RFC6169]. 1060 10. Related Work 1062 Section 3.1.7 of [RFC2764] provides a high-level sketch for 1063 supporting large tunnel MTUs via a tunnel-level segmentation and 1064 reassembly capability to avoid IP level fragmentation. This 1065 capability was implemented in the first edition of SEAL, but is now 1066 deprecated. 1068 Section 3 of [RFC4459] describes inner and outer fragmentation at the 1069 tunnel endpoints as alternatives for accommodating the tunnel MTU. 1071 Section 4 of [RFC2460] specifies a method for inserting and 1072 processing extension headers between the base IPv6 header and 1073 transport layer protocol data. The SEAL header is inserted and 1074 processed in exactly the same manner. 1076 IPsec/AUTH [RFC4301][RFC4302] is used for full message integrity 1077 verification between tunnel endpoints, whereas SEAL only ensures 1078 integrity for the inner packet headers. The AYIYA proposal 1079 [I-D.massar-v6ops-ayiya] uses similar means for providing full 1080 message authentication and integrity.
1082 The concepts of path MTU determination through the report of 1083 fragmentation and extending the IP Identification field were first 1084 proposed in deliberations of the TCP-IP mailing list and the Path MTU 1085 Discovery Working Group (MTUDWG) during the late 1980's and early 1086 1990's. An historical analysis of the evolution of these concepts, 1087 as well as the development of the eventual path MTU discovery 1088 mechanism for IP, appears in Appendix D of this document. 1090 11. Acknowledgments 1092 The following individuals are acknowledged for helpful comments and 1093 suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Oliver 1094 Bonaventure, Teco Boot, Bob Braden, Brian Carpenter, Steve Casner, 1095 Ian Chakeres, Noel Chiappa, Remi Denis-Courmont, Remi Despres, Ralph 1096 Droms, Aurnaud Ebalard, Gorry Fairhurst, Washam Fan, Dino Farinacci, 1097 Joel Halpern, Sam Hartman, John Heffner, Thomas Henderson, Bob 1098 Hinden, Christian Huitema, Eliot Lear, Darrel Lewis, Joe Macker, Matt 1099 Mathis, Erik Nordmark, Dan Romascanu, Dave Thaler, Joe Touch, Mark 1100 Townsley, Ole Troan, Margaret Wasserman, Magnus Westerlund, Robin 1101 Whittle, James Woodyatt, and members of the Boeing Research & 1102 Technology NST DC&NT group. 1104 Discussions with colleagues following the publication of RFC5320 have 1105 provided useful insights that have resulted in significant 1106 improvements to this, the Second Edition of SEAL. 1108 Path MTU determination through the report of fragmentation was first 1109 proposed by Charles Lynn on the TCP-IP mailing list in 1987. 1110 Extending the IP identification field was first proposed by Steve 1111 Deering on the MTUDWG mailing list in 1989. 1113 12. References 1115 12.1. Normative References 1117 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1118 September 1981. 1120 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, 1121 RFC 792, September 1981. 
1123 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1124 Requirement Levels", BCP 14, RFC 2119, March 1997. 1126 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1127 (IPv6) Specification", RFC 2460, December 1998. 1129 [RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure 1130 Neighbor Discovery (SEND)", RFC 3971, March 2005. 1132 [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control 1133 Message Protocol (ICMPv6) for the Internet Protocol 1134 Version 6 (IPv6) Specification", RFC 4443, March 2006. 1136 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 1137 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 1138 September 2007. 1140 12.2. Informative References 1142 [FOLK] Shannon, C., Moore, D., and k. claffy, "Beyond Folklore: 1143 Observations on Fragmented Traffic", December 2002. 1145 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 1146 October 1987. 1148 [I-D.ietf-intarea-ipv4-id-update] 1149 Touch, J., "Updated Specification of the IPv4 ID Field", 1150 draft-ietf-intarea-ipv4-id-update-04 (work in progress), 1151 September 2011. 1153 [I-D.ietf-savi-framework] 1154 Wu, J., Bi, J., Bagnulo, M., Baker, F., and C. Vogt, 1155 "Source Address Validation Improvement Framework", 1156 draft-ietf-savi-framework-05 (work in progress), 1157 July 2011. 1159 [I-D.massar-v6ops-ayiya] 1160 Massar, J., "AYIYA: Anything In Anything", 1161 draft-massar-v6ops-ayiya-02 (work in progress), July 2004. 1163 [I-D.templin-aero] 1164 Templin, F., "Asymmetric Extended Route Optimization 1165 (AERO)", draft-templin-aero-04 (work in progress), 1166 October 2011. 1168 [I-D.templin-intarea-vet] 1169 Templin, F., "Virtual Enterprise Traversal (VET)", 1170 draft-templin-intarea-vet-27 (work in progress), 1171 October 2011. 1173 [I-D.templin-ironbis] 1174 Templin, F., "The Internet Routing Overlay Network 1175 (IRON)", draft-templin-ironbis-06 (work in progress), 1176 October 2011. 
1178 [MTUDWG] "IETF MTU Discovery Working Group mailing list, 1179 gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log, November 1180 1989 - February 1995.". 1182 [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP 1183 MTU discovery options", RFC 1063, July 1988. 1185 [RFC1070] Hagens, R., Hall, N., and M. Rose, "Use of the Internet as 1186 a subnetwork for experimentation with the OSI network 1187 layer", RFC 1070, February 1989. 1189 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 1190 November 1990. 1192 [RFC1812] Baker, F., "Requirements for IP Version 4 Routers", 1193 RFC 1812, June 1995. 1195 [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery 1196 for IP version 6", RFC 1981, August 1996. 1198 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, 1199 October 1996. 1201 [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in 1202 IPv6 Specification", RFC 2473, December 1998. 1204 [RFC2675] Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms", 1205 RFC 2675, August 1999. 1207 [RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A. 1208 Malis, "A Framework for IP Based Virtual Private 1209 Networks", RFC 2764, February 2000. 1211 [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", 1212 RFC 2923, September 2000. 1214 [RFC3232] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by 1215 an On-line Database", RFC 3232, January 2002. 1217 [RFC3366] Fairhurst, G. and L. Wood, "Advice to link designers on 1218 link Automatic Repeat reQuest (ARQ)", BCP 62, RFC 3366, 1219 August 2002. 1221 [RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D., 1222 Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. 1223 Wood, "Advice for Internet Subnetwork Designers", BCP 89, 1224 RFC 3819, July 2004. 1226 [RFC4191] Draves, R. and D. Thaler, "Default Router Preferences and 1227 More-Specific Routes", RFC 4191, November 2005. 1229 [RFC4213] Nordmark, E. and R. 
Gilligan, "Basic Transition Mechanisms 1230 for IPv6 Hosts and Routers", RFC 4213, October 2005. 1232 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 1233 Internet Protocol", RFC 4301, December 2005. 1235 [RFC4302] Kent, S., "IP Authentication Header", RFC 4302, 1236 December 2005. 1238 [RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through 1239 Network Address Translations (NATs)", RFC 4380, 1240 February 2006. 1242 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1243 Network Tunneling", RFC 4459, April 2006. 1245 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 1246 Discovery", RFC 4821, March 2007. 1248 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 1249 Errors at High Data Rates", RFC 4963, July 2007. 1251 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 1252 Mitigations", RFC 4987, August 2007. 1254 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 1255 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 1257 [RFC5445] Watson, M., "Basic Forward Error Correction (FEC) 1258 Schemes", RFC 5445, March 2009. 1260 [RFC5720] Templin, F., "Routing and Addressing in Networks with 1261 Global Enterprise Recursion (RANGER)", RFC 5720, 1262 February 2010. 1264 [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. 1266 [RFC6139] Russert, S., Fleischman, E., and F. Templin, "Routing and 1267 Addressing in Networks with Global Enterprise Recursion 1268 (RANGER) Scenarios", RFC 6139, February 2011. 1270 [RFC6169] Krishnan, S., Thaler, D., and J. Hoagland, "Security 1271 Concerns with IP Tunneling", RFC 6169, April 2011. 1273 [SIGCOMM] Luckie, M. and B. Stasiewicz, "Measuring Path MTU 1274 Discovery Behavior", November 2010. 1276 [TBIT] Medina, A., Allman, M., and S. Floyd, "Measuring 1277 Interactions Between Transport Protocols and Middleboxes", 1278 October 2004. 
1280     [TCP-IP]   "Archive/Hypermail of Early TCP-IP Mail List,
1281                http://www-mice.cs.ucl.ac.uk/multimedia/misc/tcp_ip/, May
1282                1987 - May 1990.".

1284     [WAND]     Luckie, M., Cho, K., and B. Owens, "Inferring and
1285                Debugging Path MTU Discovery Failures", October 2005.

1287  Appendix A.  Reliability

1289     Although a SEAL tunnel may span an arbitrarily-large subnetwork
1290     expanse, the IP layer sees the tunnel as a simple link that supports
1291     the IP service model.  Links with high bit error rates (BERs) (e.g.,
1292     IEEE 802.11) use Automatic Repeat-ReQuest (ARQ) mechanisms [RFC3366]
1293     to increase packet delivery ratios, while links with much lower BERs
1294     typically omit such mechanisms.  Since SEAL tunnels may traverse
1295     arbitrarily-long paths over links of various types that are already
1296     either performing or omitting ARQ as appropriate, it would therefore
1297     often be inefficient to require the tunnel endpoints to also
1298     perform ARQ.

1300  Appendix B.  Integrity

1302     The SEAL header includes a Checksum field that covers the SEAL header
1303     and at least the inner packet headers.  This provides for header
1304     integrity verification on a segment-by-segment basis for a segmented
1305     re-encapsulating tunnel path.

1307     Fragmentation and reassembly schemes must consider packet-splicing
1308     errors, e.g., when two fragments from the same packet are
1309     concatenated incorrectly, when a fragment from packet X is
1310     reassembled with fragments from packet Y, etc.  The primary sources
1311     of such errors include implementation bugs and wrapping IP ID fields.
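
The splicing error that arises from a wrapping IP ID field can be illustrated with a minimal sketch. It is not part of the SEAL specification and does not model any particular IP stack. The names `seconds_until_wrap` and `NaiveReassembler`, the addresses, the protocol number, and the assumed load of 1 Gbps with 1500-byte packets are all hypothetical choices made for illustration.

```python
from collections import defaultdict

ID_SPACE = 1 << 16  # number of distinct 16-bit IPv4 ID values (64K)


def seconds_until_wrap(packets_per_second):
    """Time for one (src, dst, protocol) flow to cycle through every ID."""
    return ID_SPACE / packets_per_second


class NaiveReassembler:
    """Hypothetical reassembler that matches fragments on the ID tuple only."""

    def __init__(self):
        # (src, dst, protocol, ip_id) -> {fragment index: payload}
        self.partial = defaultdict(dict)

    def receive(self, key, index, payload, total):
        """Store one fragment; return the packet once all pieces are held."""
        frags = self.partial[key]
        frags[index] = payload
        if len(frags) < total:
            return None
        del self.partial[key]
        return b"".join(frags[i] for i in sorted(frags))


# Assumed load: 1 Gbps of 1500-byte packets, about 83,333 packets per second.
wrap_time = seconds_until_wrap(1e9 / (1500 * 8))

reassembler = NaiveReassembler()
key = ("192.0.2.1", "192.0.2.2", 4, 7)  # illustrative addresses, protocol, ID

# Packet X: its first fragment arrives, its second fragment is lost.
first = reassembler.receive(key, 0, b"X0", total=2)
# The ID wraps and packet Y reuses ID 7; its second fragment arrives first.
spliced = reassembler.receive(key, 1, b"Y1", total=2)

print(f"ID space wraps after about {wrap_time:.3f} s")
print(first, spliced)
```

In this sketch the second call returns a packet built from X's first fragment and Y's second fragment, which is the mis-association this appendix describes. Under the assumed load the ID space is exhausted in well under one second, so a stale fragment can easily still be held when its ID is reused.
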
1313     In terms of wrapping ID fields, when IPv4 is used as the outer IP
1314     protocol, the 16-bit IP ID field can wrap with only 64K packets with
1315     the same (src, dst, protocol)-tuple alive in the system at a given
1316     time [RFC4963], increasing the likelihood of reassembly
1317     mis-associations.

1319     SEAL avoids reassembly mis-associations by unconditionally discarding
1320     any fragmented SEAL packets following reassembly.

1322  Appendix C.  Transport Mode

1324     SEAL can also be used in "transport-mode", e.g., when the inner layer
1325     comprises upper-layer protocol data rather than an encapsulated IP
1326     packet.  For instance, TCP peers can negotiate the use of SEAL (e.g.,
1327     by inserting a 'SEAL_OPTION' TCP option during connection
1328     establishment) for the carriage of protocol data encapsulated as
1329     IPv4/SEAL/TCP.  In this sense, the "subnetwork" becomes the entire
1330     end-to-end path between the TCP peers and may potentially span the
1331     entire Internet.

1333     If both TCPs agree on the use of SEAL, their protocol messages will
1334     be carried as IPv4/SEAL/TCP and the connection will be serviced by
1335     the SEAL protocol using TCP (instead of an encapsulating tunnel
1336     endpoint) as the transport layer protocol.  The SEAL protocol for
1337     transport mode otherwise observes the same specifications as in
1338     Section 4.

1340  Appendix D.  Historic Evolution of PMTUD

1342     The topic of Path MTU discovery (PMTUD) saw a flurry of discussion
1343     and numerous proposals in the late 1980's through early 1990.  The
1344     initial problem was posed by Art Berggreen on May 22, 1987 in a
1345     message to the TCP-IP discussion group [TCP-IP].  The discussion that
1346     followed provided significant reference material for [FRAG].  An IETF
1347     Path MTU Discovery Working Group [MTUDWG] was formed in late 1989
1348     with a charter to produce an RFC.  Several variations on a very few
1349     basic proposals were entertained, including:

1351     1.  Routers record the PMTU estimate in ICMP-like path probe
1352         messages (proposed in [FRAG] and later [RFC1063])

1354     2.  The destination reports any fragmentation that occurs for packets
1355         received with the "RF" (Report Fragmentation) bit set (Steve
1356         Deering's 1989 adaptation of Charles Lynn's Nov. 1987 proposal)

1358     3.  A hybrid combination of 1) and Charles Lynn's Nov. 1987 proposal
1359         (straw RFC draft by McCloghrie, Fox and Mogul on Jan 12, 1990)

1361     4.  Combination of the Lynn proposal with TCP (Fred Bohle, Jan 30,
1362         1990)

1364     5.  Fragmentation avoidance by setting "IP_DF" flag on all packets
1365         and retransmitting if ICMPv4 "fragmentation needed" messages
1366         occur (Geof Cooper's 1987 proposal; later adapted into [RFC1191]
1367         by Mogul and Deering).

1369     Option 1) seemed attractive to the group at the time, since it was
1370     believed that routers would migrate more quickly than hosts.  Option
1371     2) was a strong contender, but repeated attempts to secure an "RF"
1372     bit in the IPv4 header from the IESG failed and the proponents became
1373     discouraged.  Option 3) was abandoned because it was perceived as too
1374     complicated, and 4) never received any apparent serious
1375     consideration.  Proposal 5) was a late entry into the discussion from
1376     Steve Deering on Feb. 24th, 1990.  The discussion group soon
1377     thereafter seemingly lost track of all other proposals and adopted
1378     5), which eventually evolved into [RFC1191] and later [RFC1981].

1380     In retrospect, the "RF" bit postulated in 2) is not needed if a
1381     "contract" is first established between the peers, as in proposal 4)
1382     and a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU on
1383     Feb 19, 1990.  These proposals saw little discussion or rebuttal, and
1384     were dismissed based on the following assertions:

1386     o  routers upgrade their software faster than hosts

1388     o  PCs could not reassemble fragmented packets

1390     o  Proteon and Wellfleet routers did not reproduce the "RF" bit
1391        properly in fragmented packets

1393     o  Ethernet-FDDI bridges would need to perform fragmentation (i.e.,
1394        "translucent" not "transparent" bridging)

1396     o  the 16-bit IP_ID field could wrap around and disrupt reassembly at
1397        high packet arrival rates

1399     The first four assertions, although perhaps valid at the time, have
1400     been overcome by historical events.  The final assertion is addressed
1401     by the mechanisms specified in SEAL.

1403  Author's Address

1405     Fred L. Templin (editor)
1406     Boeing Research & Technology
1407     P.O. Box 3707
1408     Seattle, WA 98124
1409     USA

1411     Email: fltemplin@acm.org