Network Working Group                                    F. Templin, Ed.
Internet-Draft                              Boeing Research & Technology
Obsoletes: 5320 (if approved)                           October 18, 2013
Intended status: Informational
Expires: April 21, 2014

        The Subnetwork Encapsulation and Adaptation Layer (SEAL)
                   draft-templin-intarea-seal-64.txt

Abstract

This document specifies a Subnetwork Encapsulation and Adaptation
Layer (SEAL). SEAL operates over virtual topologies configured over
connected IP network routing regions bounded by encapsulating border
nodes. These virtual topologies are manifested by tunnels that may
span multiple IP and/or sub-IP layer forwarding hops, where they may
incur packet duplication, packet reordering, source address spoofing
and traversal of links with diverse Maximum Transmission Units
(MTUs). SEAL addresses these issues through the encapsulation and
messaging mechanisms specified in this document.

Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current
Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.
It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 21, 2014.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Table of Contents

1. Introduction
   1.1. Motivation
   1.2. Approach
   1.3. Differences with RFC5320
2. Terminology
3. Requirements
4. Applicability Statement
5. SEAL Specification
   5.1. SEAL Tunnel Model
   5.2. SEAL Model of Operation
   5.3. SEAL Encapsulation Format
   5.4. ITE Specification
      5.4.1. Tunnel MTU
      5.4.2. Tunnel Neighbor Soft State
      5.4.3. SEAL Layer Pre-Processing
      5.4.4. SEAL Encapsulation and Segmentation
      5.4.5. Outer Encapsulation
      5.4.6. Path Probing and ETE Reachability Verification
      5.4.7. Processing ICMP Messages
      5.4.8. IPv4 Middlebox Reassembly Testing
      5.4.9. Stateful MTU Determination
      5.4.10. Detecting Path MTU Changes
   5.5. ETE Specification
      5.5.1. Reassembly Buffer Requirements
      5.5.2. Tunnel Neighbor Soft State
      5.5.3. IP-Layer Reassembly
      5.5.4. Decapsulation, SEAL-Layer Reassembly, and
             Re-Encapsulation
   5.6. The SEAL Control Message Protocol (SCMP)
      5.6.1. Generating SCMP Messages
      5.6.2. Processing SCMP Messages
6. Link Requirements
7. End System Requirements
8. Router Requirements
9. Nested Encapsulation Considerations
10. Reliability Considerations
11. Integrity Considerations
12. IANA Considerations
13. Security Considerations
14. Related Work
15. Implementation Status
16. Acknowledgments
17. References
   17.1. Normative References
   17.2. Informative References
Author's Address

1. Introduction

As Internet technology and communication have grown and matured, many
techniques have been developed that use virtual topologies (manifested
by tunnels of one form or another) over an actual network that supports
the Internet Protocol (IP) [RFC0791][RFC2460]. Those virtual
topologies have elements that appear as one network layer hop, but
are actually multiple IP or sub-IP layer hops. These multiple hops
often have quite diverse properties that are often not even visible
to the endpoints of the virtual hop. This introduces failure modes
that are not dealt with well in current approaches.

The use of IP encapsulation (also known as "tunneling") has long been
considered as the means for creating such virtual topologies (e.g.,
see [RFC2003][RFC2473]). Tunnels serve a wide variety of purposes,
including mobility, security, routing control, traffic engineering,
multihoming, etc., and will remain an integral part of the
architecture moving forward. However, the encapsulation headers
often include insufficiently provisioned per-packet identification
values. IP encapsulation also allows an attacker to produce
encapsulated packets with spoofed source addresses even if the source
address in the encapsulating header cannot be spoofed. This presents
a denial-of-service vector that is not possible in non-tunneled
subnetworks.

Additionally, the insertion of an outer IP header reduces the
effective path MTU visible to the inner network layer. When IPv6 is
used as the encapsulation protocol, original sources expect to be
informed of the MTU limitation through IPv6 Path MTU discovery
(PMTUD) [RFC1981].
When IPv4 is used, this reduced MTU can be
accommodated through the use of IPv4 fragmentation, but unmitigated
in-the-network fragmentation has been found to be harmful through
operational experience and studies conducted over the course of many
years [FRAG][FOLK][RFC4963]. Additionally, classical IPv4 PMTUD
[RFC1191] has known operational issues that are exacerbated by
in-the-network tunnels [RFC2923][RFC4459].

The following subsections present further details on the motivation
and approach for addressing these issues.

1.1. Motivation

Before discussing the approach, it is necessary to first understand
the problems. In both the Internet and private-use networks today,
IP is ubiquitously deployed as the Layer 3 protocol. The primary
functions of IP are to provide for routing, addressing, and a
fragmentation and reassembly capability used to accommodate links
with diverse MTUs. While it is well known that the IP address space
is rapidly becoming depleted, there is also a growing awareness that
other IP protocol limitations have already become or may soon become
problematic.

First, the Internet historically provided no means for discerning
whether the source addresses of IP packets are authentic. This
shortcoming is being addressed more and more through the deployment
of site border router ingress filters [RFC2827]; however, the use of
encapsulation provides a vector for an attacker to circumvent
filtering for the encapsulated packet even if filtering is correctly
applied to the encapsulation header. Secondly, the IP header does
not include a well-behaved identification value unless the source has
included a fragment header for IPv6 or unless the source permits
fragmentation for IPv4. These limitations preclude an efficient
means for routers to detect duplicate packets and packets that have
been re-ordered within the subnetwork.
Additionally, recent studies
have shown that the arrival of fragments at high data rates can result
in denial-of-service (DoS) conditions on performance-sensitive
networking gear, prompting some administrators to configure their
equipment to drop fragments unconditionally
[I-D.taylor-v6ops-fragdrop].

For IPv4 encapsulation, when fragmentation is permitted the header
includes a 16-bit Identification field, meaning that at most 2^16
unique packets with the same (source, destination, protocol)-tuple
can be active in the network at the same time [RFC6864]. (When
middleboxes such as Network Address Translators (NATs) re-write the
Identification field to random values, the number of unique packets
is even further reduced.) Due to the escalating deployment of
high-speed links, however, these numbers have become too small by
several orders of magnitude for high data rate packet sources such as
tunnel endpoints [RFC4963].

Furthermore, there are many well-known limitations pertaining to IPv4
fragmentation and reassembly - even to the point that it has been
deemed "harmful" in both classic and modern-day studies (see above).
In particular, IPv4 fragmentation raises issues ranging from minor
annoyances (e.g., in-the-network router fragmentation [RFC1981]) to
the potential for major integrity issues (e.g., mis-association of
the fragments of multiple IP packets during reassembly [RFC4963]).

As a result of these perceived limitations, a fragmentation-avoiding
technique for discovering the MTU of the forward path from a source
to a destination node was devised through the deliberations of the
Path MTU Discovery Working Group (MTUDWG) during the late 1980s
through the early 1990s, which resulted in the publication of
[RFC1191].
In this negative feedback-based method, the source node provides
explicit instructions to routers in the path to discard the packet
and return an ICMP error message if an MTU restriction is
encountered. However, this approach has several serious shortcomings
that lead to an overall "brittleness" [RFC2923].

In particular, site border routers in the Internet have been known to
discard ICMP error messages coming from the outside world. This is
due in large part to the fact that malicious spoofing of error
messages in the Internet is trivial since there is no way to
authenticate the source of the messages [RFC5927]. Furthermore, when
a source node that depends on ICMP error message feedback for packets
dropped due to an MTU restriction does not receive the messages, a
path MTU-related black hole occurs. This means that the source will
continue to send packets that are too large and never receive an
indication from the network that they are being discarded. This
behavior has been confirmed through documented studies showing clear
evidence of PMTUD failures for both IPv4 and IPv6 in the Internet
today [TBIT][WAND][SIGCOMM][RIPE].

The issues with both IP fragmentation and this "classical" PMTUD
method are exacerbated further when IP tunneling is used [RFC4459].
For example, an ingress tunnel endpoint (ITE) may be required to
forward encapsulated packets into the subnetwork on behalf of
hundreds, thousands, or even more original sources. If the ITE
allows IP fragmentation on the encapsulated packets, persistent
fragmentation could lead to undetected data corruption due to
Identification field wrapping and/or reassembly congestion at the
ETE.
If the ITE instead uses classical IP PMTUD, it must rely on ICMP
error messages coming from the subnetwork that may be suspect,
subject to loss due to filtering middleboxes, or insufficiently
provisioned for translation into error messages to be returned to the
original sources.

Although recent work has led to the development of a positive
feedback-based end-to-end MTU determination scheme [RFC4821], this
does not excuse tunnels from accounting for the encapsulation overhead
they add to packets. Moreover, in current practice existing
tunneling protocols mask the MTU issues by selecting a "lowest common
denominator" MTU that may be much smaller than necessary for most
paths and difficult to change at a later date. Therefore, a new
approach to accommodate tunnels over links with diverse MTUs is
necessary.

1.2. Approach

This document concerns subnetworks manifested through a virtual
topology configured over a connected network routing region and
bounded by encapsulating border nodes. Example connected network
routing regions include Mobile Ad hoc Networks (MANETs), enterprise
networks and the global public Internet itself. Subnetwork border
nodes forward unicast and multicast packets over the virtual topology
across multiple IP and/or sub-IP layer forwarding hops that may
introduce packet duplication and/or traverse links with diverse
Maximum Transmission Units (MTUs).

This document introduces a Subnetwork Encapsulation and Adaptation
Layer (SEAL) for tunneling inner network layer protocol packets over
IP subnetworks that connect Ingress and Egress Tunnel Endpoints
(ITEs/ETEs) of border nodes. It provides a modular specification
designed to be tailored to specific associated tunneling protocols.
(A transport-mode of operation is also possible but out of scope for
this document.)

SEAL provides a mid-layer encapsulation that accommodates links with
diverse MTUs, and allows routers in the subnetwork to perform
efficient duplicate packet and packet reordering detection. The
encapsulation further ensures message origin authentication, packet
header integrity and anti-replay in environments in which these
functions are necessary.

SEAL treats tunnels that traverse the subnetwork as ordinary links
that must support network layer services. Moreover, SEAL provides
dynamic mechanisms (including limited segmentation and reassembly) to
ensure a maximal path MTU over the tunnel. This is in contrast to
static approaches which avoid MTU issues by selecting a lowest common
denominator MTU value that may be overly conservative for the vast
majority of tunnel paths and difficult to change even when larger
MTUs become available.

1.3. Differences with RFC5320

This specification of SEAL is descended from an experimental
independent RFC publication of the same name [RFC5320]. However,
this specification introduces a number of important differences from
the earlier publication.

First, this specification includes a protocol version field in the
SEAL header, whereas [RFC5320] does not and therefore cannot be
updated by future revisions. This specification therefore obsoletes
(i.e., does not update) [RFC5320].

Secondly, [RFC5320] forms a 32-bit Identification value by
concatenating the 16-bit IPv4 Identification field with a 16-bit
Identification "extension" field in the SEAL header. This means that
[RFC5320] can only operate over IPv4 networks (since IPv6 headers do
not include a 16-bit Identification field) and that the SEAL
Identification value can be corrupted if the Identification in the
outer IPv4 header is rewritten.
In contrast, this specification includes a 32-bit
Identification value that is independent of any identification fields
found in the inner or outer IP headers, and is therefore compatible
with any inner and outer IP protocol version combinations.

Additionally, the SEAL segmentation and reassembly procedures defined
in [RFC5320] differ significantly from those found in this
specification. In particular, this specification defines a 13-bit
Fragment Offset field that allows for smaller segment sizes when SEAL
segmentation is necessary. In contrast, [RFC5320] includes a 3-bit
Segment field and performs reassembly through concatenation of
consecutive segments.

This version of SEAL also includes an optional Integrity Check Vector
(ICV) that can be used to digitally sign the SEAL header and the
leading portion of the encapsulated inner packet. This allows for a
lightweight integrity check and a loose message origin authentication
capability. The header further includes new control bits as well as
a link identification and encapsulation level field for additional
control capabilities.

Finally, this version of SEAL includes a new messaging protocol known
as the SEAL Control Message Protocol (SCMP), whereas [RFC5320]
performs signalling through the use of SEAL-encapsulated ICMP
messages. The use of SCMP allows SEAL-specific departures from ICMP,
as well as a control messaging capability that extends to other
specifications, including Virtual Enterprise Traversal (VET)
[I-D.templin-intarea-vet].

2. Terminology

The following terms are defined within the scope of this document:

subnetwork
   a virtual topology configured over a connected network routing
   region and bounded by encapsulating border nodes.

IP
   used to generically refer to either Internet Protocol (IP)
   version, i.e., IPv4 or IPv6.

Ingress Tunnel Endpoint (ITE)
   a portal over which an encapsulating border node (host or router)
   sends encapsulated packets into the subnetwork.

Egress Tunnel Endpoint (ETE)
   a portal over which an encapsulating border node (host or router)
   receives encapsulated packets from the subnetwork.

SEAL Path
   a subnetwork path from an ITE to an ETE beginning with an
   underlying link of the ITE as the first hop. Note that, if the
   ITE's interface connection to the underlying link assigns multiple
   IP addresses, each address represents a separate SEAL path.

inner packet
   an unencapsulated network layer protocol packet (e.g., IPv4
   [RFC0791], OSI/CLNP [RFC0994], IPv6 [RFC2460], etc.) before any
   outer encapsulations are added. Internet protocol numbers that
   identify inner packets are found in the IANA Internet Protocol
   registry [RFC3232]. SEAL protocol packets that incur an
   additional layer of SEAL encapsulation are also considered inner
   packets.

outer IP packet
   a packet resulting from adding an outer IP header (and possibly
   other outer headers) to a SEAL-encapsulated inner packet.

packet-in-error
   the leading portion of an invoking data packet encapsulated in the
   body of an error control message (e.g., an ICMPv4 [RFC0792] error
   message, an ICMPv6 [RFC4443] error message, etc.).

Packet Too Big (PTB) message
   a control plane message indicating an MTU restriction (e.g., an
   ICMPv6 "Packet Too Big" message [RFC4443], an ICMPv4
   "Fragmentation Needed" message [RFC0792], etc.).

Don't Fragment (DF) bit
   a bit that indicates whether the packet may be fragmented by the
   network. The DF bit is explicitly included in the IPv4 header
   [RFC0791] and may be set to '0' to allow fragmentation or '1' to
   disallow further in-network fragmentation.
   The bit is absent from
   the IPv6 header [RFC2460], but implicitly set to '1' because
   fragmentation can occur only at IPv6 sources.

The following abbreviations correspond to terms used within this
document and/or elsewhere in common Internetworking nomenclature:

   HLEN - the length of the SEAL header plus outer headers
   ICV - Integrity Check Vector
   MAC - Message Authentication Code
   MTU - Maximum Transmission Unit
   SCMP - the SEAL Control Message Protocol
   SDU - SCMP Destination Unreachable message
   SPP - SCMP Parameter Problem message
   SPTB - SCMP Packet Too Big message
   SEAL - Subnetwork Encapsulation and Adaptation Layer
   TE - Tunnel Endpoint (i.e., either ingress or egress)
   VET - Virtual Enterprise Traversal

3. Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. When used
in lower case (e.g., must, must not, etc.), these words MUST NOT be
interpreted as described in [RFC2119], but are rather interpreted as
they would be in common English.

4. Applicability Statement

SEAL was originally motivated by the specific case of subnetwork
abstraction for Mobile Ad hoc Networks (MANETs); however, the domain
of applicability also extends to subnetwork abstractions over
enterprise networks, mobile networks, ISP networks, SO/HO networks,
the global public Internet itself, and any other connected network
routing region.

SEAL provides a network sublayer for encapsulation of an inner
network layer packet within outer encapsulating headers.
SEAL can
also be used as a sublayer within a transport layer protocol data
payload, where transport layer encapsulation is typically used for
Network Address Translator (NAT) traversal as well as operation over
subnetworks that give preferential treatment to certain "core"
Internet protocols, e.g., TCP, UDP, etc. (However, note that TCP
encapsulation may not be appropriate for all use cases, particularly
those that require low delay and/or delay variance.) The SEAL header
is processed in the same manner as for IPv6 extension headers, i.e.,
it is not part of the outer IP header but rather allows for the
creation of an arbitrarily extensible chain of headers in the same
way that IPv6 does.

To accommodate MTU diversity, the Ingress Tunnel Endpoint (ITE) may
need to perform limited segmentation which the Egress Tunnel Endpoint
(ETE) reassembles. The ETE further acts as a passive observer that
informs the ITE of any packet size limitations. This allows the ITE
to return appropriate PMTUD feedback even if the network path between
the ITE and ETE filters ICMP messages.

SEAL further provides mechanisms to ensure message origin
authentication, packet header integrity, and anti-replay. The SEAL
framework is therefore similar to the IP Security (IPsec)
Authentication Header (AH) [RFC4301][RFC4302]; however, it provides
only minimal hop-by-hop authenticating services while leaving full
data integrity, authentication and confidentiality services as an
end-to-end consideration.

In many aspects, SEAL also very closely resembles the Generic Routing
Encapsulation (GRE) framework [RFC1701]. SEAL can therefore be
applied in the same use cases that are traditionally addressed by
GRE, but goes beyond GRE to also provide additional capabilities
(e.g., path MTU accommodation, message origin authentication, etc.)
as described in this document.
The SEAL header is also exactly
analogous to the IPv6 Fragment Header, and in fact shares the same
format. SEAL can therefore re-use most existing code that implements
IPv6 fragmentation and reassembly.

In practice, SEAL is typically used as an encapsulation sublayer in
conjunction with existing tunnel types such as IPsec, GRE, IP-in-IPv6
[RFC2473], IP-in-IPv4 [RFC4213][RFC2003], etc. When used with
existing tunnel types that insert mid-layer headers between the inner
and outer IP headers (e.g., IPsec, GRE, etc.), the SEAL header is
inserted between the mid-layer headers and outer IP header.

5. SEAL Specification

The following sections specify the operation of SEAL:

5.1. SEAL Tunnel Model

SEAL is an encapsulation sublayer used within point-to-point,
point-to-multipoint, and non-broadcast, multiple access (NBMA) tunnels.
Each SEAL path is configured over one or more underlying interfaces
attached to subnetwork links. The SEAL tunnel connects an ITE to one
or more ETE "neighbors" via encapsulation across an underlying
subnetwork, where the tunnel neighbor relationship may be
bidirectional, partially unidirectional or fully unidirectional.

A bidirectional tunnel neighbor relationship is one over which both
TEs can exchange both data and control messages. A partially
unidirectional tunnel neighbor relationship allows the near end ITE
to send data packets forward to the far end ETE, while the far end
only returns control messages when necessary. Finally, a fully
unidirectional mode of operation is one in which the near end ITE can
receive neither data nor control messages from the far end ETE.

Implications of the SEAL bidirectional and unidirectional models are
the same as discussed in [I-D.templin-intarea-vet].

5.2. SEAL Model of Operation

SEAL-enabled ITEs encapsulate each inner packet in any ancillary
tunnel protocol headers, a SEAL header, any outer header
encapsulations and in some instances a SEAL trailer as shown in
Figure 1:

                                +--------------------+
                                ~   outer IP header  ~
                                +--------------------+
                                ~  other outer hdrs  ~
                                +--------------------+
                                ~    SEAL Header     ~
                                +--------------------+
                                ~other tunnel headers~
   +--------------------+       +--------------------+
   |                    |  -->  |                    |
   ~        Inner       ~  -->  ~        Inner       ~
   ~       Packet       ~  -->  ~       Packet       ~
   |                    |  -->  |                    |
   +--------------------+       +--------------------+
                                ~    SEAL Trailer    ~
                                +--------------------+

                      Figure 1: SEAL Encapsulation

The ITE inserts the SEAL header according to the specific tunneling
protocol. For simple encapsulation of an inner network layer packet
within an outer IP header, the ITE inserts the SEAL header following
the outer IP header and before the inner packet as: IP/SEAL/{inner
packet}.

For encapsulations over transports such as UDP, the ITE inserts the
SEAL header following the outer transport layer header and before the
inner packet, e.g., as IP/UDP/SEAL/{inner packet}. In that case, the
UDP header is seen as an "other outer header" as depicted in Figure 1
and the outer IP and transport layer headers are together seen as the
outer encapsulation headers. (Note that outer transport layer
headers such as UDP must sometimes be included to ensure that SEAL
packets will traverse the path to the ETE without loss due to
filtering middleboxes. The ETE MUST accept both IP/SEAL and
IP/UDP/SEAL as equivalent packets so that the ITE can discontinue
outer transport layer encapsulation if the path supports raw IP/SEAL
encapsulation.)

For SEAL encapsulations that involve other tunnel types (e.g., GRE,
IPsec, etc.)
the ITE inserts the SEAL header as a leading extension
to the other tunnel headers, i.e., the SEAL encapsulation appears as
part of the same tunnel and not a separate tunnel. For example, for
GRE the ITE inserts the SEAL header as IP/SEAL/GRE/{inner packet}, and
for IPsec the ITE inserts the SEAL header as
IP/SEAL/IPsec-header/{inner packet}/IPsec-trailer. In such cases,
SEAL considers the length of the inner packet only (i.e., and not the
other tunnel headers and trailers) when performing its packet size
calculations.

SEAL supports both "nested" tunneling and "re-encapsulating"
tunneling. Nested tunneling occurs when a first tunnel is
encapsulated within a second tunnel, which may then further be
encapsulated within additional tunnels. Nested tunneling can be
useful, and stands in contrast to "recursive" tunneling which is an
anomalous condition incurred due to misconfiguration or a routing
loop. Considerations for nested tunneling and avoiding recursive
tunneling are discussed in Section 4 of [RFC2473] as well as in
Section 9 of this document.

Re-encapsulating tunneling occurs when a packet arrives at a first
ETE, which then acts as an ITE to re-encapsulate and forward the
packet to a second ETE connected to the same subnetwork. In that
case each ITE/ETE transition represents a segment of a bridged path
between the ITE nearest the source and the ETE nearest the
destination. Considerations for re-encapsulating tunneling are
discussed in [I-D.templin-ironbis]. Combinations of nested and
re-encapsulating tunneling are also naturally supported by SEAL.

The SEAL ITE considers each underlying interface as the ingress
attachment point to a separate SEAL path to the ETE. The ITE
therefore may experience different path MTUs on different SEAL paths.
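
The header orderings described in this section can be summarized in a
short sketch. This is illustrative only: the function name seal_layers
and its parameters are hypothetical and are not defined by this
document, and the layers are represented as labels rather than real
headers.

```python
from typing import List, Optional


def seal_layers(udp: bool = False, tunnel: Optional[str] = None) -> List[str]:
    """Return the header order, outermost first (hypothetical helper).

    udp    -- include an outer UDP header ("other outer header")
    tunnel -- None, "GRE" or "IPsec" ("other tunnel headers")
    """
    if tunnel not in (None, "GRE", "IPsec"):
        raise ValueError("this sketch only models GRE and IPsec")

    layers = ["IP"]
    if udp:
        layers.append("UDP")          # outer transport header precedes SEAL
    layers.append("SEAL")             # SEAL leads any other tunnel headers
    if tunnel == "GRE":
        layers.append("GRE")
    elif tunnel == "IPsec":
        layers.append("IPsec-header")
    layers.append("{inner packet}")
    if tunnel == "IPsec":
        layers.append("IPsec-trailer")
    return layers


print("/".join(seal_layers()))                 # simple encapsulation
print("/".join(seal_layers(udp=True)))         # encapsulation over UDP
print("/".join(seal_layers(tunnel="IPsec")))   # SEAL with IPsec
```

Combining udp=True with a tunnel type follows the layering of Figure 1
rather than an explicit example in the text.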
571 Finally, the SEAL ITE ensures that the inner network layer protocol 572 will see a minimum MTU of 1500 bytes over each SEAL path regardless 573 of the outer network layer protocol version, i.e., even if a small 574 amount of segmentation and reassembly are necessary. This is to 575 avoid path MTU "black holes" for the minimum MTU configured by the 576 vast majority of links in the Internet. Note that in some scenarios, 577 however, reassembly may place a heavy burden on the ETE. In that 578 case, the ITE can avoid invoking segmentation and instead report an 579 MTU smaller than 1500 bytes to the original source. 581 5.3. SEAL Encapsulation Format 583 SEAL encapsulates each inner packet within any ancillary tunneling 584 protocol headers and a SEAL header. The SEAL header shares the same 585 format as the IPv6 Fragment Header [RFC2460] and is identified by the 586 same IP protocol number assigned for the IPv6 Fragment Header (type 587 '44') [I-D.ietf-6man-ext-transmit]. The SEAL header is 588 differentiated from the IPv6 Fragment Header by including a non-zero 589 value in the upper two bits of the Fragment Header "Reserved" field; 590 these two bits will henceforth serve as a SEAL protocol version 591 number. The SEAL header is formatted as shown in Figure 2:
593  0                   1                   2                   3
594  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
595 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
596 |  Next Header  |VER|LINK |V|R|X|     Fragment Offset     |C|P|M|
597 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
598 |                         Identification                        |
599 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
601 Figure 2: SEAL Encapsulation Format
603 The fields of the SEAL header are formatted as follows: 605 Next Header (8) an 8-bit field that encodes the next header Internet 606 Protocol number the same as for the IPv4 protocol and IPv6 next 607 header fields. 609 VER (2) 610 a 2-bit version field.
This document specifies Version 1 of the 611 SEAL protocol, i.e., the VER field encodes the value '01'. 613 LINK (3) 614 a 3-bit link identification value, set to a unique value by the 615 ITE for each SEAL path over which it will send encapsulated 616 packets to the ETE (up to 8 SEAL paths per ETE are therefore 617 supported). Note that, if the ITE's interface connection to the 618 underlying link assigns multiple IP addresses, each address 619 represents a separate SEAL path that must be assigned a separate 620 link ID. 622 V (1) 623 the "Integrity Check Vector (ICV) included" bit. 625 R (1) 626 the "Redirects Permitted" bit when used by VET (see: 627 [I-D.templin-intarea-vet]); reserved for future use in other 628 contexts. 630 X (1) 631 a 1-bit Reserved field. Initialized to zero for transmission; 632 ignored on reception. 634 Fragment Offset (13) a 13-bit Offset field. The offset, in 8-octet 635 units, of the data following this header. 637 C (1) 638 the "Control/Data" bit. Set to 1 by the ITE in SEAL Control 639 Message Protocol (SCMP) control messages, and set to 0 in ordinary 640 data packets. 642 P (1) 643 The "Probe" bit when C=0; set to 1 by the ITE in SEAL probe data 644 packets for which it wishes to receive an explicit acknowledgement 645 from the ETE. The "Pass" bit when C=1; set to 1 by the ETE in 646 SCMP messages it relays to the ITE on behalf of another SEAL path. 648 M (1) the "More Segments" bit. Set to 1 in a non-final segment and 649 set to 0 in the final segment of the SEAL packet. 651 Identification (32) 652 a 32-bit per-packet identification field. Set to a randomly- 653 initialized 32-bit value that is monotonically-incremented for 654 each SEAL packet transmitted to this ETE. 656 When an Integrity Check Vector (ICV) is included, it is added as a 657 trailing field at the end of the SEAL packet.
The ICV is formatted 658 as shown in Figure 3:
660 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
661 |F|Key|Algorithm|       Message Authentication Code (MAC)       |
662 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ...
664 Figure 3: Integrity Check Vector (ICV) Format
666 As shown in the figure, the ICV begins with a 1-octet control field 667 with a 1-bit (F)lag, a 2-bit Key identifier and a 5-bit Algorithm 668 identifier. The control octet is followed by a variable-length 669 Message Authentication Code (MAC). The ITE maintains a per ETE 670 algorithm and secret key to calculate the MAC in each packet it will 671 send to this ETE. (By default, the ITE sets the F bit and Algorithm 672 fields to 0 to indicate use of the HMAC-SHA-1 algorithm with a 160 673 bit shared secret key to calculate an 80 bit MAC per [RFC2104] over 674 the leading 128 bytes of the packet. Other values for F and 675 Algorithm are out of scope.) 677 5.4. ITE Specification 679 5.4.1. Tunnel MTU 681 The tunnel must present a stable MTU value to the inner network layer 682 as the size for admission of inner packets into the tunnel. Since 683 tunnels may support a large set of SEAL paths that accept widely 684 varying maximum packet sizes, however, a number of factors should be 685 taken into consideration when selecting a tunnel MTU. 687 Due to the ubiquitous deployment of standard Ethernet and similar 688 networking gear, the nominal Internet cell size has become 1500 689 bytes; this is the de facto size that end systems have come to expect 690 will either be delivered by the network without loss due to an MTU 691 restriction on the path or a suitable ICMP Packet Too Big (PTB) 692 message returned. When large packets sent by end systems incur 693 additional encapsulation at an ITE, however, they may be dropped 694 silently within the tunnel since the network may not always deliver 695 the necessary PTBs [RFC2923].
The ITE SHOULD therefore set a tunnel 696 MTU of at least 1500 bytes and provide accommodations to ensure that 697 packets up to that size are successfully conveyed to the ETE. 699 The inner network layer protocol consults the tunnel MTU when 700 admitting a packet into the tunnel. For non-SEAL inner IPv4 packets 701 with the IPv4 Don't Fragment (DF) bit cleared (i.e., DF==0), if the 702 packet is larger than the tunnel MTU the inner IPv4 layer uses IPv4 703 fragmentation to break the packet into fragments no larger than the 704 MTU. The ITE then admits each fragment into the tunnel as an 705 independent packet. 707 For all other inner packets, the inner network layer admits the 708 packet if it is no larger than the tunnel MTU; otherwise, it drops 709 the packet and sends a PTB error message to the source with the MTU 710 value set to the tunnel MTU. The message contains as much of the invoking 711 packet as possible without the entire message exceeding the network 712 layer minimum MTU size. 714 The ITE can alternatively set an indefinite tunnel MTU such that all 715 inner packets are admitted into the tunnel regardless of their size 716 (theoretical maximums are 64KB for IPv4 and 4GB for IPv6 [RFC2675]). 717 For ITEs that host applications that use the tunnel directly, this 718 option must be carefully coordinated with protocol stack upper layers 719 since some upper layer protocols (e.g., TCP) derive their packet 720 sizing parameters from the MTU of the outgoing interface and as such 721 may select too large an initial size. This is not a problem for 722 upper layers that use conservative initial maximum segment size 723 estimates and/or when the tunnel can reduce the upper layer's maximum 724 segment size, e.g., by reducing the size advertised in the MSS option 725 of outgoing TCP messages (sometimes known as "MSS clamping").
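The admission rules described above for the inner network layer can be sketched as a small decision function in Python. The function name, its parameters, and the returned action strings are assumptions made for this illustration; a tunnel_mtu of None is used here to model the "indefinite" tunnel MTU.

```python
def admit_inner_packet(length, tunnel_mtu, inner_ipv4, df, is_seal=False):
    """Decide how the inner network layer handles one packet.

    Hypothetical sketch: returns "admit", "fragment" (IPv4 fragmentation,
    then admit each fragment as an independent packet) or
    "drop_and_send_ptb" (PTB reports the tunnel MTU to the source).
    """
    if tunnel_mtu is None:
        return "admit"                 # indefinite MTU: admit any size
    if length <= tunnel_mtu:
        return "admit"
    # Only non-SEAL inner IPv4 packets with DF==0 may be fragmented.
    if inner_ipv4 and not df and not is_seal:
        return "fragment"
    return "drop_and_send_ptb"
```

Under these assumptions a 3000-byte IPv4 packet with DF==0 offered to a 1500-byte tunnel MTU is fragmented, while the same packet with DF==1 (or any oversized IPv6 packet) is dropped with a PTB.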
727 In light of the above considerations, the ITE SHOULD configure an 728 indefinite MTU on *router* tunnels so that SEAL performs all 729 subnetwork adaptation from within the tunnel as specified in the 730 following sections. The ITE MAY instead set a smaller MTU on *host* 731 tunnels; in that case, the RECOMMENDED MTU is the maximum of 1500 732 bytes and the smallest MTU among all of the underlying links minus 733 the size of the encapsulation headers. 735 5.4.2. Tunnel Neighbor Soft State 737 The ITE maintains a number of soft state variables for each ETE and 738 for each SEAL path. 740 The ITE maintains a per ETE window of Identification values for the 741 packets it has recently sent to this ETE as well as a per ETE window 742 of Identification values for the packets it has recently received 743 from this ETE. The ITE then includes an Identification in each 744 packet it sends to this ETE. 746 When message origin authentication and integrity checking is 747 required, the ITE sets a variable "USE_ICV" to TRUE, and includes a 748 trailing ICV in each packet it sends to this ETE; otherwise, it sets 749 USE_ICV to FALSE. 751 For each SEAL path, the ITE must also account for encapsulation 752 header lengths. The ITE therefore maintains the per SEAL path 753 constant values "SHLEN" set to the length of the SEAL header and 754 trailer, "THLEN" set to the length of the outer encapsulating 755 transport layer headers (or 0 if outer transport layer encapsulation 756 is not used), "IHLEN" set to the length of the outer IP layer header, 757 and "HLEN" set to (SHLEN+THLEN+IHLEN). (The ITE must include the 758 length of the uncompressed headers even if header compression is 759 enabled when calculating these lengths.)
When SEAL is used in 760 conjunction with another tunnel type such as GRE or IPsec, the length 761 of the headers associated with those tunnels is also included in the 762 HLEN calculation for the first segment only and the length of the 763 associated trailers is included in the HLEN calculation for the final 764 segment only. 766 The ITE maintains a per SEAL path variable "MAXMTU" initialized to 767 the maximum of (1500+HLEN) bytes and the MTU of the underlying link. 768 The ITE further sets a variable "MINMTU" to the minimum MTU for the 769 SEAL path over which encapsulated packets will travel. For IPv6 770 paths, the ITE sets MINMTU=1280 per [RFC2460]. For IPv4 paths, the 771 ITE sets MINMTU=576 based on practical interpretation of [RFC1122] 772 even though the theoretical MINMTU for IPv4 is only 68 bytes 773 [RFC0791]. 775 The ITE can also set MINMTU to a larger value if there is reason to 776 believe that the minimum path MTU is larger, or to a smaller value if 777 there is reason to believe the MTU is smaller, e.g., if there may be 778 additional encapsulations on the path. If this value proves too 779 large, the ITE will receive PTB message feedback either from the ETE 780 or from a router on the path and will be able to reduce its MINMTU to 781 a smaller value. (Note that since IPv4 links with MTUs smaller than 782 1280 are presumably performance-constrained, the ITE can instead 783 initialize MINMTU to 1280 the same as for IPv6. If this value proves 784 too large, standard IPv4 fragmentation and reassembly will provide 785 short term accommodation for the sizing constraints while the ITE 786 readjusts its MINMTU estimate.) 788 The ITE may instead maintain the packet sizing variables and 789 constants as per ETE (rather than per SEAL path) values. In that 790 case, the values reflect the smallest MTU size across all of the SEAL 791 paths associated with this ETE. 793 5.4.3.
SEAL Layer Pre-Processing 795 The SEAL layer is logically positioned between the inner and outer 796 network protocol layers, where the inner layer is seen as the (true) 797 network layer and the outer layer is seen as the (virtual) data link 798 layer. Each packet to be processed by the SEAL layer is either 799 admitted into the tunnel by the inner network layer protocol as 800 described in Section 5.4.1 or is undergoing re-encapsulation from 801 within the tunnel. The SEAL layer sees the former class of packets 802 as inner packets that include inner network and transport layer 803 headers, and sees the latter class of packets as transitional SEAL 804 packets that include the outer and SEAL layer headers that were 805 inserted by the previous hop SEAL ITE. For these transitional 806 packets, the SEAL layer re-encapsulates the packet with new outer and 807 SEAL layer headers when it forwards the packet to the next hop SEAL 808 ITE. 810 We now discuss the SEAL layer pre-processing actions for these two 811 classes of packets. 813 5.4.3.1. Inner Packet Pre-Processing 815 For each non-SEAL IPv4 inner packet with DF==0 in the IP header 816 and each IPv6 inner packet with a fragment header and with (MF=0; 817 Offset=0), if the packet is larger than (MINMTU-HLEN) the ITE uses IP 818 fragmentation to fragment the packet into N pieces, where N is 819 minimized. (For IPv6 as the inner protocol, the first fragment MUST 820 be at least as large as the IPv6 minimum of 1280 bytes so that the 821 entire IPv6 header chain is likely to fit within the first segment.) 822 The ITE then submits each fragment for SEAL encapsulation as 823 specified in Section 5.4.4. 825 For all other inner packets, if the packet is no larger than (MAXMTU- 826 HLEN) for the corresponding SEAL path the ITE submits it for SEAL 827 encapsulation as specified in Section 5.4.4.
Otherwise, the ITE 828 drops the packet and sends an ordinary PTB message appropriate to the 829 inner protocol version (subject to rate limiting) with the MTU field 830 set to (MAXMTU-HLEN). (For IPv4 SEAL packets with DF==0, the ITE 831 SHOULD set DF=1 and re-calculate the IPv4 header checksum before 832 generating the PTB message in order to avoid bogon filters.) After 833 sending the PTB message, the ITE discards the inner packet. 835 5.4.3.2. Transitional SEAL Packet Pre-Processing 837 For each transitional packet that is to be processed by the SEAL 838 layer from within the tunnel, if the packet is larger than MAXMTU 839 bytes for the next hop SEAL path the ITE sends an SCMP Packet Too Big 840 (SPTB) message to the previous hop subject to rate limiting with the 841 MTU field set to MAXMTU and with (C=1; P=1) in the SEAL header (see: 842 Section 5.6.1.1). After sending the SPTB message, the ITE discards 843 the packet. Otherwise, the ITE sets aside the encapsulating SEAL and 844 outer headers and submits the inner packet for SEAL re-encapsulation 845 as specified in Section 5.4.4. (Note that in the calculation for 846 MAXMTU, HLEN for the next hop SEAL path may be different than HLEN 847 for the previous hop. In that case, MAXMTU must reflect the smaller 848 of the two HLEN values.) 850 5.4.4. SEAL Encapsulation and Segmentation 852 For each inner packet/fragment submitted for SEAL encapsulation, the 853 ITE next encapsulates the packet in a SEAL header formatted as 854 specified in Section 5.3. The ITE next sets (C=0; P=0), sets LINK to 855 the value assigned to the underlying SEAL path, and sets the Next 856 Header field to the protocol number corresponding to the address 857 family of the encapsulated inner packet. 
For example, the ITE sets 858 the Next Header field to the value '4' for encapsulated IPv4 packets 859 [RFC2003], '41' for encapsulated IPv6 packets [RFC2473][RFC4213], 860 '47' for GRE [RFC1701], '80' for encapsulated OSI/CLNP packets 862 [RFC1070], etc. 864 Next, if the inner packet is no larger than (MINMTU-HLEN) or larger 865 than 1500, the ITE sets (M=0; Fragment Offset=0). Otherwise, the ITE 866 breaks the inner packet into N non-overlapping segments, where N is 867 minimized. (For IPv6 as the inner protocol, the first segment MUST 868 be at least as large as the IPv6 minimum of 1280 bytes so that the 869 entire IPv6 header chain is likely to fit within the first segment.) 870 The ITE then appends a clone of the SEAL header from the first 871 segment onto the head of each additional segment. The ITE then sets 872 (M=1; Fragment Offset=0) in the first segment, sets (M=0/1; Fragment 873 Offset=O(1)) in the second segment, sets (M=0/1; Fragment 874 Offset=O(2)) in the third segment (if needed), etc., then finally 875 sets (M=0; Fragment Offset=O(n)) in the final segment (where O(i) is 876 the number of 256 byte blocks that preceded this segment). 878 The ITE then writes a monotonically-incrementing integer value for 879 this ETE in the Identification field beginning with a randomly- 880 initialized value in the first packet transmitted. (For SEAL packets 881 that have been split into multiple pieces, the ITE writes the same 882 Identification value in each piece.) The monotonically-incrementing 883 requirement is to satisfy ETEs that use this value for anti-replay 884 purposes. The value is incremented modulo 2^32, i.e., it wraps back 885 to 0 when the previous value was (2^32 - 1). 887 When USE_ICV is FALSE, the ITE next sets V=0. Otherwise, the ITE 888 sets V=1, includes a trailing ICV and calculates the MAC using HMAC- 889 SHA-1 with a 160 bit secret key and 80 bit MAC field. 
Beginning with 890 the SEAL header, the ITE calculates the MAC over the leading 128 891 bytes of the packet (or up to the end of the packet if there are 892 fewer than 128 bytes) and places the result in the MAC field. (For 893 SEAL packets that have been split into multiple pieces, each piece 894 calculates its own MAC.) The ITE then writes the value 0 in the F 895 flag and 0x00 in the Algorithm field of the ICV control octet (other 896 values for these fields, and other MAC calculation disciplines, are 897 outside the scope of this document and may be specified in future 898 documents.) 900 If the packet is undergoing SEAL re-encapsulation, the ITE then 901 copies the R value from the SEAL header of the packet to be re- 902 encapsulated. Otherwise, it sets R=0 unless otherwise specified in 903 other documents that employ SEAL. The ITE then adds the outer 904 encapsulating headers as specified in Section 5.4.5. 906 5.4.5. Outer Encapsulation 908 Following SEAL encapsulation, the ITE next encapsulates each segment 909 in the requisite outer transport (when necessary) and IP layer 910 headers. When a transport layer header such as UDP or TCP is 911 included, the ITE writes the port number for SEAL in the transport 912 destination service port field. 914 When UDP encapsulation is used, the ITE sets the UDP checksum field 915 to zero for IPv4 packets and also sets the UDP checksum field to zero 916 for IPv6 packets even though IPv6 generally requires UDP checksums. 917 Further considerations for setting the UDP checksum field for IPv6 918 packets are discussed in [RFC6935][RFC6936]. 920 The ITE then sets the outer IP layer headers the same as specified 921 for ordinary IP encapsulation (e.g., [RFC1070][RFC2003], [RFC2473], 922 [RFC4213], etc.) 
except that for ordinary SEAL packets the ITE copies 923 the "TTL/Hop Limit", "Type of Service/Traffic Class" and "Congestion 924 Experienced" values in the inner network layer header into the 925 corresponding fields in the outer IP header. For transitional SEAL 926 packets undergoing re-encapsulation, the ITE instead copies the "TTL/ 927 Hop Limit", "Type of Service/Traffic Class" and "Congestion 928 Experienced" values in the original outer IP header of the 929 transitional packet into the corresponding fields in the new outer IP 930 header of the packet to be forwarded (i.e., the values are 931 transferred between outer headers and *not* copied from the inner 932 network layer header). 934 The ITE also sets the IP protocol number to the appropriate value for 935 the first protocol layer within the encapsulation (e.g., UDP, TCP, 936 SEAL, etc.). When IPv6 is used as the outer IP protocol, the ITE 937 then sets the flow label value in the outer IPv6 header the same as 938 described in [RFC6438]. When IPv4 is used as the outer IP protocol, 939 the ITE sets DF=0 in the IPv4 header to allow the packet to be 940 fragmented if it encounters a restricting link (for IPv6 SEAL paths, 941 the DF bit is absent but implicitly set to 1). 943 The ITE finally sends each outer packet via the underlying link 944 corresponding to LINK. 946 5.4.6. Path Probing and ETE Reachability Verification 948 All SEAL data packets sent by the ITE are considered implicit probes 949 that detect MTU limitations on the SEAL path, while explicit probe 950 packets can be constructed to probe the path MTU and/or verify ETE 951 reachability. These probes will elicit an SCMP message from the ETE 952 if it needs to send an acknowledgement and/or report an error 953 condition. The probe packets may also be dropped by either the ETE 954 or a router on the path, which may or may not result in an ICMP 955 message being returned to the ITE. 
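As background for the probe and data packets discussed here, the default ICV calculation from Section 5.4.4 (HMAC-SHA-1, 160-bit key, 80-bit MAC over the leading 128 bytes starting at the SEAL header) can be sketched with Python's standard hmac and hashlib modules. The function names are hypothetical. The sketch assumes the MAC is computed over the SEAL packet before the ICV is appended, and that the 80-bit MAC is the leftmost 10 octets of the HMAC output as in the truncation rule of [RFC2104]; the draft text shown here does not spell out either detail.

```python
import hashlib
import hmac

MAC_COVERAGE = 128   # leading bytes of the SEAL packet covered by the MAC
MAC_LEN = 10         # 80-bit MAC


def build_icv(key, seal_packet, key_id=0):
    """Return the ICV (control octet + MAC) for one SEAL packet or segment.

    seal_packet starts at the SEAL header and excludes the ICV itself
    (an assumption of this sketch).
    """
    if len(key) != 20:
        raise ValueError("expected a 160-bit (20-octet) key")
    if not 0 <= key_id <= 3:
        raise ValueError("Key identifier is a 2-bit field")
    control = (0 << 7) | (key_id << 5) | 0          # F=0, Algorithm=0
    digest = hmac.new(key, seal_packet[:MAC_COVERAGE], hashlib.sha1).digest()
    return bytes([control]) + digest[:MAC_LEN]


def verify_icv(key, seal_packet, icv):
    """Check a received ICV; F or Algorithm values other than 0 are rejected."""
    if len(icv) != 1 + MAC_LEN or icv[0] & 0x9F:
        return False
    key_id = (icv[0] >> 5) & 0x3
    return hmac.compare_digest(build_icv(key, seal_packet, key_id), icv)
```

Because only the leading 128 bytes are covered, a change beyond that offset does not affect the MAC in this sketch, which matches the coverage stated in the text.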
957 To generate an explicit probe packet, the ITE creates a duplicate of 958 an actual data packet and uses the duplicate as a probe. 959 (Alternatively, the ITE can create a packet buffer beginning with the 960 same outer headers, SEAL header and inner network layer headers that 961 would appear in an ordinary data packet, then pad the packet with 962 random data.) The ITE then sets (C=0; P=1) in the SEAL header of the 963 probe packet, and also sets DF=1 in the outer IP header when IPv4 is 964 used. 966 The ITE sends periodic explicit probes to determine whether SEAL 967 segmentation is still necessary (see Section 5.4.4). In particular, 968 if a probe packet of 1500 bytes (i.e., a packet that becomes (1500+ 969 HLEN) bytes after encapsulation) succeeds without incurring 970 fragmentation the ITE is assured that the path MTU is large enough so 971 that the segmentation/reassembly process can be suspended. This 972 probing discipline can therefore be considered as Packetization Layer 973 Path MTU Discovery (PLPMTUD) [RFC4821] applied to tunnels, which 974 operates independently of any application of PLPMTUD between end 975 systems. Note that the explicit probe size of 1500 bytes is chosen 976 since probe packets smaller than this size may be fragmented by a 977 nested ITE further down the path. For example, a successful probe 978 for a packet size of 1400 bytes does not guarantee that fragmentation 979 is not occurring at another ITE. 981 The ITE can also send probes to detect whether an outer transport 982 layer header is no longer necessary to reach this ETE. For example, 983 if the ITE sends its initial packets as IP/UDP/SEAL/*, it can send 984 probes constructed as IP/SEAL/* to determine whether the ETE is 985 reachable without the added layer of encapsulation. If so, the ITE 986 should also re-probe the path MTU since switching to a new 987 encapsulation type may result in a path change. 
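To make the (C=0; P=1) setting for explicit probes concrete, the 8-octet SEAL header of Figure 2 can be packed and unpacked as in the following Python sketch. The function names and the dictionary keys are hypothetical; the bit positions follow the field widths listed in Section 5.3, with the X bit always sent as zero.

```python
import struct

SEAL_VERSION = 1  # VER field value '01'


def pack_seal_header(next_header, link, ident, v=0, r=0, offset=0, c=0, p=0, m=0):
    """Pack the 8-octet SEAL header in network byte order."""
    if not (0 <= next_header <= 0xFF and 0 <= link <= 7 and 0 <= offset <= 0x1FFF):
        raise ValueError("field out of range")
    word = (next_header << 24) | (SEAL_VERSION << 22) | (link << 19)
    word |= ((v & 1) << 18) | ((r & 1) << 17)        # X (bit 16) stays 0
    word |= (offset << 3) | ((c & 1) << 2) | ((p & 1) << 1) | (m & 1)
    return struct.pack("!II", word, ident & 0xFFFFFFFF)


def unpack_seal_header(data):
    """Split the first 8 octets of a SEAL packet into named fields."""
    word, ident = struct.unpack("!II", data[:8])
    return {
        "next_header": word >> 24, "ver": (word >> 22) & 0x3,
        "link": (word >> 19) & 0x7, "v": (word >> 18) & 1,
        "r": (word >> 17) & 1, "x": (word >> 16) & 1,
        "offset": (word >> 3) & 0x1FFF, "c": (word >> 2) & 1,
        "p": (word >> 1) & 1, "m": word & 1, "ident": ident,
    }
```

A probe carrying an IPv6 inner packet on LINK 2 would be built as pack_seal_header(41, 2, ident, p=1); the outer IPv4 header would additionally carry DF=1, which is outside this sketch.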
989 While probing, the ITE processes ICMP messages as specified in 990 Section 5.4.7 and processes SCMP messages as specified in Section 991 5.6.2. 993 5.4.7. Processing ICMP Messages 995 When the ITE sends SEAL packets, it may receive ICMP error messages 996 [RFC0792][RFC4443] from a router on the path to the ETE. Each ICMP 997 message includes an outer IP header, followed by an ICMP header, 998 followed by a portion of the SEAL data packet that generated the 999 error (also known as the "packet-in-error"). Note that the ITE may 1000 receive an ICMP message from another ITE that is at the head end of a 1001 nested level of encapsulation. The ITE has no security associations 1002 with this nested ITE, hence it should consider the message the same 1003 as if it originated from an ordinary router on the path to the ETE. 1005 The ITE should process ICMPv4 Protocol Unreachable messages and 1006 ICMPv6 Parameter Problem messages with Code "Unrecognized Next Header 1007 type encountered" as a hint that the ETE does not implement SEAL. 1008 The ITE can optionally ignore other ICMP messages that do not include 1009 sufficient information in the packet-in-error, or process them as a 1010 hint that the SEAL path to the ETE may be failing. The ITE then 1011 discards these types of messages. 1013 For other ICMP messages, the ITE first examines the SEAL data packet 1014 within the packet-in-error field. If the IP source and/or 1015 destination addresses are invalid, or if the value in the SEAL header 1016 Identification field (if present) is not within the window of packets 1017 the ITE has recently sent to this ETE, or if the MAC value in the ICV 1018 field (if present) is incorrect, the ITE discards the message. 1020 Next, if the received ICMP message is a PTB the ITE sets the 1021 temporary variable "PMTU" for this SEAL path to the MTU value in the 1022 PTB message. 
If the outer IP length value in the packet-in-error is 1023 no larger than (1500+HLEN) bytes the ITE sets MAXMTU=(1500+HLEN) and 1024 discards the message. If the outer IP length value in the packet-in- 1025 error is larger than (1500+HLEN) bytes and PMTU is no smaller than 1026 MINMTU the ITE sets MAXMTU to the maximum of (1500+HLEN) and PMTU; 1027 otherwise the ITE consults a plateau table (e.g., as described in 1028 [RFC1191]) to determine a new value for MAXMTU. For example, if the 1029 ITE receives a PTB message with small PMTU and packet-in-error length 1030 8KB, it can set MAXMTU=4KB. If the ITE subsequently receives a PTB 1031 message with small PMTU and length 4KB, it can set MAXMTU=2KB, etc., 1032 to a minimum value of MAXMTU=(1500+HLEN). Next, if the packet-in- 1033 error was an explicit probe (i.e., one with P=1 in the SEAL header), 1034 the ITE discards the message. Finally, if the ITE is using a MINMTU 1035 value larger than 1280 for IPv6 or 576 for IPv4, it may need to 1036 reduce MINMTU if the PMTU value is small. 1038 If the ICMP message was not discarded, the ITE transcribes it into a 1039 message appropriate for the SEAL data packet within the packet-in- 1040 error. If the previous hop toward the inner source address within 1041 the SEAL data packet is reached via the same SEAL tunnel, the ITE 1042 transcribes the message into an SCMP message the same as described 1043 for ETE generation of SCMP messages in Section 5.6.1, i.e., it copies 1044 the SEAL data packet within the packet-in-error into the packet-in- 1045 error field of the new message. (In this process, the ETE also sets 1046 (C=1; P=1) in the SEAL header of the SCMP message.) Otherwise, the 1047 ITE seeks beyond the SEAL header within the packet-in-error and 1048 transcribes the inner packet into a message appropriate for the inner 1049 protocol version (e.g., ICMPv4 for IPv4, ICMPv6 for IPv6, etc.). 
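The MAXMTU adjustment that the ITE applies when it receives a PTB message, as described earlier in this section, can be sketched as follows. The function name and the PLATEAUS table are assumptions for illustration; the table simply uses the power-of-two sizes from the 8KB/4KB/2KB example in the text rather than the actual plateau values of [RFC1191].

```python
# Hypothetical plateau table, largest first (not the RFC 1191 table).
PLATEAUS = (32768, 16384, 8192, 4096, 2048)


def maxmtu_after_ptb(pmtu, pie_len, hlen, minmtu, plateaus=PLATEAUS):
    """Return the new MAXMTU for a SEAL path after a validated PTB.

    pmtu    -- MTU value reported in the PTB message
    pie_len -- outer IP length found in the packet-in-error
    """
    floor = 1500 + hlen
    if pie_len <= floor:
        return floor                    # packets this small must always pass
    if pmtu >= minmtu:
        return max(floor, pmtu)         # reported PMTU is plausible
    # PMTU is implausibly small: step down to the next plateau.
    for plateau in plateaus:
        if plateau < pie_len:
            return max(floor, plateau)
    return floor
```

With hlen=40 and minmtu=576, a PTB carrying a small PMTU for an 8192-byte packet-in-error yields 4096, a second one for a 4096-byte packet yields 2048, and further reductions stop at 1540.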
1051 The ITE finally forwards the transcribed message to the previous hop 1052 toward the inner source address. 1054 5.4.8. IPv4 Middlebox Reassembly Testing 1056 The ITE can perform a qualification exchange to ensure that the 1057 subnetwork correctly delivers fragments to the ETE. This procedure 1058 can be used, e.g., to determine whether there are middleboxes on the 1059 path that violate the [RFC1812], Section 5.2.6 requirement that: "A 1060 router MUST NOT reassemble any datagram before forwarding it". 1061 Examples of middleboxes that may perform reassembly include stateful 1062 NATs and firewalls. Such devices could still allow for stateless MTU 1063 determination if they gather the fragments of a fragmented SEAL data 1064 packet for packet analysis purposes but then forward the fragments on 1065 to the final destination rather than forwarding the reassembled 1066 packet. (This process is often referred to as "Virtual Fragmentation 1067 Reassembly" (VFR)). 1069 The ITE should use knowledge of its topological arrangement as an aid 1070 in determining when middlebox reassembly testing is necessary. For 1071 example, if the ITE is aware that the ETE is located somewhere in the 1072 public Internet, middlebox reassembly testing should not be 1073 necessary. If the ITE is aware that the ETE is located behind a NAT 1074 or a firewall, however, then reassembly testing can be used to detect 1075 middleboxes that do not conform to specifications. 1077 The ITE can perform a middlebox reassembly test by sending explicit 1078 probe packets. The ITE should only send probe packets that are 1079 smaller than (576-HLEN) before encapsulation since the least an 1080 ordinary node can be expected to reassemble is 576 bytes. To 1081 generate a probe, the ITE either creates a clone of an ordinary data 1082 packet or creates a packet buffer beginning with the same outer 1083 headers, SEAL header and inner network layer header that would appear 1084 in an ordinary data packet. 
The ITE then pads the probe packet with 1085 random data to a length that is at least 128 bytes but smaller than 1086 (576-HLEN) bytes. 1088 The ITE then sets (C=0; P=1) in the SEAL header of the probe packet 1089 and sets the Next Header field to the inner network layer protocol 1090 type. Next, the ITE sets LINK to the appropriate value for this SEAL 1091 path, sets the Identification field, then finally calculates the ICV 1092 and sets V=1 (when USE_ICV is TRUE). 1094 The ITE then encapsulates the probe packet in the appropriate outer 1095 headers, splits it into two outer IP fragments, then sends both 1096 fragments over the same SEAL path. 1098 The ITE should send a series of probe packets (e.g., 3-5 probes with 1099 1sec intervals between tests) instead of a single isolated probe in 1100 case of packet loss. If the ETE returns an SCMP PTB message with the 1101 original first fragment in the packet-in-error, then the SEAL path 1102 correctly supports fragmentation; otherwise, the ITE enables stateful 1103 MTU determination for this SEAL path as specified in Section 5.4.9. 1105 5.4.9. Stateful MTU Determination 1107 SEAL supports a stateless MTU determination capability, however the 1108 ITE may in some instances wish to impose a stateful MTU limit on a 1109 particular SEAL path. For example, when the ETE is situated behind a 1110 middlebox that performs reassembly in violation of the specs (see: 1111 Section 5.4.8) it is imperative that fragmentation be avoided. In 1112 other instances (e.g., when the SEAL path includes performance- 1113 constrained links), the ITE may deem it necessary to cache a 1114 conservative static MTU in order to avoid sending large packets that 1115 would only be dropped due to an MTU restriction somewhere on the 1116 path. 1118 To determine a static MTU value, the ITE can send a series of probe 1119 packets of various sizes to the ETE with (C=0; P=1) in the SEAL 1120 header and DF=1 in the outer IP header. 
The ITE then caches the size 1121 'S' of the largest packet for which it receives a probe reply from 1122 the ETE by setting MAXMTU=MAX(S, (1500+HLEN)) for this SEAL path. 1124 For example, the ITE could send probe packets of 8KB, followed by 1125 4KB, followed by 2KB, etc. While probing, the ITE processes any ICMP 1126 PTB message it receives as a potential indication of probe failure, 1127 then discards the message. 1129 5.4.10. Detecting Path MTU Changes 1131 When stateful MTU determination is used, the ITE SHOULD periodically 1132 reset MAXMTU and/or re-probe the path to determine whether MAXMTU has 1133 increased. If the path still has a too-small MTU, the ITE will 1134 receive a PTB message that reports a smaller size. 1136 5.5. ETE Specification 1138 5.5.1. Reassembly Buffer Requirements 1140 For IPv6, the ETE MUST configure a minimum reassembly buffer size of 1141 (1500 + HLEN) bytes for the reassembly of outer IPv6 packets, i.e., 1142 even though the true minimum reassembly size for IPv6 is only 1500 1143 bytes [RFC2460]. For IPv4, the ETE also MUST configure a minimum 1144 reassembly buffer size of (1500 + HLEN) bytes for the reassembly of 1145 outer IPv4 packets, i.e., even though the true minimum reassembly 1146 size for IPv4 is only 576 bytes [RFC1122]. 1148 In addition to this outer reassembly buffer requirement, the ETE 1149 further MUST configure a minimum SEAL reassembly buffer size of (1500 1150 + HLEN) bytes for the reassembly of segmented SEAL packets (see: 1151 Section 5.5.4). 1153 Note that the value "HLEN" may be variable and initially unknown to 1154 the ETE, and would typically range from a few bytes to a few tens of 1155 bytes or even more. It is therefore RECOMMENDED that the ETE 1156 configure slightly larger minimum IP/SEAL reassembly buffer sizes of 1157 2048 bytes (2KB). 1159 5.5.2.
Tunnel Neighbor Soft State 1161 When message origin authentication and integrity checking is 1162 required, the ETE maintains a per-ITE MAC calculation algorithm and a 1163 symmetric secret key to verify the MAC. The ETE also maintains a 1164 window of Identification values for the packets it has recently 1165 received from this ITE as well as a window of Identification values 1166 for the packets it has recently sent to this ITE. 1168 When the tunnel neighbor relationship is bidirectional, the ETE 1169 further maintains a per SEAL path mapping of outer IP and transport 1170 layer addresses to the LINK value that appears in packets received 1171 from the ITE. 1173 5.5.3. IP-Layer Reassembly 1175 The ETE reassembles fragmented IP packets that are explicitly 1176 addressed to itself. For IP fragments that are received via a SEAL 1177 tunnel, the ETE SHOULD maintain conservative reassembly cache high- 1178 and low-water marks. When the size of the reassembly cache exceeds 1179 this high-water mark, the ETE SHOULD actively discard stale 1180 incomplete reassemblies (e.g., using an Active Queue Management (AQM) 1181 strategy) until the size falls below the low-water mark. The ETE 1182 SHOULD also actively discard any pending reassemblies that clearly 1183 have no opportunity for completion, e.g., when a considerable number 1184 of new fragments have arrived before a fragment that completes a 1185 pending reassembly arrives. 1187 The ETE processes non-SEAL IP packets as specified in the normative 1188 references, i.e., it performs any necessary IP reassembly then 1189 discards the packet if it is larger than the reassembly buffer size 1190 or delivers the (fully-reassembled) packet to the appropriate upper 1191 layer protocol module. 1193 For SEAL packets, the ETE performs any necessary IP reassembly then 1194 submits the packet for SEAL decapsulation as specified in Section 1195 5.5.4. 
(Note that if the packet is larger than the reassembly buffer 1196 size, the ETE still examines the leading portion of the (partially) 1197 reassembled packet during decapsulation.) 1199 5.5.4. Decapsulation, SEAL-Layer Reassembly, and Re-Encapsulation 1201 For each SEAL packet accepted for decapsulation, the ETE first 1202 examines the Identification field. If the Identification is not 1203 within the window of acceptable values for this ITE, the ETE silently 1204 discards the packet. 1206 Next, if V==1 the ETE SHOULD verify the MAC value and silently 1207 discard the packet if the value is incorrect. (Note that this means 1208 that the ETE would need to receive all IP fragments if the packet was 1209 fragmented at the outer IP layer, since the MAC is included as a 1210 trailing field.) 1212 Next, if the packet arrived as multiple IP fragments, the ETE sends 1213 an SPTB message back to the ITE with MTU set to the size of the 1214 largest fragment received (see: Section 5.6.1.1). 1216 Next, if the packet arrived as multiple IP fragments and the inner 1217 packet is larger than 1500 bytes, the ETE silently discards the 1218 packet; otherwise, it continues to process the packet. 1220 Next, if there is an incorrect value in a SEAL header field (e.g., an 1221 incorrect "VER" field value), the ETE discards the packet. If the 1222 SEAL header has C==0, the ETE also returns an SCMP "Parameter 1223 Problem" (SPP) message (see Section 5.6.1.2). 1225 Next, if the SEAL header has C==1, the ETE processes the packet as an 1226 SCMP packet as specified in Section 5.6.2. Otherwise, the ETE 1227 continues to process the packet as a SEAL data packet. 1229 Next, if the SEAL header has (M==1 || Fragment Offset!=0) the ETE 1230 checks to see if the other segments of this already-segmented SEAL 1231 packet have arrived, i.e., by looking for additional segments that 1232 have the same outer IP source address, destination address, source 1233 port number and SEAL Identification value. 
If all other segments 1234 have already arrived, the ETE discards the SEAL header and other 1235 outer headers from the non-initial segments and appends the segments 1236 onto the end of the first segment according to their offset value. 1237 Otherwise, the ETE caches the new segment for at most 60 seconds 1238 while awaiting the arrival of its partners. During this process, the 1239 ETE discards any segments that are overlapping with respect to 1240 segments that have already been received, and also discards any 1241 segments that have M==1 in the SEAL header but do not contain an 1242 integer multiple of 8 bytes. The ETE further SHOULD manage the SEAL 1243 reassembly cache the same as described for the IP-Layer Reassembly 1244 cache in Section 5.5.3, i.e., it SHOULD perform an early discard for 1245 any pending reassemblies that have low probability of completion. 1247 Next, if the SEAL header in the (reassembled) packet has P==1, the 1248 ETE drops the packet unconditionally and sends an SPTB message back 1249 to the ITE (see: Section 5.6.1.1) if it has not already sent an SPTB 1250 message based on IP fragmentation. (Note that the ETE therefore 1251 sends only a single SPTB message for a probe packet that also 1252 experienced IP fragmentation, i.e., it does not send multiple SPTB 1253 messages.) 1255 Finally, the ETE discards the outer headers and processes the inner 1256 packet according to the header type indicated in the SEAL Next Header 1257 field. If the next hop toward the inner destination address is via a 1258 different interface than the SEAL packet arrived on, the ETE discards 1259 the SEAL header and delivers the inner packet either to the local 1260 host or to the next hop if the packet is not destined to the local 1261 host. 
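The SEAL-layer reassembly rules above (matching segments by outer source address, destination address, source port and Identification; the 60-second cache lifetime; discarding overlapping segments and non-final segments whose length is not a multiple of 8 bytes) can be illustrated with a minimal sketch. This is not taken from the specification: the class and parameter names are hypothetical, the Fragment Offset is treated as a plain byte offset for simplicity, and the water-mark cache management of Section 5.5.3 is omitted.

```python
import time

REASSEMBLY_TIMEOUT = 60.0  # seconds a segment may wait for its partners


class SealReassembler:
    """Illustrative SEAL-layer reassembly cache (hypothetical names)."""

    def __init__(self, now=time.monotonic):
        self._now = now      # injectable clock, convenient for testing
        self._pending = {}   # key -> {"start", "segs", "total"}

    def add_segment(self, key, offset, more, payload):
        """Return the reassembled payload once complete, else None.

        key     -- (outer src, outer dst, src port, Identification)
        offset  -- offset of this segment (assumption: counted in bytes)
        more    -- the M bit from the SEAL header
        payload -- segment data with SEAL and outer headers removed
        """
        now = self._now()
        # Drop reassemblies that have been pending for more than 60 seconds.
        stale = [k for k, e in self._pending.items()
                 if now - e["start"] > REASSEMBLY_TIMEOUT]
        for k in stale:
            del self._pending[k]

        # A non-final segment must carry an integer multiple of 8 bytes.
        if more and len(payload) % 8 != 0:
            return None

        entry = self._pending.setdefault(
            key, {"start": now, "segs": {}, "total": None})
        end = offset + len(payload)

        # Discard a segment that overlaps one already received.
        for o, p in entry["segs"].items():
            if offset < o + len(p) and o < end:
                return None

        entry["segs"][offset] = payload
        if not more:
            entry["total"] = end  # the final segment fixes the total length
        if entry["total"] is None:
            return None

        # Complete only when the segments are contiguous from offset 0.
        pos, parts = 0, []
        for o in sorted(entry["segs"]):
            if o != pos:
                return None
            parts.append(entry["segs"][o])
            pos += len(entry["segs"][o])
        if pos != entry["total"]:
            return None

        del self._pending[key]
        return b"".join(parts)
```

A caller would invoke add_segment() once per arriving segment and continue with the P-bit check and decapsulation only when a non-None payload is returned.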
1263 If the next hop is on the same tunnel the SEAL packet arrived on, 1264 however, the ETE submits the packet for SEAL re-encapsulation 1265 beginning with the specification in Section 5.4.3 above and without 1266 decrementing the value in the inner (TTL / Hop Limit) field. 1268 5.6. The SEAL Control Message Protocol (SCMP) 1270 SEAL provides a companion SEAL Control Message Protocol (SCMP) that 1271 uses the same message types and formats as for the Internet Control 1272 Message Protocol for IPv6 (ICMPv6) [RFC4443]. The SCMP messaging 1273 protocol operates over bidirectional and partially unidirectional 1274 tunnels. (For fully unidirectional tunnels, SEAL must operate 1275 without the benefit of SCMP meaning that steady-state fragmentation 1276 and reassembly may be necessary in extreme cases. In that case, the 1277 ITE must select a conservative MINMTU to ensure that IPv4 1278 fragmentation is avoided in order to avoid reassembly errors at high 1279 data rates [RFC4963].) 1281 As for ICMPv6, each SCMP message includes a 32-bit header and a 1282 variable-length body. The ITE encapsulates the SCMP message in a 1283 SEAL header and outer headers as shown in Figure 4: 1285 +--------------------+ 1286 ~ outer IP header ~ 1287 +--------------------+ 1288 ~ other outer hdrs ~ 1289 +--------------------+ 1290 ~ SEAL Header ~ 1291 +--------------------+ +--------------------+ 1292 | SCMP message header| --> | SCMP message header| 1293 +--------------------+ +--------------------+ 1294 | | --> | | 1295 ~ SCMP message body ~ --> ~ SCMP message body ~ 1296 | | --> | | 1297 +--------------------+ +--------------------+ 1299 SCMP Message SCMP Packet 1300 before encapsulation after encapsulation 1302 Figure 4: SCMP Message Encapsulation 1304 The following sections specify the generation, processing and 1305 relaying of SCMP messages. 1307 5.6.1. 
Generating SCMP Messages 1309 ETEs generate SCMP messages in response to receiving certain SEAL 1310 data packets using the format shown in Figure 5: 1312 0 1 2 3 1313 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1314 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1315 | Type | Code | Checksum | 1316 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1317 | Type-Specific Data | 1318 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1319 | As much of the invoking SEAL data packet as possible | 1320 ~ (beginning with the SEAL header) without the SCMP ~ 1321 | packet exceeding MINMTU bytes (*) | 1323 (*) also known as the "packet-in-error" 1325 Figure 5: SCMP Message Format 1327 The error message includes the 32-bit SCMP message header, followed 1328 by a 32-bit Type-Specific Data field, followed by the leading portion 1329 of the invoking SEAL data packet beginning with the SEAL header as 1330 the "packet-in-error". The packet-in-error includes as much of the 1331 invoking packet as possible extending to a length that would not 1332 cause the entire SCMP packet following outer encapsulation to exceed 1333 MINMTU bytes. 1335 When the ETE processes a SEAL data packet for which the 1336 Identification and ICV values are correct but an error must be 1337 returned, it prepares an SCMP message as shown in Figure 5. The ETE 1338 sets the Type and Code fields to the same values that would appear in 1339 the corresponding ICMPv6 message [RFC4443], but calculates the 1340 Checksum beginning with the SCMP message header using the algorithm 1341 specified for ICMPv4 in [RFC0792]. 1343 The ETE next encapsulates the SCMP message in the requisite SEAL and 1344 outer headers as shown in Figure 4. 
During encapsulation, the ETE 1345 sets the outer destination address/port numbers of the SCMP packet to 1346 the values associated with the ITE and sets the outer source address/ 1347 port numbers to its own outer address/port numbers. 1349 The ETE then sets (C=1; M=0; Fragment Offset=0) in the SEAL header, 1350 then sets V, Next Header and LINK to the same values that appeared in 1351 the SEAL header of the data packet. The ETE next sets the 1352 Identification field to the next Identification value scheduled for 1353 this ITE, then increments the next Identification value. When V==1, 1354 the ETE then prepares the ICV field the same as specified for SEAL 1355 data packet encapsulation in Section 5.4.4. If this message is in 1356 direct response to a SEAL data packet sent by the ITE, the ETE next 1357 sets P=0 and sends the resulting SCMP packet to the ITE the same as 1358 specified for SEAL data packets in Section 5.4.5. 1360 If the message is in response to an SCMP message received from a next 1361 hop ETE or to an ICMP message received from a router on the path to a 1362 next hop ETE, the ETE instead sets P=1 and passes the message to the 1363 ITE in a "reverse re-encapsulation" process. In particular, when the 1364 previous hop toward the source of the inner packet within the packet- 1365 in-error in a received SCMP/ICMP message is reached via the same 1366 tunnel as the message arrived on, the ETE replaces the outer headers 1367 of the message (up to and including the SEAL header) with headers 1368 that will be recognized and accepted by the previous hop and sends 1369 the resulting packet to the previous hop. 1371 The following sections describe additional considerations for various 1372 SCMP error messages: 1374 5.6.1.1. 
Generating SCMP Packet Too Big (SPTB) Messages 1376 An ETE generates an SPTB message when it receives a SEAL probe packet 1377 (i.e., one with C=0; P=1 in the SEAL header) or when it receives a 1378 SEAL packet that arrived as multiple outer IP fragments. The ETE 1379 prepares the SPTB message the same as for the corresponding ICMPv6 1380 PTB message, and writes the length of the largest outer IP fragment 1381 received in the MTU field of the message (or the full length of the 1382 outer IP packet if the packet was unfragmented). In that case, the 1383 ETE sets (C=1; P=0) in the SEAL header. 1385 An ETE also generates an SPTB message when it attempts to forward a 1386 SEAL data packet to a next hop ETE via the same tunnel the data 1387 packet arrived on, but for which MAXMTU for that SEAL path is 1388 insufficient to accommodate the packet (See Section 5.4.3.2). In 1389 that case, the ETE sets (C=1; P=1) in the SEAL header. 1391 An ETE finally generates an SPTB message when it receives an ICMP PTB 1392 message from a router on the path to a next hop ETE (See Section 1393 5.4.7). In that case, the ETE also sets (C=1; P=1) in the SEAL 1394 header. 1396 5.6.1.2. Generating Other SCMP Messages 1398 An ETE generates an SCMP "Destination Unreachable" (SDU) message 1399 under the same conditions that an IPv6 system would generate an 1400 ICMPv6 Destination Unreachable message. 1402 An ETE generates an SCMP "Parameter Problem" (SPP) message when it 1403 receives a SEAL packet with an incorrect value in the SEAL header. 1405 TEs generate other SCMP message types using methods and procedures 1406 specified in other documents. For example, SCMP message types used 1407 for tunnel neighbor coordinations are specified in VET 1408 [I-D.templin-intarea-vet]. 1410 5.6.2. Processing SCMP Messages 1412 An ITE may receive SCMP messages with C==1 in the SEAL header after 1413 sending packets to an ETE. 
The ITE first verifies that the outer 1414 addresses of the SCMP packet are correct, and that the Identification 1415 field contains an acceptable value. The ITE next verifies that the 1416 SEAL header fields are set correctly as specified in Section 5.6.1. 1417 When V==1, the ITE then verifies the ICV. The ITE next verifies the 1418 Checksum value in the SCMP message header. If any of these values 1419 are incorrect, the ITE silently discards the message; otherwise, it 1420 processes the message as follows: 1422 5.6.2.1. Processing SCMP PTB Messages 1424 After an ITE sends a SEAL packet to an ETE, it may receive an SPTB 1425 message with a packet-in-error containing the leading portion of the 1426 packet (see: Section 5.6.1.1). If the SEAL header has P==1 the ITE 1427 consults its forwarding information base to pass the message to the 1428 previous hop toward the source address of the encapsulated inner 1429 packet. When the previous hop is reached via the same SEAL tunnel, 1430 the ITE passes the SPTB message to the previous hop as specified in 1431 Section 5.6.1. Otherwise, the ITE transcribes the inner packet 1432 within the packet-in-error into a message appropriate for the inner 1433 protocol version (e.g., ICMPv4 for IPv4, ICMPv6 for IPv6, etc.). 1435 If the SEAL header has P==0, the ITE instead processes the message as 1436 an MTU limitation on the SEAL path to this ETE. In that case, the 1437 ITE first sets the temporary variable "PMTU" for this SEAL path to 1438 the MTU value in the SPTB message and processes the message as 1439 follows: 1441 o If PMTU is no smaller than (1500+HLEN), the ITE suspends the SEAL 1442 segmentation/reassembly process for this SEAL path so that whole 1443 (unfragmented) SEAL packets can be used. If the packet is a probe 1444 being used to establish a stateful MTU for this SEAL path (see: 1445 section 5.4.9), the ITE also sets MAXMTU=PMTU. 
1447 o If PMTU is smaller than (1500+HLEN) but no smaller than MINMTU, the 1448 ITE sets MAXMTU to (1500+HLEN) and resumes the SEAL segmentation/ 1449 reassembly process for this SEAL path. 1451 o If PMTU is smaller than MINMTU and the packet-in-error is a probe 1452 used for the purpose of middlebox reassembly detection (see: 1453 Section 5.4.8), the ITE notes the results of the probe. 1454 Otherwise, the ITE consults a plateau table to determine a new 1455 value for MAXMTU. For example, if the ITE receives a PTB message 1456 with small PMTU and packet-in-error length 8KB, it can set 1457 MAXMTU=4KB. If the ITE subsequently receives a PTB message with 1458 small PMTU and length 4KB, it can set MAXMTU=2KB, etc., to a 1459 minimum value of MAXMTU=(1500+HLEN). Finally, if the ITE is using 1460 a MINMTU value larger than 1280 for IPv6 or 576 for IPv4, it may 1461 need to reduce MINMTU if the PMTU value is small. 1463 Next, if the packet-in-error was no larger than (1500+HLEN) or the 1464 packet-in-error was an explicit probe (i.e., one with (C==0; P==1) in 1465 the SEAL header of the packet-in-error), the ITE discards the SPTB 1466 message. 1468 5.6.2.2. Processing Other SCMP Error Messages 1470 An ITE may receive an SDU message with an appropriate code under the 1471 same circumstances that an IPv6 node would receive an ICMPv6 1472 Destination Unreachable message. The ITE transcribes the message and 1473 forwards it toward the source address of the inner packet within the 1474 packet-in-error the same as specified for SPTB messages with P==1 in 1475 Section 5.6.2.1. 1477 An ITE may receive an SPP message when the ETE receives a SEAL packet 1478 with an incorrect value in the SEAL header. The ITE should examine 1479 the SEAL header within the packet-in-error to determine whether 1480 different settings should be used in subsequent packets, but does not 1481 relay the message further.
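For reference, the three-way decision that Section 5.6.2.1 describes for an SPTB message with P==0 can be sketched as follows. This is an illustration under stated assumptions rather than a normative procedure: PathState, on_sptb and all parameter names are hypothetical, and the plateau table is approximated by halving the length of the original packet, which matches the 8KB, 4KB, 2KB example but is not a table defined by the specification.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PathState:
    """Per-SEAL-path state kept by the ITE (hypothetical structure)."""
    maxmtu: int
    segmenting: bool           # True while SEAL segmentation/reassembly is in use
    probe_noted: bool = False  # set when a reassembly-detection probe result is noted


def on_sptb(state, pmtu, hlen, minmtu, orig_len,
            stateful_probe=False, reassembly_probe=False):
    """Return the updated path state for an SPTB message with P==0.

    pmtu     -- MTU value reported in the SPTB message
    orig_len -- length of the packet that triggered the message
    """
    floor = 1500 + hlen

    # Case 1: whole (unfragmented) SEAL packets fit; suspend segmentation.
    if pmtu >= floor:
        new_max = pmtu if stateful_probe else state.maxmtu
        return replace(state, maxmtu=new_max, segmenting=False)

    # Case 2: below (1500+HLEN) but not below MINMTU; resume segmentation.
    if pmtu >= minmtu:
        return replace(state, maxmtu=floor, segmenting=True)

    # Case 3: below MINMTU.
    if reassembly_probe:
        return replace(state, probe_noted=True)
    # Assumption: halving stands in for a plateau-table lookup,
    # never going below (1500+HLEN).
    return replace(state, maxmtu=max(orig_len // 2, floor))
```

Returning a new immutable state instead of mutating in place keeps the sketch easy to test; a real implementation would more likely update its per-path cache entry directly and might also lower MINMTU in the third case.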
1483 TEs process other SCMP message types using methods and procedures 1484 specified in other documents. For example, SCMP message types used 1485 for tunnel neighbor coordinations are specified in VET 1486 [I-D.templin-intarea-vet]. 1488 6. Link Requirements 1490 Subnetwork designers are expected to follow the recommendations in 1491 Section 2 of [RFC3819] when configuring link MTUs. 1493 7. End System Requirements 1495 End systems are encouraged to implement end-to-end MTU assurance 1496 (e.g., using Packetization Layer Path MTU Discovery (PLPMTUD) per 1497 [RFC4821]) even if the subnetwork is using SEAL. 1499 When end systems use PLPMTUD, SEAL will ensure that the tunnel 1500 behaves as a link in the path that assures an MTU of at least 1500 1501 bytes while not precluding discovery of larger MTUs. The PLPMTUD 1502 mechanism will therefore be able to function as designed in order to 1503 discover and utilize larger MTUs. 1505 8. Router Requirements 1507 Routers within the subnetwork are expected to observe the standard IP 1508 router requirements, including the implementation of IP fragmentation 1509 and reassembly as well as the generation of ICMP messages 1510 [RFC0792][RFC1122][RFC1812][RFC2460][RFC4443][RFC6434]. 1512 Note that, even when routers support existing requirements for the 1513 generation of ICMP messages, these messages are often filtered and 1514 discarded by middleboxes on the path to the original source of the 1515 message that triggered the ICMP. It is therefore not possible to 1516 assume delivery of ICMP messages even when routers are correctly 1517 implemented. 1519 9. Nested Encapsulation Considerations 1521 SEAL supports nested tunneling - an example would be a recursive 1522 nesting of mobile networks, where the first network receives service 1523 from an ISP, the second network receives service from the first 1524 network, the third network receives service from the second network, 1525 etc. 
It is imperative that such nesting not extend indefinitely; 1526 SEAL tunnels therefore honor the Encapsulation Limit option defined 1527 in [RFC2473]. 1529 In such nested arrangements, the SEAL ITE has a tunnel neighbor 1530 relationship only with ETEs at its own nesting level, i.e., it does 1531 not have a tunnel neighbor relationship with TEs at other nesting 1532 levels. Therefore, when an ITE 'A' within an outer nesting level needs 1533 to return an error message to an ITE 'B' within an inner nesting 1534 level, it generates an ordinary ICMP error message the same as if it 1535 were an ordinary router within the subnetwork. 'B' can then perform 1536 message validation as specified in Section 5.4.7, but full message 1537 origin authentication is not possible. 1539 (Note that the SCMP protocol could instead be extended to allow an 1540 outer nesting level ITE 'A' to return an SCMP message to an inner 1541 nesting level ITE 'B' rather than return an ICMP message. This would 1542 conceptually allow the control messages to pass through firewalls and 1543 NATs; however, it would give no more message origin authentication 1544 assurance than for ordinary ICMP messages. It was therefore 1545 determined that the complexity of extending the SCMP protocol was of 1546 little value within the context of the anticipated use cases for 1547 nested encapsulations.) 1549 10. Reliability Considerations 1551 Although a SEAL tunnel may span an arbitrarily-large subnetwork 1552 expanse, the IP layer sees the tunnel as a simple link that supports 1553 the IP service model. Links with high bit error rates (BERs) (e.g., 1554 IEEE 802.11) use Automatic Repeat-ReQuest (ARQ) mechanisms [RFC3366] 1555 to increase packet delivery ratios, while links with much lower BERs 1556 typically omit such mechanisms.
Since SEAL tunnels may traverse 1557 arbitrarily-long paths over links of various types that are already 1558 either performing or omitting ARQ as appropriate, it would therefore 1559 be inefficient to require the tunnel endpoints to also perform ARQ. 1561 11. Integrity Considerations 1563 The SEAL header includes an integrity check field that covers the 1564 SEAL header and at least the inner packet headers. This provides for 1565 header integrity verification on a segment-by-segment basis for a 1566 segmented re-encapsulating tunnel path. 1568 Fragmentation and reassembly schemes must also consider packet- 1569 splicing errors, e.g., when two fragments from the same packet are 1570 concatenated incorrectly, when a fragment from packet X is 1571 reassembled with fragments from packet Y, etc. The primary sources 1572 of such errors include implementation bugs and wrapping IPv4 ID 1573 fields. 1575 In particular, the IPv4 16-bit ID field can wrap with only 64K 1576 packets with the same (src, dst, protocol)-tuple alive in the system 1577 at a given time [RFC4963]. When the IPv4 ID field is re-written by a 1578 middlebox such as a NAT or Firewall, ID field wrapping can occur with 1579 even fewer packets alive in the system. It is therefore essential 1580 that IPv4 fragmentation and reassembly be detected early and tuned 1581 out through proper application of SEAL segmentation and reassembly. 1583 12. IANA Considerations 1585 The IANA is requested to allocate a User Port number for "SEAL" in 1586 the 'port-numbers' registry. The Service Name is "SEAL", and the 1587 Transport Protocols are TCP and UDP. The Assignee is the IESG 1588 (iesg@ietf.org) and the Contact is the IETF Chair (chair@ietf.org). 1589 The Description is "Subnetwork Encapsulation and Adaptation Layer 1590 (SEAL)", and the Reference is the RFC-to-be currently known as 1591 'draft-templin-intarea-seal'. 1593 13. 
Security Considerations 1595 SEAL provides a segment-by-segment message origin authentication, 1596 integrity and anti-replay service. The SEAL header is sent in-the- 1597 clear the same as for the outer IP and other outer headers. In this 1598 respect, the threat model is no different than for IPv6 extension 1599 headers. Unlike IPv6 extension headers, however, the SEAL header can 1600 be protected by an integrity check that also covers the inner packet 1601 headers. 1603 An amplification/reflection/buffer overflow attack is possible when 1604 an attacker sends IP fragments with spoofed source addresses to an 1605 ETE in an attempt to clog the ETE's reassembly buffer and/or cause 1606 the ETE to generate a stream of SCMP messages returned to a victim 1607 ITE. The SCMP message ICV and Identification, as well as the inner 1608 headers of the packet-in-error, provide mitigation for the ETE to 1609 detect and discard SEAL segments with spoofed source addresses. 1611 Security issues that apply to tunneling in general are discussed in 1612 [RFC6169]. 1614 14. Related Work 1616 Section 3.1.7 of [RFC2764] provides a high-level sketch for 1617 supporting large tunnel MTUs via a tunnel-level segmentation and 1618 reassembly capability to avoid IP level fragmentation. 1620 Section 3 of [RFC4459] describes inner and outer fragmentation at the 1621 tunnel endpoints as alternatives for accommodating the tunnel MTU. 1623 Section 4 of [RFC2460] specifies a method for inserting and 1624 processing extension headers between the base IPv6 header and 1625 transport layer protocol data. The SEAL header is inserted and 1626 processed in exactly the same manner. 1628 IPsec/AH [RFC4301][RFC4302] is used for full message integrity 1629 verification between tunnel endpoints, whereas SEAL only ensures 1630 integrity for the inner packet headers. The AYIYA proposal 1631 [I-D.massar-v6ops-ayiya] uses similar means for providing message 1632 authentication and integrity.
1634 SEAL, along with the Virtual Enterprise Traversal (VET) 1635 [I-D.templin-intarea-vet] tunnel virtual interface abstraction, are 1636 the functional building blocks for the Interior Routing Overlay 1637 Network (IRON) [I-D.templin-ironbis] and Routing and Addressing in 1638 Networks with Global Enterprise Recursion (RANGER) [RFC5720][RFC6139] 1639 architectures. 1641 The concepts of path MTU determination through the report of 1642 fragmentation and extending the IPv4 Identification field were first 1643 proposed in deliberations of the TCP-IP mailing list and the Path MTU 1644 Discovery Working Group (MTUDWG) during the late 1980's and early 1645 1990's. An historical analysis of the evolution of these concepts, 1646 as well as the development of the eventual PMTUD mechanism, appears 1647 in [RFC5320]. 1649 15. Implementation Status 1651 An early implementation of the first revision of SEAL [RFC5320] is 1652 available at: http://isatap.com/seal. 1654 16. Acknowledgments 1656 The following individuals are acknowledged for helpful comments and 1657 suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Oliver 1658 Bonaventure, Teco Boot, Bob Braden, Brian Carpenter, Steve Casner, 1659 Ian Chakeres, Noel Chiappa, Remi Denis-Courmont, Remi Despres, Ralph 1660 Droms, Aurnaud Ebalard, Gorry Fairhurst, Washam Fan, Dino Farinacci, 1661 Joel Halpern, Sam Hartman, John Heffner, Thomas Henderson, Bob 1662 Hinden, Christian Huitema, Eliot Lear, Darrel Lewis, Joe Macker, Matt 1663 Mathis, Erik Nordmark, Dan Romascanu, Dave Thaler, Joe Touch, Mark 1664 Townsley, Ole Troan, Margaret Wasserman, Magnus Westerlund, Robin 1665 Whittle, James Woodyatt, and members of the Boeing Research & 1666 Technology NST DC&NT group. 1668 Discussions with colleagues following the publication of [RFC5320] 1669 have provided useful insights that have resulted in significant 1670 improvements to this, the Second Edition of SEAL. 
1672 This document received substantial review input from the IESG and 1673 IETF area directorates in the February 2013 timeframe. IESG members 1674 and IETF area directorate representatives who contributed helpful 1675 comments and suggestions are gratefully acknowledged. Discussions on 1676 the IETF IPv6 and Intarea mailing lists in the summer 2013 timeframe 1677 also stimulated several useful ideas. 1679 Path MTU determination through the report of fragmentation was first 1680 proposed by Charles Lynn on the TCP-IP mailing list in 1987. 1681 Extending the IP identification field was first proposed by Steve 1682 Deering on the MTUDWG mailing list in 1989. 1684 17. References 1686 17.1. Normative References 1688 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1689 September 1981. 1691 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, 1692 RFC 792, September 1981. 1694 [RFC1122] Braden, R., "Requirements for Internet Hosts - 1695 Communication Layers", STD 3, RFC 1122, October 1989. 1697 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1698 Requirement Levels", BCP 14, RFC 2119, March 1997. 1700 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1701 (IPv6) Specification", RFC 2460, December 1998. 1703 [RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure 1704 Neighbor Discovery (SEND)", RFC 3971, March 2005. 1706 [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control 1707 Message Protocol (ICMPv6) for the Internet Protocol 1708 Version 6 (IPv6) Specification", RFC 4443, March 2006. 1710 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 1711 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 1712 September 2007. 1714 17.2. Informative References 1716 [FOLK] Shannon, C., Moore, D., and k. claffy, "Beyond Folklore: 1717 Observations on Fragmented Traffic", December 2002. 1719 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 1720 October 1987. 
1722  [I-D.ietf-6man-ext-transmit]
1723            Carpenter, B. and S. Jiang, "Transmission and Processing
1724            of IPv6 Extension Headers",
1725            draft-ietf-6man-ext-transmit-04 (work in progress),
1726            September 2013.

1728  [I-D.massar-v6ops-ayiya]
1729            Massar, J., "AYIYA: Anything In Anything",
1730            draft-massar-v6ops-ayiya-02 (work in progress), July 2004.

1732  [I-D.taylor-v6ops-fragdrop]
1733            Jaeggli, J., Colitti, L., Kumari, W., Vyncke, E., Kaeo,
1734            M., and T. Taylor, "Why Operators Filter Fragments and
1735            What It Implies", draft-taylor-v6ops-fragdrop-01 (work in
1736            progress), June 2013.

1738  [I-D.templin-intarea-vet]
1739            Templin, F., "Virtual Enterprise Traversal (VET)",
1740            draft-templin-intarea-vet-40 (work in progress), May 2013.

1742  [I-D.templin-ironbis]
1743            Templin, F., "The Interior Routing Overlay Network
1744            (IRON)", draft-templin-ironbis-15 (work in progress),
1745            May 2013.

1747  [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
1748            August 1980.

1750  [RFC0994] International Organization for Standardization (ISO) and
1751            American National Standards Institute (ANSI), "Final text
1752            of DIS 8473, Protocol for Providing the Connectionless-
1753            mode Network Service", RFC 994, March 1986.

1755  [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP
1756            MTU discovery options", RFC 1063, July 1988.

1758  [RFC1070] Hagens, R., Hall, N., and M. Rose, "Use of the Internet as
1759            a subnetwork for experimentation with the OSI network
1760            layer", RFC 1070, February 1989.

1762  [RFC1146] Zweig, J. and C. Partridge, "TCP alternate checksum
1763            options", RFC 1146, March 1990.

1765  [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
1766            November 1990.

1768  [RFC1701] Hanks, S., Li, T., Farinacci, D., and P. Traina, "Generic
1769            Routing Encapsulation (GRE)", RFC 1701, October 1994.

1771  [RFC1812] Baker, F., "Requirements for IP Version 4 Routers",
1772            RFC 1812, June 1995.

1774  [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
1775            for IP version 6", RFC 1981, August 1996.

1777  [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003,
1778            October 1996.

1780  [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
1781            Hashing for Message Authentication", RFC 2104,
1782            February 1997.

1784  [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in
1785            IPv6 Specification", RFC 2473, December 1998.

1787  [RFC2675] Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms",
1788            RFC 2675, August 1999.

1790  [RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A.
1791            Malis, "A Framework for IP Based Virtual Private
1792            Networks", RFC 2764, February 2000.

1794  [RFC2780] Bradner, S. and V. Paxson, "IANA Allocation Guidelines For
1795            Values In the Internet Protocol and Related Headers",
1796            BCP 37, RFC 2780, March 2000.

1798  [RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering:
1799            Defeating Denial of Service Attacks which employ IP Source
1800            Address Spoofing", BCP 38, RFC 2827, May 2000.

1802  [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery",
1803            RFC 2923, September 2000.

1805  [RFC3232] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by
1806            an On-line Database", RFC 3232, January 2002.

1808  [RFC3366] Fairhurst, G. and L. Wood, "Advice to link designers on
1809            link Automatic Repeat reQuest (ARQ)", BCP 62, RFC 3366,
1810            August 2002.

1812  [RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
1813            Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
1814            Wood, "Advice for Internet Subnetwork Designers", BCP 89,
1815            RFC 3819, July 2004.

1817  [RFC4191] Draves, R. and D. Thaler, "Default Router Preferences and
1818            More-Specific Routes", RFC 4191, November 2005.

1820  [RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
1821            for IPv6 Hosts and Routers", RFC 4213, October 2005.

1823  [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
1824            Internet Protocol", RFC 4301, December 2005.

1826  [RFC4302] Kent, S., "IP Authentication Header", RFC 4302,
1827            December 2005.

1829  [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the-
1830            Network Tunneling", RFC 4459, April 2006.

1832  [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU
1833            Discovery", RFC 4821, March 2007.

1835  [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
1836            Errors at High Data Rates", RFC 4963, July 2007.

1838  [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common
1839            Mitigations", RFC 4987, August 2007.

1841  [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
1842            IANA Considerations Section in RFCs", BCP 26, RFC 5226,
1843            May 2008.

1845  [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security
1846            (TLS) Protocol Version 1.2", RFC 5246, August 2008.

1848  [RFC5320] Templin, F., "The Subnetwork Encapsulation and Adaptation
1849            Layer (SEAL)", RFC 5320, February 2010.

1851  [RFC5445] Watson, M., "Basic Forward Error Correction (FEC)
1852            Schemes", RFC 5445, March 2009.

1854  [RFC5720] Templin, F., "Routing and Addressing in Networks with
1855            Global Enterprise Recursion (RANGER)", RFC 5720,
1856            February 2010.

1858  [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010.

1860  [RFC6139] Russert, S., Fleischman, E., and F. Templin, "Routing and
1861            Addressing in Networks with Global Enterprise Recursion
1862            (RANGER) Scenarios", RFC 6139, February 2011.

1864  [RFC6169] Krishnan, S., Thaler, D., and J. Hoagland, "Security
1865            Concerns with IP Tunneling", RFC 6169, April 2011.

1867  [RFC6335] Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S.
1868            Cheshire, "Internet Assigned Numbers Authority (IANA)
1869            Procedures for the Management of the Service Name and
1870            Transport Protocol Port Number Registry", BCP 165,
1871            RFC 6335, August 2011.

1873  [RFC6434] Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node
1874            Requirements", RFC 6434, December 2011.

1876  [RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label
1877            for Equal Cost Multipath Routing and Link Aggregation in
1878            Tunnels", RFC 6438, November 2011.

1880  [RFC6864] Touch, J., "Updated Specification of the IPv4 ID Field",
1881            RFC 6864, February 2013.

1883  [RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and
1884            UDP Checksums for Tunneled Packets", RFC 6935, April 2013.

1886  [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement
1887            for the Use of IPv6 UDP Datagrams with Zero Checksums",
1888            RFC 6936, April 2013.

1890  [RIPE]    De Boer, M. and J. Bosma, "Discovering Path MTU Black
1891            Holes on the Internet using RIPE Atlas", July 2012.

1893  [SIGCOMM] Luckie, M. and B. Stasiewicz, "Measuring Path MTU
1894            Discovery Behavior", November 2010.

1896  [TBIT]    Medina, A., Allman, M., and S. Floyd, "Measuring
1897            Interactions Between Transport Protocols and Middleboxes",
1898            October 2004.

1900  [WAND]    Luckie, M., Cho, K., and B. Owens, "Inferring and
1901            Debugging Path MTU Discovery Failures", October 2005.

1903  Author's Address

1905     Fred L. Templin (editor)
1906     Boeing Research & Technology
1907     P.O. Box 3707
1908     Seattle, WA 98124
1909     USA

1911     Email: fltemplin@acm.org