idnits 2.17.1 draft-templin-intarea-seal-57.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. == The 'Obsoletes: ' line in the draft header should list only the _numbers_ of the RFCs which will be obsoleted by this document (if approved); it should not include the word 'RFC' in the list. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (June 12, 2013) is 3970 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC3971' is defined on line 1629, but no explicit reference was found in the text == Unused Reference: 'RFC4861' is defined on line 1636, but no explicit reference was found in the text == Unused Reference: 'RFC1063' is defined on line 1672, but no explicit reference was found in the text == Unused Reference: 'RFC1146' is defined on line 1679, but no explicit reference was found in the text == Unused Reference: 'RFC2675' is defined on line 1704, but no explicit reference was found in the text == Unused Reference: 'RFC2780' is defined on line 1712, but no explicit reference was found in the text == Unused Reference: 'RFC4191' is defined on line 1735, but no explicit reference was found in the text == Unused Reference: 'RFC4987' is defined on line 1756, but no explicit reference was found in the text == Unused Reference: 'RFC5226' is defined on line 1759, but no explicit reference was found in the text == Unused Reference: 'RFC5246' is defined on line 1763, but no explicit reference was found in the text == Unused Reference: 'RFC5445' is defined on line 1769, but no explicit reference was found in the text == Unused Reference: 'RFC6335' is defined on line 1785, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) == Outdated reference: A later version (-02) exists of draft-taylor-v6ops-fragdrop-01 == Outdated reference: A later version (-16) exists of draft-templin-ironbis-15 -- Obsolete informational reference (is this intentional?): RFC 1063 (Obsoleted by RFC 1191) -- Obsolete informational reference (is this intentional?): RFC 1146 (Obsoleted by RFC 6247) -- Obsolete informational reference (is this intentional?): RFC 1981 (Obsoleted by RFC 8201) -- Obsolete informational reference (is this intentional?): RFC 5226 (Obsoleted by RFC 8126) -- Obsolete 
informational reference (is this intentional?): RFC 5246 (Obsoleted by RFC 8446) -- Obsolete informational reference (is this intentional?): RFC 6434 (Obsoleted by RFC 8504) Summary: 1 error (**), 0 flaws (~~), 17 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group F. Templin, Ed. 3 Internet-Draft Boeing Research & Technology 4 Obsoletes: rfc5320 (if approved) June 12, 2013 5 Intended status: Informational 6 Expires: December 14, 2013 8 The Subnetwork Encapsulation and Adaptation Layer (SEAL) 9 draft-templin-intarea-seal-57.txt 11 Abstract 13 This document specifies a Subnetwork Encapsulation and Adaptation 14 Layer (SEAL). SEAL operates over virtual topologies configured over 15 connected IP network routing regions bounded by encapsulating border 16 nodes. These virtual topologies are manifested by tunnels that may 17 span multiple IP and/or sub-IP layer forwarding hops, where they may 18 incur packet duplication, packet reordering, source address spoofing 19 and traversal of links with diverse Maximum Transmission Units 20 (MTUs). SEAL addresses these issues through the encapsulation and 21 messaging mechanisms specified in this document. 23 Status of this Memo 25 This Internet-Draft is submitted in full conformance with the 26 provisions of BCP 78 and BCP 79. 28 Internet-Drafts are working documents of the Internet Engineering 29 Task Force (IETF). Note that other groups may also distribute 30 working documents as Internet-Drafts. The list of current Internet- 31 Drafts is at http://datatracker.ietf.org/drafts/current/. 33 Internet-Drafts are draft documents valid for a maximum of six months 34 and may be updated, replaced, or obsoleted by other documents at any 35 time. It is inappropriate to use Internet-Drafts as reference 36 material or to cite them other than as "work in progress." 
38 This Internet-Draft will expire on December 14, 2013. 40 Copyright Notice 42 Copyright (c) 2013 IETF Trust and the persons identified as the 43 document authors. All rights reserved. 45 This document is subject to BCP 78 and the IETF Trust's Legal 46 Provisions Relating to IETF Documents 47 (http://trustee.ietf.org/license-info) in effect on the date of 48 publication of this document. Please review these documents 49 carefully, as they describe your rights and restrictions with respect 50 to this document. Code Components extracted from this document must 51 include Simplified BSD License text as described in Section 4.e of 52 the Trust Legal Provisions and are provided without warranty as 53 described in the Simplified BSD License. 55 Table of Contents 57 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 58 1.1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . 4 59 1.2. Approach . . . . . . . . . . . . . . . . . . . . . . . . . 6 60 1.3. Differences with RFC5320 . . . . . . . . . . . . . . . . . 7 61 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 8 62 3. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 10 63 4. Applicability Statement . . . . . . . . . . . . . . . . . . . 10 64 5. SEAL Specification . . . . . . . . . . . . . . . . . . . . . . 11 65 5.1. SEAL Tunnel Model . . . . . . . . . . . . . . . . . . . . 11 66 5.2. SEAL Model of Operation . . . . . . . . . . . . . . . . . 12 67 5.3. SEAL Header and Trailer Format . . . . . . . . . . . . . . 13 68 5.4. ITE Specification . . . . . . . . . . . . . . . . . . . . 15 69 5.4.1. Tunnel Interface MTU . . . . . . . . . . . . . . . . . 15 70 5.4.2. Tunnel Neighbor Soft State . . . . . . . . . . . . . . 16 71 5.4.3. SEAL Layer Pre-Processing . . . . . . . . . . . . . . 17 72 5.4.4. SEAL Encapsulation and Segmentation . . . . . . . . . 18 73 5.4.5. Outer Encapsulation . . . . . . . . . . . . . . . . . 20 74 5.4.6. 
Path Probing and ETE Reachability Verification . . . . 21 75 5.4.7. Processing ICMP Messages . . . . . . . . . . . . . . . 21 76 5.4.8. IPv4 Middlebox Reassembly Testing . . . . . . . . . . 22 77 5.4.9. Stateful MTU Determination . . . . . . . . . . . . . . 23 78 5.4.10. Detecting Path MTU Changes . . . . . . . . . . . . . . 24 79 5.5. ETE Specification . . . . . . . . . . . . . . . . . . . . 24 80 5.5.1. Reassembly Buffer Requirements . . . . . . . . . . . . 24 81 5.5.2. Tunnel Neighbor Soft State . . . . . . . . . . . . . . 24 82 5.5.3. IP-Layer Reassembly . . . . . . . . . . . . . . . . . 25 83 5.5.4. Decapsulation, SEAL-Layer Reassembly, and 84 Re-Encapsulation . . . . . . . . . . . . . . . . . . . 25 85 5.6. The SEAL Control Message Protocol (SCMP) . . . . . . . . . 26 86 5.6.1. Generating SCMP Error Messages . . . . . . . . . . . . 27 87 5.6.2. Processing SCMP Error Messages . . . . . . . . . . . . 29 88 6. Link Requirements . . . . . . . . . . . . . . . . . . . . . . 31 89 7. End System Requirements . . . . . . . . . . . . . . . . . . . 31 90 8. Router Requirements . . . . . . . . . . . . . . . . . . . . . 32 91 9. Nested Encapsulation Considerations . . . . . . . . . . . . . 32 92 10. Reliability Considerations . . . . . . . . . . . . . . . . . . 33 93 11. Integrity Considerations . . . . . . . . . . . . . . . . . . . 33 94 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 33 95 13. Security Considerations . . . . . . . . . . . . . . . . . . . 34 96 14. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 34 97 15. Implementation Status . . . . . . . . . . . . . . . . . . . . 35 98 16. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 35 99 17. References . . . . . . . . . . . . . . . . . . . . . . . . . . 36 100 17.1. Normative References . . . . . . . . . . . . . . . . . . . 36 101 17.2. Informative References . . . . . . . . . . . . . . . . . . 36 102 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 
40 104 1. Introduction 106 As Internet technology and communication has grown and matured, many 107 techniques have developed that use virtual topologies (manifested by 108 tunnels of one form or another) over an actual network that supports 109 the Internet Protocol (IP) [RFC0791][RFC2460]. Those virtual 110 topologies have elements that appear as one network layer hop, but 111 are actually multiple IP or sub-IP layer hops. These multiple hops 112 often have quite diverse properties that are often not even visible 113 to the endpoints of the virtual hop. This introduces failure modes 114 that are not dealt with well in current approaches. 116 The use of IP encapsulation (also known as "tunneling") has long been 117 considered as the means for creating such virtual topologies (e.g., 118 see [RFC2003][RFC2473]). However, the encapsulation headers often 119 include insufficiently provisioned per-packet identification values. 120 IP encapsulation also allows an attacker to produce encapsulated 121 packets with spoofed source addresses even if the source address in 122 the encapsulating header cannot be spoofed. A denial-of-service 123 vector that is not possible in non-tunneled subnetworks is therefore 124 presented. 126 Additionally, the insertion of an outer IP header reduces the 127 effective path MTU visible to the inner network layer. When IPv6 is 128 used as the encapsulation protocol, original sources expect to be 129 informed of the MTU limitation through IPv6 Path MTU discovery 130 (PMTUD) [RFC1981]. When IPv4 is used, this reduced MTU can be 131 accommodated through the use of IPv4 fragmentation, but unmitigated 132 in-the-network fragmentation has been found to be harmful through 133 operational experience and studies conducted over the course of many 134 years [FRAG][FOLK][RFC4963]. Additionally, classical IPv4 PMTUD 135 [RFC1191] has known operational issues that are exacerbated by in- 136 the-network tunnels [RFC2923][RFC4459]. 
138 The following subsections present further details on the motivation 139 and approach for addressing these issues. 141 1.1. Motivation 143 Before discussing the approach, it is necessary to first understand 144 the problems. In both the Internet and private-use networks today, 145 IP is ubiquitously deployed as the Layer 3 protocol. The primary 146 functions of IP are to provide for routing, addressing, and a 147 fragmentation and reassembly capability used to accommodate links 148 with diverse MTUs. While it is well known that the IP address space 149 is rapidly becoming depleted, there is also a growing awareness that 150 other IP protocol limitations have already or may soon become 151 problematic. 153 First, the Internet historically provided no means for discerning 154 whether the source addresses of IP packets are authentic. This 155 shortcoming is being addressed more and more through the deployment 156 of site border router ingress filters [RFC2827], however the use of 157 encapsulation provides a vector for an attacker to circumvent 158 filtering for the encapsulated packet even if filtering is correctly 159 applied to the encapsulation header. Secondly, the IP header does 160 not include a well-behaved identification value unless the source has 161 included a fragment header for IPv6 or unless the source permits 162 fragmentation for IPv4. These limitations preclude an efficient 163 means for routers to detect duplicate packets and packets that have 164 been re-ordered within the subnetwork. Additionally, recent studies 165 have shown that the arrival of fragments at high data rates can cause 166 denial-of-service (DoS) attacks on performance-sensitive networking 167 gear, prompting some administrators to configure their equipment to 168 drop fragments unconditionally [I-D.taylor-v6ops-fragdrop]. 
170 For IPv4 encapsulation, when fragmentation is permitted the header 171 includes a 16-bit Identification field, meaning that at most 2^16 172 unique packets with the same (source, destination, protocol)-tuple 173 can be active in the network at the same time [RFC6864]. (When 174 middleboxes such as Network Address Translators (NATs) re-write the 175 Identification field to random values, the number of unique packets 176 is even further reduced.) Due to the escalating deployment of high- 177 speed links, however, these numbers have become too small by several 178 orders of magnitude for high data rate packet sources such as tunnel 179 endpoints [RFC4963]. 181 Furthermore, there are many well-known limitations pertaining to IPv4 182 fragmentation and reassembly - even to the point that it has been 183 deemed "harmful" in both classic and modern-day studies (see above). 184 In particular, IPv4 fragmentation raises issues ranging from minor 185 annoyances (e.g., in-the-network router fragmentation [RFC1981]) to 186 the potential for major integrity issues (e.g., mis-association of 187 the fragments of multiple IP packets during reassembly [RFC4963]). 189 As a result of these perceived limitations, a fragmentation-avoiding 190 technique for discovering the MTU of the forward path from a source 191 to a destination node was devised through the deliberations of the 192 Path MTU Discovery Working Group (PMTUDWG) during the late 1980's 193 through early 1990's which resulted in the publication of [RFC1191]. 194 In this negative feedback-based method, the source node provides 195 explicit instructions to routers in the path to discard the packet 196 and return an ICMP error message if an MTU restriction is 197 encountered. However, this approach has several serious shortcomings 198 that lead to an overall "brittleness" [RFC2923]. 200 In particular, site border routers in the Internet have been known to 201 discard ICMP error messages coming from the outside world. 
This is 202 due in large part to the fact that malicious spoofing of error 203 messages in the Internet is trivial since there is no way to 204 authenticate the source of the messages [RFC5927]. Furthermore, when 205 a source node that requires ICMP error message feedback when a packet 206 is dropped due to an MTU restriction does not receive the messages, a 207 path MTU-related black hole occurs. This means that the source will 208 continue to send packets that are too large and never receive an 209 indication from the network that they are being discarded. This 210 behavior has been confirmed through documented studies showing clear 211 evidence of PMTUD failures for both IPv4 and IPv6 in the Internet 212 today [TBIT][WAND][SIGCOMM][RIPE]. 214 The issues with both IP fragmentation and this "classical" PMTUD 215 method are exacerbated further when IP tunneling is used [RFC4459]. 216 For example, an ingress tunnel endpoint (ITE) may be required to 217 forward encapsulated packets into the subnetwork on behalf of 218 hundreds, thousands, or even more original sources. If the ITE 219 allows IP fragmentation on the encapsulated packets, persistent 220 fragmentation could lead to undetected data corruption due to 221 Identification field wrapping and/or reassembly congestion at the 222 ETE. If the ITE instead uses classical IP PMTUD it must rely on ICMP 223 error messages coming from the subnetwork that may be suspect, 224 subject to loss due to filtering middleboxes, or insufficiently 225 provisioned for translation into error messages to be returned to the 226 original sources. 228 Although recent works have led to the development of a positive 229 feedback-based end-to-end MTU determination scheme [RFC4821], they do 230 not excuse tunnels from accounting for the encapsulation overhead 231 they add to packets. 
Moreover, in current practice existing 232 tunneling protocols mask the MTU issues by selecting a "lowest common 233 denominator" MTU that may be much smaller than necessary for most 234 paths and difficult to change at a later date. Therefore, a new 235 approach to accommodate tunnels over links with diverse MTUs is 236 necessary. 238 1.2. Approach 240 This document concerns subnetworks manifested through a virtual 241 topology configured over a connected network routing region and 242 bounded by encapsulating border nodes. Example connected network 243 routing regions include Mobile Ad hoc Networks (MANETs), enterprise 244 networks and the global public Internet itself. Subnetwork border 245 nodes forward unicast and multicast packets over the virtual topology 246 across multiple IP and/or sub-IP layer forwarding hops that may 247 introduce packet duplication and/or traverse links with diverse 248 Maximum Transmission Units (MTUs). 250 This document introduces a Subnetwork Encapsulation and Adaptation 251 Layer (SEAL) for tunneling inner network layer protocol packets over 252 IP subnetworks that connect Ingress and Egress Tunnel Endpoints 253 (ITEs/ETEs) of border nodes. It provides a modular specification 254 designed to be tailored to specific associated tunneling protocols. 255 (A transport-mode of operation is also possible, but out of scope for 256 this document.) 258 SEAL provides a mid-layer encapsulation that accommodates links with 259 diverse MTUs, and allows routers in the subnetwork to perform 260 efficient duplicate packet and packet reordering detection. The 261 encapsulation further ensures message origin authentication, packet 262 header integrity and anti-replay in environments in which these 263 functions are necessary. 265 SEAL treats tunnels that traverse the subnetwork as ordinary links 266 that must support network layer services. 
Moreover, SEAL provides 267 dynamic mechanisms (including limited segmentation and reassembly) to 268 ensure a maximal path MTU over the tunnel. This is in contrast to 269 static approaches which avoid MTU issues by selecting a lowest common 270 denominator MTU value that may be overly conservative for the vast 271 majority of tunnel paths and difficult to change even when larger 272 MTUs become available. 274 1.3. Differences with RFC5320 276 This specification of SEAL is descended from an experimental 277 independent RFC publication of the same name [RFC5320]. However, 278 this specification introduces a number of important differences from 279 the earlier publication. 281 First, this specification includes a protocol version field in the 282 SEAL header whereas [RFC5320] does not, and therefore cannot be 283 updated by future revisions. This specification therefore obsoletes 284 (i.e., does not update) [RFC5320]. 286 Secondly, [RFC5320] forms a 32-bit Identification value by 287 concatenating the 16-bit IPv4 Identification field with a 16-bit 288 Identification "extension" field in the SEAL header. This means that 289 [RFC5320] can only operate over IPv4 networks (since IPv6 headers do 290 not include a 16-bit Identification field) and that the SEAL Identification 291 value can be corrupted if the Identification in the outer IPv4 header 292 is rewritten. In contrast, this specification includes a 32-bit 293 Identification value that is independent of any identification fields 294 found in the inner or outer IP headers, and is therefore compatible 295 with any inner and outer IP protocol version combinations. 297 Additionally, the SEAL segmentation and reassembly procedures defined 298 in [RFC5320] differ significantly from those found in this 299 specification. 
In particular, this specification defines a 6-bit 300 Offset field that allows for smaller segment sizes when SEAL 301 segmentation is necessary (e.g., in order to observe the IPv4 minimum 302 MTU of 68 bytes). In contrast, [RFC5320] includes a 3-bit Segment 303 field and performs reassembly through concatenation of consecutive 304 segments. 306 The SEAL header in this specification also includes an optional 307 Integrity Check Vector (ICV) that can be used to digitally sign the 308 SEAL header and the leading portion of the encapsulated inner packet. 309 This allows for a lightweight integrity check and a loose message 310 origin authentication capability. The header further includes new 311 control bits as well as a link identification and encapsulation level 312 field for additional control capabilities. 314 Finally, this version of SEAL includes a new messaging protocol known 315 as the SEAL Control Message Protocol (SCMP), whereas [RFC5320] 316 performs signalling through the use of SEAL-encapsulated ICMP 317 messages. The use of SCMP allows SEAL-specific departures from ICMP, 318 as well as a control messaging capability that extends to other 319 specifications, including Virtual Enterprise Traversal (VET) 320 [I-D.templin-intarea-vet]. 322 2. Terminology 324 The following terms are defined within the scope of this document: 326 subnetwork 327 a virtual topology configured over a connected network routing 328 region and bounded by encapsulating border nodes. 330 IP 331 used to generically refer to either Internet Protocol (IP) 332 version, i.e., IPv4 or IPv6. 334 Ingress Tunnel Endpoint (ITE) 335 a virtual interface over which an encapsulating border node (host 336 or router) sends encapsulated packets into the subnetwork. 338 Egress Tunnel Endpoint (ETE) 339 a virtual interface over which an encapsulating border node (host 340 or router) receives encapsulated packets from the subnetwork. 
342 SEAL Path 343 a subnetwork path from an ITE to an ETE beginning with an 344 underlying link of the ITE as the first hop. Note that, if the 345 ITE's interface connection to the underlying link assigns multiple 346 IP addresses, each address represents a separate SEAL path. 348 inner packet 349 an unencapsulated network layer protocol packet (e.g., IPv4 350 [RFC0791], OSI/CLNP [RFC0994], IPv6 [RFC2460], etc.) before any 351 outer encapsulations are added. Internet protocol numbers that 352 identify inner packets are found in the IANA Internet Protocol 353 registry [RFC3232]. SEAL protocol packets that incur an 354 additional layer of SEAL encapsulation are also considered inner 355 packets. 357 outer IP packet 358 a packet resulting from adding an outer IP header (and possibly 359 other outer headers) to a SEAL-encapsulated inner packet. 361 packet-in-error 362 the leading portion of an invoking data packet encapsulated in the 363 body of an error control message (e.g., an ICMPv4 [RFC0792] error 364 message, an ICMPv6 [RFC4443] error message, etc.). 366 Packet Too Big (PTB) message 367 a control plane message indicating an MTU restriction (e.g., an 368 ICMPv6 "Packet Too Big" message [RFC4443], an ICMPv4 369 "Fragmentation Needed" message [RFC0792], etc.). 371 Don't Fragment (DF) bit 372 a bit that indicates whether the packet may be fragmented by the 373 network. The DF bit is explicitly included in the IPv4 header 374 [RFC0791] and may be set to '0' to allow fragmentation or '1' to 375 disallow further in-network fragmentation. The bit is absent from 376 the IPv6 header [RFC2460], but is implicitly set to '1' because 377 fragmentation can occur only at IPv6 sources. 
379 The following abbreviations correspond to terms used within this 380 document and/or elsewhere in common Internetworking nomenclature: 382 HLEN - the length of the SEAL header plus outer headers 384 ICV - Integrity Check Vector 386 MAC - Message Authentication Code 388 MTU - Maximum Transmission Unit 389 SCMP - the SEAL Control Message Protocol 391 SDU - SCMP Destination Unreachable message 393 SPP - SCMP Parameter Problem message 395 SPTB - SCMP Packet Too Big message 397 SEAL - Subnetwork Encapsulation and Adaptation Layer 399 TE - Tunnel Endpoint (i.e., either ingress or egress) 401 VET - Virtual Enterprise Traversal 403 3. Requirements 405 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 406 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 407 document are to be interpreted as described in [RFC2119]. When used 408 in lower case (e.g., must, must not, etc.), these words MUST NOT be 409 interpreted as described in [RFC2119], but are rather interpreted as 410 they would be in common English. 412 4. Applicability Statement 414 SEAL was originally motivated by the specific case of subnetwork 415 abstraction for Mobile Ad hoc Networks (MANETs), however the domain 416 of applicability also extends to subnetwork abstractions over 417 enterprise networks, ISP networks, SO/HO networks, the global public 418 Internet itself, and any other connected network routing region. 420 SEAL provides a network sublayer for encapsulation of an inner 421 network layer packet within outer encapsulating headers. SEAL can 422 also be used as a sublayer within a transport layer protocol data 423 payload, where transport layer encapsulation is typically used for 424 Network Address Translator (NAT) traversal as well as operation over 425 subnetworks that give preferential treatment to certain "core" 426 Internet protocols, e.g., TCP, UDP, etc. 
(However, note that TCP 427 encapsulation may not be appropriate for all use cases; particularly 428 those that require low delay and/or delay variance.) The SEAL header 429 is processed in a similar manner as for IPv6 extension headers, i.e., 430 it is not part of the outer IP header but rather allows for the 431 creation of an arbitrarily extensible chain of headers in the same 432 way that IPv6 does. 434 To accommodate MTU diversity, the Ingress Tunnel Endpoint (ITE) may 435 need to perform limited segmentation which the Egress Tunnel Endpoint 436 (ETE) reassembles. The ETE further acts as a passive observer that 437 informs the ITE of any packet size limitations. This allows the ITE 438 to return appropriate PMTUD feedback even if the network path between 439 the ITE and ETE filters ICMP messages. 441 SEAL further provides mechanisms to ensure message origin 442 authentication, packet header integrity, and anti-replay. The SEAL 443 framework is therefore similar to the IP Security (IPsec) 444 Authentication Header (AH) [RFC4301][RFC4302], however it provides 445 only minimal hop-by-hop authenticating services while leaving full 446 data integrity, authentication and confidentiality services as an 447 end-to-end consideration. 449 In many aspects, SEAL also very closely resembles the Generic Routing 450 Encapsulation (GRE) framework [RFC1701]. SEAL can therefore be 451 applied in the same use cases that are traditionally addressed by 452 GRE, but goes beyond GRE to also provide additional capabilities 453 (e.g., path MTU accommodation, message origin authentication, etc.) 454 as described in this document. 456 5. SEAL Specification 458 The following sections specify the operation of SEAL: 460 5.1. SEAL Tunnel Model 462 SEAL is an encapsulation sublayer used within point-to-point, point- 463 to-multipoint, and non-broadcast, multiple access (NBMA) tunnels. 464 Each SEAL path is configured over one or more underlying interfaces 465 attached to subnetwork links. 
The SEAL tunnel connects an ITE to one 466 or more ETE "neighbors" via encapsulation across an underlying 467 subnetwork, where the tunnel neighbor relationship may be either 468 unidirectional or bidirectional. 470 A unidirectional tunnel neighbor relationship allows the near end ITE 471 to send data packets forward to the far end ETE, while the ETE only 472 returns control messages when necessary. A bidirectional tunnel 473 neighbor relationship is one over which both TEs can exchange both 474 data and control messages. 476 Implications of the SEAL unidirectional and bidirectional models are 477 the same as discussed in [I-D.templin-intarea-vet]. 479 5.2. SEAL Model of Operation 481 SEAL-enabled ITEs encapsulate each inner packet in a SEAL header and 482 any outer header encapsulations as shown in Figure 1: 484 +--------------------+ 485 ~ outer IP header ~ 486 +--------------------+ 487 ~ other outer hdrs ~ 488 +--------------------+ 489 ~ SEAL Header ~ 490 +--------------------+ +--------------------+ 491 | | --> | | 492 ~ Inner ~ --> ~ Inner ~ 493 ~ Packet ~ --> ~ Packet ~ 494 | | --> | | 495 +--------------------+ +----------+---------+ 497 Figure 1: SEAL Encapsulation 499 The ITE inserts the SEAL header according to the specific tunneling 500 protocol. For simple encapsulation of an inner network layer packet 501 within an outer IP header, the ITE inserts the SEAL header following 502 the outer IP header and before the inner packet as: IP/SEAL/{inner 503 packet}. 505 For encapsulations over transports such as UDP, the ITE inserts the 506 SEAL header following the outer transport layer header and before the 507 inner packet, e.g., as IP/UDP/SEAL/{inner packet}. In that case, the 508 UDP header is seen as an "other outer header" as depicted in Figure 1 509 and the outer IP and transport layer headers are together seen as the 510 outer encapsulation headers. 512 SEAL supports both "nested" tunneling and "re-encapsulating" 513 tunneling. 
Nested tunneling occurs when a first tunnel is 514 encapsulated within a second tunnel, which may then further be 515 encapsulated within additional tunnels. Nested tunneling can be 516 useful, and stands in contrast to "recursive" tunneling which is an 517 anomalous condition incurred due to misconfiguration or a routing 518 loop. Considerations for nested tunneling and avoiding recursive 519 tunneling are discussed in Section 4 of [RFC2473]. 521 Re-encapsulating tunneling occurs when a packet arrives at a first 522 ETE, which then acts as an ITE to re-encapsulate and forward the 523 packet to a second ETE connected to the same subnetwork. In that 524 case each ITE/ETE transition represents a segment of a bridged path 525 between the ITE nearest the source and the ETE nearest the 526 destination. Considerations for re-encapsulating tunneling are 527 discussed in [I-D.templin-ironbis]. Combinations of nested and re- 528 encapsulating tunneling are also naturally supported by SEAL. 530 The SEAL ITE considers each underlying interface as the ingress 531 attachment point to a SEAL path to the ETE. The ITE therefore may 532 experience different path MTUs on different SEAL paths. 534 Finally, the SEAL ITE ensures that the inner network layer protocol 535 will see a minimum MTU of 1500 bytes over each SEAL path regardless 536 of the outer network layer protocol version, i.e., even if a small 537 amount of segmentation and reassembly are necessary. This is to 538 avoid path MTU "black holes" for the minimum MTU configured by the 539 vast majority of links in the Internet. Note that in some scenarios, 540 however, reassembly may place a heavy burden on the ETE. In that 541 case, the ITE should avoid invoking segmentation and instead report 542 an MTU smaller than 1500 bytes to the original source. 544 5.3. 
SEAL Header and Trailer Format 546 The SEAL header is formatted as follows: 548 0 1 2 3 549 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 550 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 551 |VER|C|A|I|V|R|RES|M| Offset | NEXTHDR | LINK_ID |LEVEL| 552 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 553 | Identification (optional) | 554 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 555 | Integrity Check Vector (optional) | 556 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... 558 Figure 2: SEAL Header Format 560 VER (2) 561 a 2-bit version field. This document specifies Version 0 of the 562 SEAL protocol, i.e., the VER field encodes the value 0. 564 C (1) 565 the "Control/Data" bit. Set to 1 by the ITE in SEAL Control 566 Message Protocol (SCMP) control messages, and set to 0 in ordinary 567 data packets. 569 A (1) 570 the "Acknowledgement Requested" bit. Set to 1 by the ITE in SEAL 571 data packets for which it wishes to receive an explicit 572 acknowledgement from the ETE. 574 I (1) 575 the "Identification Included" bit. 577 V (1) 578 the "Integrity Check Vector included" bit. 580 R (1) 581 the "Redirects Permitted" bit (reserved for use by VET: 582 [I-D.templin-intarea-vet]). 584 RES (2) a 2-bit reserved field. 586 M (1) the "More Segments" bit. Set to 1 in a non-final segment and 587 set to 0 in the final segment of the SEAL packet. 589 Offset (6) a 6-bit Offset field. Set to 0 in the first segment of a 590 segmented SEAL packet. Set to an integral number of 32 byte 591 blocks in subsequent segments (e.g., an Offset of 10 indicates a 592 block that begins at the 320th byte in the packet). 594 NEXTHDR (8) an 8-bit field that encodes the next header Internet 595 Protocol number the same as for the IPv4 protocol and IPv6 next 596 header fields. 

   LINK_ID (5)
      a 5-bit link identification value, set to a unique value by the
      ITE for each SEAL path over which it will send encapsulated
      packets to the ETE (up to 32 SEAL paths per ETE are therefore
      supported). Note that, if the ITE's interface connection to the
      underlying link assigns multiple IP addresses, each address
      represents a separate SEAL path that must be assigned a separate
      LINK_ID.

   LEVEL (3)
      a 3-bit nesting level; used to limit the number of tunnel nesting
      levels. Set to an integer value up to 7 in the innermost SEAL
      encapsulation, and decremented by 1 for each successive additional
      SEAL encapsulation nesting level. Up to 8 levels of nesting are
      therefore supported.

   Identification (32)
      an optional 32-bit per-packet identification field; present when
      I==1. Set to a 32-bit value (beginning with 0) that is
      monotonically-incremented for each SEAL packet transmitted to this
      ETE.

   Integrity Check Vector (ICV) (variable)
      an optional variable-length integrity check vector field; present
      when V==1.

5.4.  ITE Specification

5.4.1.  Tunnel Interface MTU

   The tunnel interface must present a constant MTU value to the inner
   network layer as the size for admission of inner packets into the
   interface. Since NBMA tunnel virtual interfaces may support a large
   set of SEAL paths that accept widely varying maximum packet sizes,
   however, a number of factors should be taken into consideration when
   selecting a tunnel interface MTU.

   Due to the ubiquitous deployment of standard Ethernet and similar
   networking gear, the nominal Internet cell size has become 1500
   bytes; this is the de facto size that end systems have come to expect
   will either be delivered by the network without loss due to an MTU
   restriction on the path or a suitable ICMP Packet Too Big (PTB)
   message returned.
   When large packets sent by end systems incur
   additional encapsulation at an ITE, however, they may be dropped
   silently within the tunnel since the network may not always deliver
   the necessary PTBs [RFC2923]. The ITE SHOULD therefore set a tunnel
   interface MTU of at least 1500 bytes.

   The inner network layer protocol consults the tunnel interface MTU
   when admitting a packet into the interface. For non-SEAL inner IPv4
   packets with the IPv4 Don't Fragment (DF) bit cleared (i.e., DF==0),
   if the packet is larger than the tunnel interface MTU the inner IPv4
   layer uses IPv4 fragmentation to break the packet into fragments no
   larger than the tunnel interface MTU. The ITE then admits each
   fragment into the interface as an independent packet.

   For all other inner packets, the inner network layer admits the
   packet if it is no larger than the tunnel interface MTU; otherwise,
   it drops the packet and sends a PTB error message to the source with
   the MTU value set to the tunnel interface MTU. The message contains
   as much of the invoking packet as possible without the entire message
   exceeding the network layer minimum MTU size.

   The ITE can alternatively set an indefinite MTU on the tunnel
   interface such that all inner packets are admitted into the interface
   regardless of their size. For ITEs that host applications that use
   the tunnel interface directly, this option must be carefully
   coordinated with protocol stack upper layers since some upper layer
   protocols (e.g., TCP) derive their packet sizing parameters from the
   MTU of the outgoing interface and as such may select too large an
   initial size.
   This is not a problem for upper layers that use
   conservative initial maximum segment size estimates and/or when the
   tunnel interface can reduce the upper layer's maximum segment size,
   e.g., by reducing the size advertised in the MSS option of outgoing
   TCP messages (sometimes known as "MSS clamping").

   In light of the above considerations, the ITE SHOULD configure an
   indefinite MTU on tunnel *router* interfaces so that SEAL performs
   all subnetwork adaptation from within the interface as specified in
   Section 5.4.3. The ITE can instead set a smaller MTU on tunnel
   *host* interfaces, e.g., the maximum of 1500 bytes and the smallest
   MTU among all of the underlying links minus the size of the
   encapsulation headers.

5.4.2.  Tunnel Neighbor Soft State

   The tunnel virtual interface maintains a number of soft state
   variables for each ETE and for each SEAL path.

   When per-packet identification is required, the ITE maintains a per
   ETE window of Identification values for the packets it has recently
   sent to this ETE. The ITE then sets a variable "USE_ID" to TRUE, and
   includes an Identification in each packet it sends to this ETE;
   otherwise, it sets USE_ID to FALSE.

   When message origin authentication and integrity checking is
   required, the ITE also includes an ICV in the packets it sends to the
   ETE. The ICV format is shown in Figure 3:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|Key|Algorithm|       Message Authentication Code (MAC)       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ...

              Figure 3: Integrity Check Vector (ICV) Format

   As shown in the figure, the ICV begins with a 1-octet control field
   with a 1-bit (F)lag, a 2-bit Key identifier and a 5-bit Algorithm
   identifier. The control octet is followed by a variable-length
   Message Authentication Code (MAC).
   The ITE maintains a per ETE
   algorithm and secret key to calculate the MAC in each packet it will
   send to this ETE. (By default, the ITE sets the F bit and Algorithm
   fields to 0 to indicate use of the HMAC-SHA-1 algorithm with a 160
   bit shared secret key to calculate an 80 bit MAC per [RFC2104] over
   the leading 128 bytes of the packet. Other values for F and
   Algorithm are out of scope.) The ITE then sets a variable "USE_ICV"
   to TRUE, and includes an ICV in each packet it sends to this ETE;
   otherwise, it sets USE_ICV to FALSE.

   For each SEAL path, the ITE must also account for encapsulation
   header lengths. The ITE therefore maintains the per SEAL path
   constant values "SHLEN" set to the length of the SEAL header, "THLEN"
   set to the length of the outer encapsulating transport layer headers
   (or 0 if outer transport layer encapsulation is not used), "IHLEN"
   set to the length of the outer IP layer header, and "HLEN" set to
   (SHLEN+THLEN+IHLEN). (The ITE must include the length of the
   uncompressed headers even if header compression is enabled when
   calculating these lengths.) In addition, the ITE maintains a per
   SEAL path variable "MAXMTU" initialized to the maximum of 1500 bytes
   and the MTU of the underlying link minus HLEN.

   The ITE further sets a variable 'MINMTU' to the minimum MTU for the
   SEAL path over which encapsulated packets will travel. For IPv6
   paths the ITE sets MINMTU=1280 (see: [RFC2460]) and for IPv4 paths
   the ITE sets MINMTU=576 even though the true MINMTU for IPv4 is only
   68 bytes (see: [RFC0791]).

   The ITE can also set MINMTU to a larger value if there is reason to
   believe that the minimum path MTU is larger, or to a smaller value if
   there is reason to believe the MTU is smaller, e.g., if there may be
   additional encapsulations on the path.
   If this value proves too
   large, the ITE will receive PTB message feedback either from the ETE
   or from a router on the path and will be able to reduce its MINMTU to
   a smaller value.

   The ITE may instead maintain the packet sizing variables and
   constants as per ETE (rather than per SEAL path) values. In that
   case, the values reflect the lowest-common-denominator size across
   all of the SEAL paths associated with this ETE.

5.4.3.  SEAL Layer Pre-Processing

   The SEAL layer is logically positioned between the inner and outer
   network protocol layers, where the inner layer is seen as the (true)
   network layer and the outer layer is seen as the (virtual) data link
   layer. Each packet to be processed by the SEAL layer is either
   admitted into the tunnel interface by the inner network layer
   protocol as described in Section 5.4.1 or is undergoing
   re-encapsulation from within the tunnel interface. The SEAL layer sees
   the former class of packets as inner packets that include inner
   network and transport layer headers, and sees the latter class of
   packets as transitional SEAL packets that include the outer and SEAL
   layer headers that were inserted by the previous hop SEAL ITE. For
   these transitional packets, the SEAL layer re-encapsulates the packet
   with new outer and SEAL layer headers when it forwards the packet to
   the next hop SEAL ITE.

   We now discuss the SEAL layer pre-processing actions for these two
   classes of packets.

5.4.3.1.  Inner Packet Pre-Processing

   For each inner packet admitted into the tunnel interface, if the
   packet is itself a SEAL packet (i.e., one with the port number for
   SEAL in the transport layer header or one with the protocol number
   for SEAL in the IP layer header) and the LEVEL field of the SEAL
   header contains the value 0, the ITE silently discards the packet.
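
   As an illustration of the LEVEL check above, the following Python
   sketch packs and parses the first 32-bit word of the SEAL header
   laid out in Figure 2 (Section 5.3) and applies the discard rule.
   It is a non-normative sketch: the names SealWord, pack_word,
   unpack_word and must_discard_inner_seal are hypothetical, and the
   code assumes the usual convention that bit 0 in the figure is the
   most significant bit in network byte order.

```python
import struct
from dataclasses import dataclass

# (field name, width in bits) in wire order, per Figure 2.
# Assumption: bit 0 of the figure is the most significant bit.
FIELDS = [("ver", 2), ("c", 1), ("a", 1), ("i", 1), ("v", 1), ("r", 1),
          ("res", 2), ("m", 1), ("offset", 6), ("nexthdr", 8),
          ("link_id", 5), ("level", 3)]


@dataclass
class SealWord:
    """First 32-bit word of a SEAL header (hypothetical container)."""
    ver: int = 0
    c: int = 0
    a: int = 0
    i: int = 0
    v: int = 0
    r: int = 0
    res: int = 0
    m: int = 0
    offset: int = 0
    nexthdr: int = 0
    link_id: int = 0
    level: int = 0


def pack_word(hdr: SealWord) -> bytes:
    """Encode the fixed header word as 4 bytes in network byte order."""
    word = 0
    for name, width in FIELDS:
        value = getattr(hdr, name)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return struct.pack("!I", word)


def unpack_word(data: bytes) -> SealWord:
    """Decode the fixed header word from the first 4 bytes of data."""
    (word,) = struct.unpack("!I", data[:4])
    values = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return SealWord(**values)


def must_discard_inner_seal(hdr: SealWord) -> bool:
    """Section 5.4.3.1: an inner SEAL packet with LEVEL == 0 is dropped."""
    return hdr.level == 0
```

   A caller would first establish that the inner packet is a SEAL
   packet (by port or protocol number) and only then apply
   must_discard_inner_seal() to the parsed header word.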

   Otherwise, for non-SEAL IPv4 inner packets with DF==0 in the IP
   header and IPv6 inner packets with a fragment header and with (MF=0;
   Offset=0), if the packet is larger than (MINMTU-HLEN) the ITE uses IP
   fragmentation to fragment the packet into N roughly equal-length
   pieces, where N is minimized and each fragment is significantly
   smaller than (MINMTU-HLEN) to allow for additional encapsulations in
   the path. The ITE then submits each fragment for SEAL encapsulation
   as specified in Section 5.4.4.

   For all other inner packets, if the packet is no larger than MAXMTU
   for the corresponding SEAL path the ITE submits it for SEAL
   encapsulation as specified in Section 5.4.4. Otherwise, the ITE
   drops the packet and sends an ordinary PTB message appropriate to the
   inner protocol version (subject to rate limiting) with the MTU field
   set to MAXMTU. (For IPv4 SEAL packets with DF==0, the ITE should set
   DF=1 and re-calculate the IPv4 header checksum before generating the
   PTB message in order to avoid bogon filters.) After sending the PTB
   message, the ITE discards the inner packet.

5.4.3.2.  Transitional SEAL Packet Pre-Processing

   For each transitional packet that is to be processed by the SEAL
   layer from within the tunnel interface, the ITE sets aside the SEAL
   encapsulation headers that were received from the previous hop.
   Next, if the packet is no larger than MAXMTU for the next hop SEAL
   path the ITE submits it for SEAL encapsulation as specified in
   Section 5.4.4. Otherwise, the ITE drops the packet and sends an SCMP
   Packet Too Big (SPTB) message to the previous hop subject to rate
   limiting (see: Section 5.6.1.1) with the MTU field set to MAXMTU.
   After sending the SPTB message, the ITE discards the packet.

5.4.4.  SEAL Encapsulation and Segmentation

   For each inner packet/fragment submitted for SEAL encapsulation, the
   ITE next encapsulates the packet in a SEAL header formatted as
   specified in Section 5.3. The SEAL header includes an Identification
   field when USE_ID is TRUE, followed by an ICV field when USE_ICV is
   TRUE.

   The ITE next sets C=0 and RES=0 in the SEAL header. The ITE also
   sets A=1 if ETE reachability determination is necessary (see: Section
   5.4.6) or for stateful MTU determination (see Section 5.4.9).
   Otherwise, the ITE sets A=0.

   The ITE then sets LINK_ID to the value assigned to the underlying
   SEAL path, and sets NEXTHDR to the protocol number corresponding to
   the address family of the encapsulated inner packet. For example,
   the ITE sets NEXTHDR to the value '4' for encapsulated IPv4 packets
   [RFC2003], '41' for encapsulated IPv6 packets [RFC2473][RFC4213],
   '80' for encapsulated OSI/CLNP packets [RFC1070], etc.

   Next, if the inner packet is not itself a SEAL packet the ITE sets
   LEVEL to an integer value between 0 and 7 as a specification of the
   number of additional layers of nested SEAL encapsulations permitted.
   If the inner packet is a SEAL packet that is undergoing nested
   encapsulation, the ITE instead sets LEVEL to the value that appears
   in the inner packet's SEAL header minus 1. If the inner packet is
   undergoing SEAL re-encapsulation, the ITE instead copies the LEVEL
   value from the SEAL header of the packet to be re-encapsulated.

   Next, if the inner packet is no larger than (MINMTU-HLEN) or larger
   than 1500, the ITE sets (M=0; Offset=0).
   Otherwise, the ITE breaks
   the inner packet into N roughly equal-length non-overlapping
   segments (where N is minimized and each segment is significantly
   smaller than (MINMTU-HLEN) to allow for additional encapsulations in
   the path) then appends a clone of the SEAL header from the first
   segment onto the head of each additional segment. The ITE MUST also
   include an Identification field and set USE_ID=TRUE for each segment.
   The ITE then sets (M=1; Offset=0) in the first segment, sets (M=0/1;
   Offset=O(1)) in the second segment, sets (M=0/1; Offset=O(2)) in the
   third segment (if needed), etc., then finally sets (M=0; Offset=O(n))
   in the final segment (where O(i) is the number of 32 byte blocks that
   preceded this segment).

   When USE_ID is FALSE, the ITE next sets I=0. Otherwise, the ITE sets
   I=1 and writes a monotonically-incrementing integer value for this
   ETE in the Identification field beginning with 0 in the first packet
   transmitted. (For SEAL packets that have been split into multiple
   pieces, the ITE writes the same Identification value in each piece.)
   The monotonically-incrementing requirement is to satisfy ETEs that
   use this value for anti-replay purposes. The value is incremented
   modulo 2^32, i.e., it wraps back to 0 when the previous value was
   (2^32 - 1).

   When USE_ICV is FALSE, the ITE next sets V=0. Otherwise, the ITE
   sets V=1, includes an ICV and calculates the MAC using HMAC-SHA-1
   with a 160 bit secret key and 80 bit MAC field. Beginning with the
   SEAL header, the ITE sets the ICV field to 0, calculates the MAC over
   the leading 128 bytes of the packet (or up to the end of the packet
   if there are fewer than 128 bytes) and places the result in the MAC
   field. (For SEAL packets that have been split into multiple pieces,
   each piece calculates its own MAC.)
   The ITE then writes the value 0
   in the F flag and 0x00 in the Algorithm field of the ICV control
   octet (other values for these fields, and other MAC calculation
   disciplines, are outside the scope of this document and may be
   specified in future documents).

   The ITE then adds the outer encapsulating headers as specified in
   Section 5.4.5.

5.4.5.  Outer Encapsulation

   Following SEAL encapsulation, the ITE next encapsulates each segment
   in the requisite outer transport (when necessary) and IP layer
   headers. When a transport layer header such as UDP or TCP is
   included, the ITE writes the port number for SEAL in the transport
   destination service port field.

   When UDP encapsulation is used, the ITE sets the UDP checksum field
   to zero for IPv4 packets and also sets the UDP checksum field to zero
   for IPv6 packets even though IPv6 generally requires UDP checksums.
   Further considerations for setting the UDP checksum field for IPv6
   packets are discussed in [RFC6935][RFC6936].

   The ITE then sets the outer IP layer headers the same as specified
   for ordinary IP encapsulation (e.g., [RFC1070], [RFC2003], [RFC2473],
   [RFC4213], etc.) except that for ordinary SEAL packets the ITE copies
   the "TTL/Hop Limit", "Type of Service/Traffic Class" and "Congestion
   Experienced" values in the inner network layer header into the
   corresponding fields in the outer IP header. For transitional SEAL
   packets undergoing re-encapsulation, the ITE instead copies the
   "TTL/Hop Limit", "Type of Service/Traffic Class" and "Congestion
   Experienced" values in the outer IP header of the received packet
   into the corresponding fields in the outer IP header of the packet to
   be forwarded (i.e., the values are transferred between outer headers
   and *not* copied from the inner network layer header).
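
   The default ICV calculation described in Section 5.4.4 (HMAC-SHA-1
   with a 160 bit key, an 80 bit MAC, and coverage of the leading 128
   bytes beginning with the SEAL header) can be sketched in Python as
   follows. This is a non-normative sketch under stated assumptions:
   the function names add_icv and verify_icv are hypothetical, the MAC
   is assumed to follow the control octet directly, the 80 bit MAC is
   taken as the leftmost bits of the HMAC output as [RFC2104]
   describes for truncation, and the Key identifier is left at 0.

```python
import hashlib
import hmac

MAC_LEN = 10            # 80-bit MAC
ICV_LEN = 1 + MAC_LEN   # control octet (F, Key, Algorithm) + MAC
COVERAGE = 128          # leading bytes of the packet covered by the MAC


def add_icv(key: bytes, packet: bytes, icv_offset: int) -> bytes:
    """Return a copy of packet with the ICV field filled in.

    packet begins with the SEAL header; icv_offset is the index of the
    ICV control octet (assumed 4 when I==0 and 8 when I==1).
    """
    buf = bytearray(packet)
    # Zero the whole ICV field; this also leaves F=0 and Algorithm=0.
    buf[icv_offset:icv_offset + ICV_LEN] = bytes(ICV_LEN)
    digest = hmac.new(key, bytes(buf[:COVERAGE]), hashlib.sha1).digest()
    buf[icv_offset + 1:icv_offset + ICV_LEN] = digest[:MAC_LEN]
    return bytes(buf)


def verify_icv(key: bytes, packet: bytes, icv_offset: int) -> bool:
    """Recompute the MAC over the zeroed ICV field and compare."""
    received = packet[icv_offset + 1:icv_offset + ICV_LEN]
    expected = add_icv(key, packet, icv_offset)
    return hmac.compare_digest(
        received, expected[icv_offset + 1:icv_offset + ICV_LEN])
```

   Because only the leading 128 bytes are covered, a change beyond
   that point does not affect the MAC; the sketch reflects that
   property rather than hiding it.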

   The ITE also sets the IP protocol number to the appropriate value for
   the first protocol layer within the encapsulation (e.g., UDP, TCP,
   SEAL, etc.). When IPv6 is used as the outer IP protocol, the ITE
   then sets the flow label value in the outer IPv6 header the same as
   described in [RFC6438]. When IPv4 is used as the outer IP protocol,
   the ITE instead sets DF=0 in the IPv4 header to allow the packet to
   be fragmented if it encounters a restricting link (for IPv6 SEAL
   paths, the DF bit is implicitly set to 1).

   The ITE finally sends each outer packet via the underlying link
   corresponding to LINK_ID.

5.4.6.  Path Probing and ETE Reachability Verification

   All SEAL data packets sent by the ITE are considered implicit probes.
   SEAL data packets will elicit an SCMP message from the ETE if it
   needs to acknowledge a probe and/or report an error condition. SEAL
   data packets may also be dropped by either the ETE or a router on the
   path, which may or may not result in an ICMP message being returned
   to the ITE.

   The ITE processes ICMP messages as specified in Section 5.4.7.

   The ITE processes SCMP messages as specified in Section 5.6.2.

5.4.7.  Processing ICMP Messages

   When the ITE sends SEAL packets, it may receive ICMP error messages
   [RFC0792][RFC4443] from an ordinary router within the subnetwork.
   Each ICMP message includes an outer IP header, followed by an ICMP
   header, followed by a portion of the SEAL data packet that generated
   the error (also known as the "packet-in-error") beginning with the
   outer IP header.

   The ITE should process ICMPv4 Protocol Unreachable messages and
   ICMPv6 Parameter Problem messages with Code "Unrecognized Next Header
   type encountered" as a hint that the IP destination address does not
   implement SEAL.
   The ITE can optionally ignore ICMP messages that do
   not include sufficient information in the packet-in-error, or process
   them as a hint that the SEAL path may be failing.

   For other ICMP messages, the ITE should use any outer header
   information available as a first-pass authentication filter (e.g., to
   determine if the source of the message is within the same
   administrative domain as the ITE) and discard the message if
   first-pass filtering fails.

   Next, the ITE examines the packet-in-error beginning with the SEAL
   header. If the value in the Identification field (if present) is not
   within the window of packets the ITE has recently sent to this ETE,
   or if the MAC value in the SEAL header ICV field (if present) is
   incorrect, the ITE discards the message.

   Next, if the received ICMP message is a PTB the ITE sets the
   temporary variable "PMTU" for this SEAL path to the MTU value in the
   PTB message. If PMTU==0, the ITE consults a plateau table (e.g., as
   described in [RFC1191]) to determine PMTU based on the length field
   in the outer IP header of the packet-in-error. For example, if the
   ITE receives a PTB message with MTU==0 and length 4KB, it can set
   PMTU=2KB. If the ITE subsequently receives a PTB message with MTU==0
   and length 2KB, it can set PMTU=1792, etc. to a minimum value of
   PMTU=(1500+HLEN). If the ITE is performing stateful MTU
   determination for this SEAL path (see Section 5.4.9), the ITE next
   sets MAXMTU=MAX((PMTU-HLEN), 1500).

   If the ICMP message was not discarded, the ITE then transcribes it
   into a message to return to the previous hop. If the inner packet
   was a SEAL data packet, the ITE transcribes the ICMP message into an
   SCMP message. Otherwise, the ITE transcribes the ICMP message into a
   message appropriate for the inner protocol version.
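
   The PTB handling above, together with the per SEAL path sizing
   variables of Section 5.4.2, can be sketched in Python as follows.
   This is a non-normative sketch: the SealPath class and the
   functions next_plateau and process_ptb are hypothetical names, and
   the three-entry plateau table simply mirrors the 4KB, 2KB, 1792
   example in the text rather than reproducing the table in [RFC1191].

```python
from dataclasses import dataclass, field

# Hypothetical plateau table, descending, mirroring the text's example.
PLATEAUS = (4096, 2048, 1792)


@dataclass
class SealPath:
    """Per SEAL path sizing state (Section 5.4.2)."""
    shlen: int               # SEAL header length
    thlen: int               # outer transport header length (0 if none)
    ihlen: int               # outer IP header length
    link_mtu: int            # MTU of the underlying link
    stateful: bool = False   # stateful MTU determination enabled?
    hlen: int = field(init=False)
    maxmtu: int = field(init=False)

    def __post_init__(self):
        self.hlen = self.shlen + self.thlen + self.ihlen
        self.maxmtu = max(1500, self.link_mtu - self.hlen)


def next_plateau(length: int, hlen: int) -> int:
    """Pick the next plateau below length, never below 1500 + HLEN."""
    floor = 1500 + hlen
    for plateau in PLATEAUS:
        if plateau < length:
            return max(plateau, floor)
    return floor


def process_ptb(path: SealPath, ptb_mtu: int, pie_length: int) -> int:
    """Derive PMTU from a PTB message and update MAXMTU if stateful.

    ptb_mtu is the MTU field of the PTB; pie_length is the length
    field in the outer IP header of the packet-in-error.
    """
    pmtu = ptb_mtu if ptb_mtu != 0 else next_plateau(pie_length, path.hlen)
    if path.stateful:
        path.maxmtu = max(pmtu - path.hlen, 1500)
    return pmtu
```

   MAXMTU never drops below 1500 in this sketch, which matches the
   MAX((PMTU-HLEN), 1500) rule; sizes below that are handled by
   fragmentation or segmentation rather than by shrinking MAXMTU.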

   To transcribe the message, the ITE extracts the inner packet from
   within the ICMP message packet-in-error field and uses it to generate
   a new message corresponding to the type of the received ICMP message.
   For SCMP messages, the ITE generates the message the same as
   described for ETE generation of SCMP messages in Section 5.6.1. For
   (S)PTB messages, the ITE writes (PMTU-HLEN) in the MTU field.

   The ITE finally forwards the transcribed message to the previous hop
   toward the inner source address.

5.4.8.  IPv4 Middlebox Reassembly Testing

   The ITE can perform a qualification exchange to ensure that the
   subnetwork correctly delivers fragments to the ETE. This procedure
   can be used, e.g., to determine whether there are middleboxes on the
   path that violate the [RFC1812], Section 5.2.6 requirement that: "A
   router MUST NOT reassemble any datagram before forwarding it".

   The ITE should use knowledge of its topological arrangement as an aid
   in determining when middlebox reassembly testing is necessary. For
   example, if the ITE is aware that the ETE is located somewhere in the
   public Internet, middlebox reassembly testing should not be
   necessary. If the ITE is aware that the ETE is located behind a NAT
   or a firewall, however, then reassembly testing can be used to detect
   middleboxes that do not conform to specifications.

   The ITE can perform a middlebox reassembly test by selecting a data
   packet to be used as a probe. While performing the test with real
   data packets, the ITE should select only inner packets that are no
   larger than (1500-HLEN) bytes for testing purposes. The ITE can also
   construct an explicit probe packet instead of using ordinary SEAL
   data packets.

   To generate an explicit probe packet, the ITE creates a packet buffer
   beginning with the same outer headers, SEAL header and inner network
   layer header that would appear in an ordinary data packet, then pads
   the packet with random data to a length that is at least 128 bytes
   but no longer than (1500-HLEN) bytes. The ITE then writes the value
   '0' in the inner network layer TTL (for IPv4) or Hop Limit (for IPv6)
   field.

   The ITE then sets C=0 in the SEAL header of the probe packet and sets
   the NEXTHDR field to the inner network layer protocol type. (The ITE
   may also set A=1 if it requires a positive acknowledgement;
   otherwise, it sets A=0.) Next, the ITE sets LINK_ID and LEVEL to the
   appropriate values for this SEAL path, sets Identification and I=1
   (when USE_ID is TRUE), then finally calculates the ICV and sets V=1
   (when USE_ICV is TRUE).

   The ITE then encapsulates the probe packet in the appropriate outer
   headers, splits it into two outer IPv4 fragments, then sends both
   fragments over the same SEAL path.

   The ITE should send a series of probe packets (e.g., 3-5 probes with
   1-second intervals between probes) instead of a single isolated probe
   in case of packet loss. If the ETE returns an SCMP PTB message with
   MTU != 0, then the SEAL path correctly supports fragmentation;
   otherwise, the ITE enables stateful MTU determination for this SEAL
   path as specified in Section 5.4.9.

   (Examples of middleboxes that may perform reassembly include stateful
   NATs and firewalls. Such devices could still allow for stateless MTU
   determination if they gather the fragments of a fragmented IPv4 SEAL
   data packet for packet analysis purposes but then forward the
   fragments on to the final destination rather than forwarding the
   reassembled packet.)

5.4.9.  Stateful MTU Determination

   SEAL supports a stateless MTU determination capability; however, the
   ITE may in some instances wish to impose a stateful MTU limit on a
   particular SEAL path. For example, when the ETE is situated behind a
   middlebox that performs IPv4 reassembly (see: Section 5.4.8) it is
   imperative that fragmentation be avoided. In other instances (e.g.,
   when the SEAL path includes performance-constrained links), the ITE
   may deem it necessary to cache a conservative static MTU in order to
   avoid sending large packets that would only be dropped due to an MTU
   restriction somewhere on the path.

   To determine a static MTU value, the ITE sends a series of probe
   packets of various sizes to the ETE with A=1 in the SEAL header and
   DF=1 in the outer IP header. The ITE then caches the size 'S' of the
   largest packet for which it receives a probe reply from the ETE by
   setting MAXMTU=MAX((S-HLEN), 1500) for this SEAL path.

   For example, the ITE could send probe packets of 4KB, followed by
   2KB, followed by 1792 bytes, etc. While probing, the ITE processes
   any ICMP PTB message it receives as a potential indication of probe
   failure then discards the message.

5.4.10.  Detecting Path MTU Changes

   When stateful MTU determination is used, the ITE SHOULD periodically
   reset MAXMTU and/or re-probe the path to determine whether MAXMTU has
   increased. If the path still has a too-small MTU, the ITE will
   receive a PTB message that reports a smaller size.

5.5.  ETE Specification

5.5.1.  Reassembly Buffer Requirements

   For IPv6, the ETE configures a reassembly buffer size of (1500 +
   HLEN) bytes for the reassembly of outer IPv6 packets, i.e., even
   though the true minimum reassembly size for IPv6 is only 1500 bytes
   [RFC2460].
   For IPv4, the ETE also configures a reassembly buffer
   size of (1500 + HLEN) bytes for the reassembly of outer IPv4 packets,
   i.e., even though the true minimum reassembly size for IPv4 is only
   576 bytes [RFC1122].

   In addition to this outer reassembly buffer requirement, the ETE
   further configures a SEAL reassembly buffer size of (1500 + HLEN)
   bytes for the reassembly of segmented SEAL packets (see: Section
   5.5.4).

5.5.2.  Tunnel Neighbor Soft State

   When message origin authentication and integrity checking is
   required, the ETE maintains a per-ITE MAC calculation algorithm and a
   symmetric secret key to verify the MAC. When per-packet
   identification is required, the ETE also maintains a window of
   Identification values for the packets it has recently received from
   this ITE.

   When the tunnel neighbor relationship is bidirectional, the ETE
   further maintains a per SEAL path mapping of outer IP and transport
   layer addresses to the LINK_ID that appears in packets received from
   the ITE.

5.5.3.  IP-Layer Reassembly

   The ETE reassembles fragmented IP packets that are explicitly
   addressed to itself. For IP fragments that are received via a SEAL
   tunnel, the ETE SHOULD maintain conservative reassembly cache high-
   and low-water marks. When the size of the reassembly cache exceeds
   this high-water mark, the ETE SHOULD actively discard stale
   incomplete reassemblies (e.g., using an Active Queue Management (AQM)
   strategy) until the size falls below the low-water mark. The ETE
   SHOULD also actively discard any pending reassemblies that clearly
   have no opportunity for completion, e.g., when a considerable number
   of new fragments have arrived before a fragment that completes a
   pending reassembly arrives.
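
   The high- and low-water mark behavior described above can be
   sketched in Python as follows. This is a non-normative sketch: the
   ReassemblyCache class is a hypothetical name, and discarding the
   oldest pending reassembly first is only one possible reading of
   "stale"; the text leaves the discard strategy open.

```python
from collections import OrderedDict


class ReassemblyCache:
    """Pending reassemblies bounded by high- and low-water marks."""

    def __init__(self, high_water: int, low_water: int):
        if low_water > high_water:
            raise ValueError("low-water mark exceeds high-water mark")
        self.high_water = high_water
        self.low_water = low_water
        self.size = 0                   # total bytes currently cached
        self.pending = OrderedDict()    # key -> fragments, oldest key first

    def add_fragment(self, key, fragment: bytes) -> None:
        """Cache a fragment, trimming once the high-water mark is passed."""
        self.pending.setdefault(key, []).append(fragment)
        self.size += len(fragment)
        if self.size > self.high_water:
            self._trim()

    def _trim(self) -> None:
        """Drop oldest reassemblies until size is below the low-water mark."""
        while self.pending and self.size >= self.low_water:
            _, fragments = self.pending.popitem(last=False)
            self.size -= sum(len(f) for f in fragments)

    def complete(self, key) -> list:
        """Remove and return the fragments of a finished reassembly."""
        fragments = self.pending.pop(key)
        self.size -= sum(len(f) for f in fragments)
        return fragments
```

   A key here would typically combine the outer source address,
   destination address, protocol and IP identification value.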

   The ETE processes non-SEAL IP packets as specified in the normative
   references, i.e., it performs any necessary IP reassembly then
   discards the packet if it is larger than the reassembly buffer size
   or delivers the (fully-reassembled) packet to the appropriate upper
   layer protocol module.

   For SEAL packets, the ETE performs any necessary IP reassembly then
   submits the packet for SEAL decapsulation as specified in Section
   5.5.4. (Note that if the packet is larger than the reassembly buffer
   size, the ETE still examines the leading portion of the (partially)
   reassembled packet during decapsulation.)

5.5.4.  Decapsulation, SEAL-Layer Reassembly, and Re-Encapsulation

   For each SEAL packet accepted for decapsulation, when I==1 the ETE
   first examines the Identification field. If the Identification is
   not within the window of acceptable values for this ITE, the ETE
   silently discards the packet.

   Next, if V==1 the ETE SHOULD verify the MAC value (with the MAC field
   itself reset to 0) and silently discard the packet if the value is
   incorrect.

   Next, if the packet arrived as multiple IP fragments, the ETE sends
   an SPTB message back to the ITE with MTU set to the size of the
   largest fragment received minus HLEN (see: Section 5.6.1.1).

   Next, if the packet arrived as multiple IP fragments and the inner
   packet is larger than 1500 bytes, the ETE silently discards the
   packet; otherwise, it continues to process the packet.

   Next, if there is an incorrect value in a SEAL header field (e.g., an
   incorrect "VER" field value), the ETE discards the packet. If the
   SEAL header has C==0, the ETE also returns an SCMP "Parameter
   Problem" (SPP) message (see Section 5.6.1.2).

   Next, if the SEAL header has C==1, the ETE processes the packet as an
   SCMP packet as specified in Section 5.6.2.
   Otherwise, the ETE
   continues to process the packet as a SEAL data packet.

   Next, if the SEAL header has (M==1 || Offset!=0) the ETE checks to
   see if the other segments of this already-segmented SEAL packet have
   arrived, i.e., by looking for additional segments that have the same
   outer IP source address, destination address, source transport port
   number (if present) and SEAL Identification value. If the other
   segments have already arrived, the ETE discards the SEAL header and
   other outer headers from the non-initial segments and appends them
   onto the end of the first segment according to their offset value.
   Otherwise, the ETE caches the segment for at most 60 seconds while
   awaiting the arrival of its partners. During this process, the ETE
   discards any segments that are overlapping with respect to segments
   that have already been received. The ETE further SHOULD manage the
   SEAL reassembly cache the same as described for the IP-Layer
   Reassembly cache in Section 5.5.3, i.e., it SHOULD perform an early
   discard for any pending reassemblies that have low probability of
   completion.

   Next, if the SEAL header in the (reassembled) packet has A==1, the
   ETE sends an SPTB message back to the ITE with MTU=0 (see: Section
   5.6.1.1).

   Finally, the ETE discards the outer headers and processes the inner
   packet according to the header type indicated in the SEAL NEXTHDR
   field. If the inner (TTL / Hop Limit) field encodes the value 0, the
   ETE silently discards the packet. Otherwise, if the next hop toward
   the inner destination address is via a different interface than the
   SEAL packet arrived on, the ETE discards the SEAL header and delivers
   the inner packet either to the local host or to the next hop
   interface if the packet is not destined to the local host.
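
   The SEAL-layer reassembly step above, paired with the ITE
   segmentation rule of Section 5.4.4 that produces the segments, can
   be sketched in Python as follows. This is a non-normative sketch:
   the functions segment and reassemble are hypothetical names,
   max_seg stands in for a locally chosen size "significantly smaller
   than (MINMTU-HLEN)", segments are modeled as (M, Offset, payload)
   tuples without headers, and the 60 second timer and the matching on
   addresses and Identification are omitted. Where the text has the
   ETE discard only the overlapping segments, the sketch simply
   declines to reassemble.

```python
BLOCK = 32  # the Offset field counts 32 byte blocks


def segment(inner: bytes, max_seg: int) -> list:
    """Split inner into the fewest roughly equal (M, Offset, payload) parts."""
    cap = (max_seg // BLOCK) * BLOCK       # per-segment cap, block aligned
    if cap <= 0:
        raise ValueError("max_seg must be at least one 32 byte block")
    count = -(-len(inner) // cap)          # minimal N (ceiling division)
    if count <= 1:
        return [(0, 0, inner)]
    size = -(-len(inner) // count)         # roughly equal share per segment
    size = -(-size // BLOCK) * BLOCK       # round up to a block multiple
    parts = []
    for start in range(0, len(inner), size):
        more = 1 if start + size < len(inner) else 0
        parts.append((more, start // BLOCK, inner[start:start + size]))
    return parts


def reassemble(parts: list):
    """Return the inner packet, or None if segments are missing or overlap."""
    ordered = sorted(parts, key=lambda part: part[1])
    if not ordered or ordered[0][1] != 0 or ordered[-1][0] != 0:
        return None                        # first or final segment missing
    out = bytearray()
    for index, (more, offset, payload) in enumerate(ordered):
        if offset * BLOCK != len(out):
            return None                    # gap or overlap between segments
        if index < len(ordered) - 1 and more != 1:
            return None                    # only the final segment has M=0
        out += payload
    return bytes(out)
```

   Every non-final segment produced by segment() is a multiple of 32
   bytes, which is what allows the 6-bit Offset field to locate each
   segment exactly.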

   If the next hop is on the same interface the SEAL packet arrived on,
   however, the ETE submits the packet for SEAL re-encapsulation
   beginning with the specification in Section 5.4.3 above and without
   decrementing the value in the inner (TTL / Hop Limit) field. In this
   process, the packet remains within the tunnel (i.e., it does not exit
   and then re-enter the tunnel); hence, the packet is not discarded if
   the LEVEL field in the SEAL header contains the value 0.

5.6.  The SEAL Control Message Protocol (SCMP)

   SEAL provides a companion SEAL Control Message Protocol (SCMP) that
   uses the same message types and formats as for the Internet Control
   Message Protocol for IPv6 (ICMPv6) [RFC4443]. As for ICMPv6, each
   SCMP message includes a 32-bit header and a variable-length body.
   The ITE encapsulates the SCMP message in a SEAL header and outer
   headers as shown in Figure 4:

                                    +--------------------+
                                    ~   outer IP header  ~
                                    +--------------------+
                                    ~  other outer hdrs  ~
                                    +--------------------+
                                    ~    SEAL Header     ~
   +--------------------+           +--------------------+
   | SCMP message header|    -->    | SCMP message header|
   +--------------------+           +--------------------+
   |                    |    -->    |                    |
   ~  SCMP message body ~    -->    ~  SCMP message body ~
   |                    |    -->    |                    |
   +--------------------+           +--------------------+

        SCMP Message                     SCMP Packet
    before encapsulation             after encapsulation

                  Figure 4: SCMP Message Encapsulation

   The following sections specify the generation, processing and
   relaying of SCMP messages.

5.6.1.  Generating SCMP Error Messages

   ETEs generate SCMP error messages in response to receiving certain
   SEAL data packets using the format shown in Figure 5:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Type-Specific Data                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      As much of the invoking SEAL data packet as possible     |
   ~        (beginning with the SEAL header) without the SCMP      ~
   |             packet exceeding MINMTU bytes (*)                 |

   (*) also known as the "packet-in-error"

                   Figure 5: SCMP Error Message Format

   The error message includes the 32-bit SCMP message header, followed
   by a 32-bit Type-Specific Data field, followed by the leading portion
   of the invoking SEAL data packet beginning with the SEAL header as
   the "packet-in-error". The packet-in-error includes as much of the
   invoking packet as possible extending to a length that would not
   cause the entire SCMP packet following outer encapsulation to exceed
   MINMTU bytes.

   When the ETE processes a SEAL data packet for which the
   Identification and ICV values are correct but an error must be
   returned, it prepares an SCMP error message as shown in Figure 5.
   The ETE sets the Type and Code fields to the same values that would
   appear in the corresponding ICMPv6 message [RFC4443], but calculates
   the Checksum beginning with the SCMP message header using the
   algorithm specified for ICMPv4 in [RFC0792].

   The ETE next encapsulates the SCMP message in the requisite SEAL and
   outer headers as shown in Figure 4.
During encapsulation, the ETE 1262 sets the outer destination address/port numbers of the SCMP packet to 1263 the values associated with the ITE and sets the outer source address/ 1264 port numbers to its own outer address/port numbers. 1266 The ETE then sets (C=1; A=0; RES=0; M=0; Offset=0) in the SEAL 1267 header, then sets I, V, NEXTHDR and LEVEL to the same values that 1268 appeared in the SEAL header of the data packet. If the neighbor 1269 relationship between the ITE and ETE is unidirectional, the ETE next 1270 sets the LINK_ID field to the same value that appeared in the SEAL 1271 header of the data packet. Otherwise, the ETE sets the LINK_ID field 1272 to the value it would use in sending a SEAL packet to this ITE. 1274 When I==1, the ETE next sets the Identification field to an 1275 appropriate value for the ITE. If the neighbor relationship between 1276 the ITE and ETE is unidirectional, the ETE sets the Identification 1277 field to the same value that appeared in the SEAL header of the data 1278 packet. Otherwise, the ETE sets the Identification field to the 1279 value it would use in sending the next SEAL packet to this ITE. 1281 When V==1, the ETE then prepares the ICV field the same as specified 1282 for SEAL data packet encapsulation in Section 5.4.4. 1284 Finally, the ETE sends the resulting SCMP packet to the ITE the same 1285 as specified for SEAL data packets in Section 5.4.5. 1287 The following sections describe additional considerations for various 1288 SCMP error messages: 1290 5.6.1.1. Generating SCMP Packet Too Big (SPTB) Messages 1292 An ETE generates an SPTB message when it receives a SEAL data packet 1293 that arrived as multiple outer IP fragments. The ETE prepares the 1294 SPTB message the same as for the corresponding ICMPv6 PTB message, 1295 and writes the length of the largest outer IP fragment received minus 1296 HLEN in the MTU field of the message. 
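The message layout of Figure 5 and the MTU calculation above can be sketched as follows. This is an illustrative sketch only. The function names and the encaps_len parameter (the number of bytes of SEAL and outer headers added when the message is later encapsulated) are hypothetical, and the values of MINMTU and HLEN are passed in by the caller rather than assumed. The checksum is the 16-bit one's-complement sum used by ICMPv4, computed over the SCMP message starting at its header, and the Type value 2 is the ICMPv6 Packet Too Big type.

```python
import struct

SCMP_PTB_TYPE = 2  # same Type value as ICMPv6 Packet Too Big [RFC4443]


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of ICMPv4 [RFC0792]."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_scmp_error(msg_type, code, type_specific, invoking, minmtu, encaps_len):
    """Return SCMP header + Type-Specific Data + packet-in-error.

    `invoking` begins at the SEAL header of the invoking data packet.
    The packet-in-error is truncated so that the message plus `encaps_len`
    bytes of later encapsulation does not exceed `minmtu`.
    """
    room = max(0, minmtu - encaps_len - 8)  # 8 = header + Type-Specific Data
    body = struct.pack("!I", type_specific) + invoking[:room]
    unsummed = struct.pack("!BBH", msg_type, code, 0)
    csum = internet_checksum(unsummed + body)
    return struct.pack("!BBH", msg_type, code, csum) + body


def build_sptb(largest_fragment_len, hlen, invoking, minmtu, encaps_len):
    """SPTB for a packet that arrived as multiple outer IP fragments."""
    mtu = largest_fragment_len - hlen  # largest fragment received minus HLEN
    return build_scmp_error(SCMP_PTB_TYPE, 0, mtu, invoking, minmtu, encaps_len)
```

Calling build_scmp_error() with SCMP_PTB_TYPE and a Type-Specific Data value of 0 produces the MTU=0 form of the SPTB message. SEAL header construction, ICV calculation and outer encapsulation are outside the scope of this sketch.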
1298 The ETE also generates an SPTB message when it accepts a SEAL 1299 protocol data packet with A==1 in the SEAL header. The ETE prepares 1300 the SPTB message the same as above, except that it writes the value 0 1301 in the MTU field. 1303 5.6.1.2. Generating Other SCMP Error Messages 1305 An ETE generates an SCMP "Destination Unreachable" (SDU) message 1306 under the same circumstances that an IPv6 system would generate an 1307 ICMPv6 Destination Unreachable message. 1309 An ETE generates an SCMP "Parameter Problem" (SPP) message when it 1310 receives a SEAL packet with an incorrect value in the SEAL header. 1312 TEs generate other SCMP message types using methods and procedures 1313 specified in other documents. For example, SCMP message types used 1314 for tunnel neighbor coordinations are specified in VET 1315 [I-D.templin-intarea-vet]. 1317 5.6.2. Processing SCMP Error Messages 1319 An ITE may receive SCMP messages with C==1 in the SEAL header after 1320 sending packets to an ETE. The ITE first verifies that the outer 1321 addresses of the SCMP packet are correct, and (when I==1) that the 1322 Identification field contains an acceptable value. The ITE next 1323 verifies that the SEAL header fields are set correctly as specified 1324 in Section 5.6.1. When V==1, the ITE then verifies the ICV. The ITE 1325 next verifies the Checksum value in the SCMP message header. If any 1326 of these values are incorrect, the ITE silently discards the message; 1327 otherwise, it processes the message as follows: 1329 5.6.2.1. Processing SCMP PTB Messages 1331 After an ITE sends a SEAL data packet to an ETE, it may receive an 1332 SPTB message with a packet-in-error containing the leading portion of 1333 the packet (see: Section 5.6.1.1). For SPTB messages with MTU==0, 1334 the ITE processes the message as confirmation that the ETE received a 1335 SEAL data packet with A==1 in the SEAL header. The ITE then discards 1336 the message. 
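The ITE-side checks described above can be sketched as follows. This is a simplified illustration and not a complete implementation. The function classify_sptb and its return labels are hypothetical, and the outer address, Identification, SEAL header and ICV checks are represented by a single caller-supplied boolean (seal_checks_ok) instead of being implemented. Only the Checksum verification and the MTU==0 versus MTU!=0 distinction are shown.

```python
import struct

SCMP_PTB_TYPE = 2  # same Type value as ICMPv6 Packet Too Big [RFC4443]


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of ICMPv4 [RFC0792]."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def classify_sptb(scmp: bytes, seal_checks_ok: bool = True):
    """Classify a received SCMP message (starting at the SCMP header).

    `seal_checks_ok` stands in for the address, Identification, SEAL
    header and ICV verification, which this sketch does not implement.
    """
    if not seal_checks_ok or len(scmp) < 8:
        return ("discard", None)  # silently discard
    if internet_checksum(scmp) != 0:  # a valid message sums to zero
        return ("discard", None)
    msg_type, _code, _csum, mtu = struct.unpack("!BBHI", scmp[:8])
    if msg_type != SCMP_PTB_TYPE:
        return ("not-sptb", None)
    if mtu == 0:
        # Confirmation that the ETE received a data packet with A==1.
        return ("probe-confirmation", None)
    return ("size-limit", mtu)  # indication of a packet size limitation
```

A result of "probe-confirmation" corresponds to the case in which the ITE notes the confirmation and then discards the message.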
1338 For SPTB messages with MTU!=0, the ITE processes the message as an 1339 indication of a packet size limitation as follows. If the inner 1340 packet is no larger than 1500 bytes, the ITE reduces its MINMTU value 1341 for this ETE. If the inner packet length is larger than 1500 and the 1342 MTU value is not substantially less than MINMTU bytes, the value is 1343 likely to reflect the true MTU of the restricting link on the path to 1344 the ETE; otherwise, a router on the path may be generating runt 1345 fragments. 1347 In that case, the ITE can consult a plateau table (e.g., as described 1348 in [RFC1191]) to rewrite the MTU value to a reduced size. For 1349 example, if the ITE receives an IPv4 SPTB message with MTU==256 and 1350 inner packet length 4KB, it can rewrite the MTU to 2KB. If the ITE 1351 subsequently receives an IPv4 SPTB message with MTU==256 and inner 1352 packet length 2KB, it can rewrite the MTU to 1792, etc., to a minimum 1353 of 1500 bytes. If the ITE is performing stateful MTU determination 1354 for this SEAL path, it then writes the new MTU value minus HLEN in 1355 MAXMTU. 1357 The ITE then checks its forwarding tables to discover the previous 1358 hop toward the source address of the inner packet. If the previous 1359 hop is reached via the same tunnel interface the SPTB message arrived 1360 on, the ITE relays the message to the previous hop. In order to 1361 relay the message, the ITE first writes zero in the Identification and 1362 ICV fields of the SEAL header within the packet-in-error. The ITE 1363 next rewrites the outer SEAL header fields with values corresponding 1364 to the previous hop and recalculates the MAC using the MAC 1365 calculation parameters associated with the previous hop.
Next, the 1366 ITE replaces the SPTB's outer headers with headers of the appropriate 1367 protocol version and fills in the header fields as specified in 1368 Section 5.4.5, where the destination address/port correspond to the 1369 previous hop and the source address/port correspond to the ITE. The 1370 ITE then sends the message to the previous hop the same as if it were 1371 issuing a new SPTB message. (Note that, in this process, the values 1372 within the SEAL header of the packet-in-error are meaningless to the 1373 previous hop and therefore cannot be used by the previous hop for 1374 authentication purposes.) 1376 If the previous hop is not reached via the same tunnel interface, the 1377 ITE instead transcribes the message into a format appropriate for the 1378 inner packet (i.e., the same as described for transcribing ICMP 1379 messages in Section 5.4.7) and sends the resulting transcribed 1380 message to the original source. (NB: if the inner packet within the 1381 SPTB message is an IPv4 SEAL packet with DF==0, the ITE should set 1382 DF=1 and re-calculate the IPv4 header checksum while transcribing the 1383 message in order to avoid bogon filters.) The ITE then discards the 1384 SPTB message. 1386 Note that the ITE may receive an SPTB message from another ITE that 1387 is at the head end of a nested level of encapsulation. The ITE has 1388 no security associations with this nested ITE, hence it should 1389 consider this SPTB message the same as if it had received an ICMP PTB 1390 message from an ordinary router on the path to the ETE. That is, the 1391 ITE should examine the packet-in-error field of the SPTB message and 1392 only process the message if it is able to recognize the packet as one 1393 it had previously sent. 1395 5.6.2.2. Processing Other SCMP Error Messages 1397 An ITE may receive an SDU message with an appropriate code under the 1398 same circumstances that an IPv6 node would receive an ICMPv6 1399 Destination Unreachable message. 
The ITE either transcribes or 1400 relays the message toward the source address of the inner packet 1401 within the packet-in-error the same as specified for SPTB messages in 1402 Section 5.6.2.1. 1404 An ITE may receive an SPP message when the ETE receives a SEAL packet 1405 with an incorrect value in the SEAL header. The ITE should examine 1406 the SEAL header within the packet-in-error to determine whether a 1407 different setting should be used in subsequent packets, but does not 1408 relay the message further. 1410 TEs process other SCMP message types using methods and procedures 1411 specified in other documents. For example, SCMP message types used 1412 for tunnel neighbor coordinations are specified in VET 1413 [I-D.templin-intarea-vet]. 1415 6. Link Requirements 1417 Subnetwork designers are expected to follow the recommendations in 1418 Section 2 of [RFC3819] when configuring link MTUs. 1420 7. End System Requirements 1422 End systems are encouraged to implement end-to-end MTU assurance 1423 (e.g., using Packetization Layer Path MTU Discovery (PLPMTUD) per 1424 [RFC4821]) even if the subnetwork is using SEAL. 1426 When end systems use PLPMTUD, SEAL will ensure that the tunnel 1427 behaves as a link in the path that assures an MTU of at least 1500 1428 bytes while not precluding discovery of larger MTUs. The PLPMTUD 1429 mechanism will therefore be able to function as designed in order to 1430 discover and utilize larger MTUs. 1432 8. Router Requirements 1434 Routers within the subnetwork are expected to observe the standard IP 1435 router requirements, including the implementation of IP fragmentation 1436 and reassembly as well as the generation of ICMP messages 1437 [RFC0792][RFC1122][RFC1812][RFC2460][RFC4443][RFC6434].
1439 Note that, even when routers support existing requirements for the 1440 generation of ICMP messages, these messages are often filtered and 1441 discarded by middleboxes on the path to the original source of the 1442 message that triggered the ICMP. It is therefore not possible to 1443 assume delivery of ICMP messages even when routers are correctly 1444 implemented. 1446 9. Nested Encapsulation Considerations 1448 SEAL supports nested tunneling for up to 8 layers of encapsulation. 1449 In this model, the SEAL ITE has a tunnel neighbor relationship only 1450 with ETEs at its own nesting level, i.e., it does not have a tunnel 1451 neighbor relationship with other ITEs, nor with ETEs at other nesting 1452 levels. 1454 Therefore, when an ITE 'A' within an outer nesting level needs to 1455 return an error message to an ITE 'B' within an inner nesting level, 1456 it generates an ordinary ICMP error message the same as if it were an 1457 ordinary router within the subnetwork. 'B' can then perform message 1458 validation as specified in Section 5.4.7, but full message origin 1459 authentication is not possible. 1461 Since ordinary ICMP messages are used for coordinations between ITEs 1462 at different nesting levels, nested SEAL encapsulations should only 1463 be used when the ITEs are within a common administrative domain 1464 and/or when there is no ICMP filtering middlebox such as a firewall 1465 or NAT between them. An example would be a recursive nesting of 1466 mobile networks, where the first network receives service from an 1467 ISP, the second network receives service from the first network, the 1468 third network receives service from the second network, etc. 1470 NB: As an alternative, the SCMP protocol could be extended to allow 1471 ITE 'A' to return an SCMP message to ITE 'B' rather than return an 1472 ICMP message. 
This would conceptually allow the control messages to 1473 pass through firewalls and NATs, however it would give no more 1474 message origin authentication assurance than for ordinary ICMP 1475 messages. It was therefore determined that the complexity of 1476 extending the SCMP protocol was of little value within the context of 1477 the anticipated use cases for nested encapsulations. 1479 10. Reliability Considerations 1481 Although a SEAL tunnel may span an arbitrarily-large subnetwork 1482 expanse, the IP layer sees the tunnel as a simple link that supports 1483 the IP service model. Links with high bit error rates (BERs) (e.g., 1484 IEEE 802.11) use Automatic Repeat-ReQuest (ARQ) mechanisms [RFC3366] 1485 to increase packet delivery ratios, while links with much lower BERs 1486 typically omit such mechanisms. Since SEAL tunnels may traverse 1487 arbitrarily-long paths over links of various types that are already 1488 either performing or omitting ARQ as appropriate, it would therefore 1489 be inefficient to require the tunnel endpoints to also perform ARQ. 1491 11. Integrity Considerations 1493 The SEAL header includes an integrity check field that covers the 1494 SEAL header and at least the inner packet headers. This provides for 1495 header integrity verification on a segment-by-segment basis for a 1496 segmented re-encapsulating tunnel path. 1498 Fragmentation and reassembly schemes must also consider packet- 1499 splicing errors, e.g., when two fragments from the same packet are 1500 concatenated incorrectly, when a fragment from packet X is 1501 reassembled with fragments from packet Y, etc. The primary sources 1502 of such errors include implementation bugs and wrapping IPv4 ID 1503 fields. 1505 In particular, the IPv4 16-bit ID field can wrap with only 64K 1506 packets with the same (src, dst, protocol)-tuple alive in the system 1507 at a given time [RFC4963]. 
When the IPv4 ID field is re-written by a 1508 middlebox such as a NAT or firewall, ID field wrapping can occur with 1509 even fewer packets alive in the system. It is therefore essential 1510 that IPv4 fragmentation and reassembly be avoided. 1512 12. IANA Considerations 1514 The IANA is requested to allocate a User Port number for "SEAL" in 1515 the 'port-numbers' registry. The Service Name is "SEAL", and the 1516 Transport Protocols are TCP and UDP. The Assignee is the IESG 1517 (iesg@ietf.org) and the Contact is the IETF Chair (chair@ietf.org). 1518 The Description is "Subnetwork Encapsulation and Adaptation Layer 1519 (SEAL)", and the Reference is the RFC-to-be currently known as 1520 'draft-templin-intarea-seal'. 1522 13. Security Considerations 1524 SEAL provides a segment-by-segment message origin authentication, 1525 integrity and anti-replay service. The SEAL header is sent in-the- 1526 clear the same as for the outer IP and other outer headers. In this 1527 respect, the threat model is no different than for IPv6 extension 1528 headers. Unlike IPv6 extension headers, however, the SEAL header can 1529 be protected by an integrity check that also covers the inner packet 1530 headers. 1532 An amplification/reflection/buffer overflow attack is possible when 1533 an attacker sends IP fragments with spoofed source addresses to an 1534 ETE in an attempt to clog the ETE's reassembly buffer and/or cause 1535 the ETE to generate a stream of SCMP messages returned to a victim 1536 ITE. The SCMP message ICV, Identification, as well as the inner 1537 headers of the packet-in-error, provide mitigation for the ETE to 1538 detect and discard SEAL segments with spoofed source addresses. 1540 Security issues that apply to tunneling in general are discussed in 1541 [RFC6169]. 1543 14.
Related Work 1545 Section 3.1.7 of [RFC2764] provides a high-level sketch for 1546 supporting large tunnel MTUs via a tunnel-level segmentation and 1547 reassembly capability to avoid IP level fragmentation. 1549 Section 3 of [RFC4459] describes inner and outer fragmentation at the 1550 tunnel endpoints as alternatives for accommodating the tunnel MTU. 1552 Section 4 of [RFC2460] specifies a method for inserting and 1553 processing extension headers between the base IPv6 header and 1554 transport layer protocol data. The SEAL header is inserted and 1555 processed in exactly the same manner. 1557 IPsec/AH [RFC4301][RFC4302] is used for full message integrity 1558 verification between tunnel endpoints, whereas SEAL only ensures 1559 integrity for the inner packet headers. The AYIYA proposal 1560 [I-D.massar-v6ops-ayiya] uses similar means for providing message 1561 authentication and integrity. 1563 SEAL and the Virtual Enterprise Traversal (VET) 1564 [I-D.templin-intarea-vet] tunnel virtual interface abstraction are 1565 the functional building blocks for the Interior Routing Overlay 1566 Network (IRON) [I-D.templin-ironbis] and Routing and Addressing in 1567 Networks with Global Enterprise Recursion (RANGER) [RFC5720][RFC6139] 1568 architectures. 1570 The concepts of path MTU determination through the report of 1571 fragmentation and extending the IPv4 Identification field were first 1572 proposed in deliberations of the TCP-IP mailing list and the Path MTU 1573 Discovery Working Group (MTUDWG) during the late 1980s and early 1574 1990s. An historical analysis of the evolution of these concepts, 1575 as well as the development of the eventual PMTUD mechanism, appears 1576 in [RFC5320]. 1578 15. Implementation Status 1580 An early implementation of the first revision of SEAL [RFC5320] is 1581 available at: http://isatap.com/seal. 1583 16.
Acknowledgments 1585 The following individuals are acknowledged for helpful comments and 1586 suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Oliver 1587 Bonaventure, Teco Boot, Bob Braden, Brian Carpenter, Steve Casner, 1588 Ian Chakeres, Noel Chiappa, Remi Denis-Courmont, Remi Despres, Ralph 1589 Droms, Aurnaud Ebalard, Gorry Fairhurst, Washam Fan, Dino Farinacci, 1590 Joel Halpern, Sam Hartman, John Heffner, Thomas Henderson, Bob 1591 Hinden, Christian Huitema, Eliot Lear, Darrel Lewis, Joe Macker, Matt 1592 Mathis, Erik Nordmark, Dan Romascanu, Dave Thaler, Joe Touch, Mark 1593 Townsley, Ole Troan, Margaret Wasserman, Magnus Westerlund, Robin 1594 Whittle, James Woodyatt, and members of the Boeing Research & 1595 Technology NST DC&NT group. 1597 Discussions with colleagues following the publication of [RFC5320] 1598 have provided useful insights that have resulted in significant 1599 improvements to this, the Second Edition of SEAL. 1601 This document received substantial review input from the IESG and 1602 IETF area directorates in the February 2013 timeframe. IESG members 1603 and IETF area directorate representatives who contributed helpful 1604 comments and suggestions are gratefully acknowledged. 1606 Path MTU determination through the report of fragmentation was first 1607 proposed by Charles Lynn on the TCP-IP mailing list in 1987. 1608 Extending the IP identification field was first proposed by Steve 1609 Deering on the MTUDWG mailing list in 1989. 1611 17. References 1612 17.1. Normative References 1614 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1615 September 1981. 1617 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, 1618 RFC 792, September 1981. 1620 [RFC1122] Braden, R., "Requirements for Internet Hosts - 1621 Communication Layers", STD 3, RFC 1122, October 1989. 1623 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1624 Requirement Levels", BCP 14, RFC 2119, March 1997. 
1626 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1627 (IPv6) Specification", RFC 2460, December 1998. 1629 [RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure 1630 Neighbor Discovery (SEND)", RFC 3971, March 2005. 1632 [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control 1633 Message Protocol (ICMPv6) for the Internet Protocol 1634 Version 6 (IPv6) Specification", RFC 4443, March 2006. 1636 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 1637 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 1638 September 2007. 1640 17.2. Informative References 1642 [FOLK] Shannon, C., Moore, D., and k. claffy, "Beyond Folklore: 1643 Observations on Fragmented Traffic", December 2002. 1645 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 1646 October 1987. 1648 [I-D.massar-v6ops-ayiya] 1649 Massar, J., "AYIYA: Anything In Anything", 1650 draft-massar-v6ops-ayiya-02 (work in progress), July 2004. 1652 [I-D.taylor-v6ops-fragdrop] 1653 Jaeggli, J., Colitti, L., Kumari, W., Vyncke, E., Kaeo, 1654 M., and T. Taylor, "Why Operators Filter Fragments and 1655 What It Implies", draft-taylor-v6ops-fragdrop-01 (work in 1656 progress), June 2013. 1658 [I-D.templin-intarea-vet] 1659 Templin, F., "Virtual Enterprise Traversal (VET)", 1660 draft-templin-intarea-vet-40 (work in progress), May 2013. 1662 [I-D.templin-ironbis] 1663 Templin, F., "The Interior Routing Overlay Network 1664 (IRON)", draft-templin-ironbis-15 (work in progress), 1665 May 2013. 1667 [RFC0994] International Organization for Standardization (ISO) and 1668 American National Standards Institute (ANSI), "Final text 1669 of DIS 8473, Protocol for Providing the Connectionless- 1670 mode Network Service", RFC 994, March 1986. 1672 [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP 1673 MTU discovery options", RFC 1063, July 1988. 1675 [RFC1070] Hagens, R., Hall, N., and M. 
Rose, "Use of the Internet as 1676 a subnetwork for experimentation with the OSI network 1677 layer", RFC 1070, February 1989. 1679 [RFC1146] Zweig, J. and C. Partridge, "TCP alternate checksum 1680 options", RFC 1146, March 1990. 1682 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 1683 November 1990. 1685 [RFC1701] Hanks, S., Li, T., Farinacci, D., and P. Traina, "Generic 1686 Routing Encapsulation (GRE)", RFC 1701, October 1994. 1688 [RFC1812] Baker, F., "Requirements for IP Version 4 Routers", 1689 RFC 1812, June 1995. 1691 [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery 1692 for IP version 6", RFC 1981, August 1996. 1694 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, 1695 October 1996. 1697 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 1698 Hashing for Message Authentication", RFC 2104, 1699 February 1997. 1701 [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in 1702 IPv6 Specification", RFC 2473, December 1998. 1704 [RFC2675] Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms", 1705 RFC 2675, August 1999. 1707 [RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A. 1709 Malis, "A Framework for IP Based Virtual Private 1710 Networks", RFC 2764, February 2000. 1712 [RFC2780] Bradner, S. and V. Paxson, "IANA Allocation Guidelines For 1713 Values In the Internet Protocol and Related Headers", 1714 BCP 37, RFC 2780, March 2000. 1716 [RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: 1717 Defeating Denial of Service Attacks which employ IP Source 1718 Address Spoofing", BCP 38, RFC 2827, May 2000. 1720 [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", 1721 RFC 2923, September 2000. 1723 [RFC3232] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by 1724 an On-line Database", RFC 3232, January 2002. 1726 [RFC3366] Fairhurst, G. and L. 
Wood, "Advice to link designers on 1727 link Automatic Repeat reQuest (ARQ)", BCP 62, RFC 3366, 1728 August 2002. 1730 [RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D., 1731 Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. 1732 Wood, "Advice for Internet Subnetwork Designers", BCP 89, 1733 RFC 3819, July 2004. 1735 [RFC4191] Draves, R. and D. Thaler, "Default Router Preferences and 1736 More-Specific Routes", RFC 4191, November 2005. 1738 [RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms 1739 for IPv6 Hosts and Routers", RFC 4213, October 2005. 1741 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 1742 Internet Protocol", RFC 4301, December 2005. 1744 [RFC4302] Kent, S., "IP Authentication Header", RFC 4302, 1745 December 2005. 1747 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1748 Network Tunneling", RFC 4459, April 2006. 1750 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 1751 Discovery", RFC 4821, March 2007. 1753 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 1754 Errors at High Data Rates", RFC 4963, July 2007. 1756 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 1757 Mitigations", RFC 4987, August 2007. 1759 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an 1760 IANA Considerations Section in RFCs", BCP 26, RFC 5226, 1761 May 2008. 1763 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 1764 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 1766 [RFC5320] Templin, F., "The Subnetwork Encapsulation and Adaptation 1767 Layer (SEAL)", RFC 5320, February 2010. 1769 [RFC5445] Watson, M., "Basic Forward Error Correction (FEC) 1770 Schemes", RFC 5445, March 2009. 1772 [RFC5720] Templin, F., "Routing and Addressing in Networks with 1773 Global Enterprise Recursion (RANGER)", RFC 5720, 1774 February 2010. 1776 [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. 
1778 [RFC6139] Russert, S., Fleischman, E., and F. Templin, "Routing and 1779 Addressing in Networks with Global Enterprise Recursion 1780 (RANGER) Scenarios", RFC 6139, February 2011. 1782 [RFC6169] Krishnan, S., Thaler, D., and J. Hoagland, "Security 1783 Concerns with IP Tunneling", RFC 6169, April 2011. 1785 [RFC6335] Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S. 1786 Cheshire, "Internet Assigned Numbers Authority (IANA) 1787 Procedures for the Management of the Service Name and 1788 Transport Protocol Port Number Registry", BCP 165, 1789 RFC 6335, August 2011. 1791 [RFC6434] Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node 1792 Requirements", RFC 6434, December 2011. 1794 [RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label 1795 for Equal Cost Multipath Routing and Link Aggregation in 1796 Tunnels", RFC 6438, November 2011. 1798 [RFC6864] Touch, J., "Updated Specification of the IPv4 ID Field", 1799 RFC 6864, February 2013. 1801 [RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and 1802 UDP Checksums for Tunneled Packets", RFC 6935, April 2013. 1804 [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement 1805 for the Use of IPv6 UDP Datagrams with Zero Checksums", 1806 RFC 6936, April 2013. 1808 [RIPE] De Boer, M. and J. Bosma, "Discovering Path MTU Black 1809 Holes on the Internet using RIPE Atlas", July 2012. 1811 [SIGCOMM] Luckie, M. and B. Stasiewicz, "Measuring Path MTU 1812 Discovery Behavior", November 2010. 1814 [TBIT] Medina, A., Allman, M., and S. Floyd, "Measuring 1815 Interactions Between Transport Protocols and Middleboxes", 1816 October 2004. 1818 [WAND] Luckie, M., Cho, K., and B. Owens, "Inferring and 1819 Debugging Path MTU Discovery Failures", October 2005. 1821 Author's Address 1823 Fred L. Templin (editor) 1824 Boeing Research & Technology 1825 P.O. Box 3707 1826 Seattle, WA 98124 1827 USA 1829 Email: fltemplin@acm.org