idnits 2.17.1 draft-templin-intarea-seal-54.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. == The 'Obsoletes: ' line in the draft header should list only the _numbers_ of the RFCs which will be obsoleted by this document (if approved); it should not include the word 'RFC' in the list. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (April 19, 2013) is 4025 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC3971' is defined on line 1630, but no explicit reference was found in the text == Unused Reference: 'RFC4861' is defined on line 1637, but no explicit reference was found in the text == Unused Reference: 'RFC1063' is defined on line 1686, but no explicit reference was found in the text == Unused Reference: 'RFC1146' is defined on line 1693, but no explicit reference was found in the text == Unused Reference: 'RFC2675' is defined on line 1718, but no explicit reference was found in the text == Unused Reference: 'RFC2780' is defined on line 1725, but no explicit reference was found in the text == Unused Reference: 'RFC4191' is defined on line 1748, but no explicit reference was found in the text == Unused Reference: 'RFC4987' is defined on line 1769, but no explicit reference was found in the text == Unused Reference: 'RFC5226' is defined on line 1772, but no explicit reference was found in the text == Unused Reference: 'RFC5246' is defined on line 1776, but no explicit reference was found in the text == Unused Reference: 'RFC5445' is defined on line 1782, but no explicit reference was found in the text == Unused Reference: 'RFC6335' is defined on line 1798, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) == Outdated reference: A later version (-02) exists of draft-taylor-v6ops-fragdrop-00 == Outdated reference: A later version (-40) exists of draft-templin-intarea-vet-38 == Outdated reference: A later version (-16) exists of draft-templin-ironbis-13 -- Obsolete informational reference (is this intentional?): RFC 1063 (Obsoleted by RFC 1191) -- Obsolete informational reference (is this intentional?): RFC 1146 (Obsoleted by RFC 6247) -- Obsolete informational reference (is this intentional?): RFC 1981 (Obsoleted by RFC 8201) -- Obsolete 
informational reference (is this intentional?): RFC 5226 (Obsoleted by RFC 8126) -- Obsolete informational reference (is this intentional?): RFC 5246 (Obsoleted by RFC 8446) -- Obsolete informational reference (is this intentional?): RFC 6434 (Obsoleted by RFC 8504) Summary: 1 error (**), 0 flaws (~~), 18 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group F. Templin, Ed. 3 Internet-Draft Boeing Research & Technology 4 Obsoletes: rfc5320 (if approved) April 19, 2013 5 Intended status: Informational 6 Expires: October 21, 2013 8 Boeing's Subnetwork Encapsulation and Adaptation Layer (SEAL) 9 draft-templin-intarea-seal-54.txt 11 Abstract 13 This document specifies a Subnetwork Encapsulation and Adaptation 14 Layer (SEAL) developed by Boeing. SEAL operates over virtual 15 topologies configured over connected IP network routing regions 16 bounded by encapsulating border nodes. These virtual topologies are 17 manifested by tunnels that may span multiple IP and/or sub-IP layer 18 forwarding hops, where they may incur packet duplication, packet 19 reordering, source address spoofing and traversal of links with 20 diverse Maximum Transmission Units (MTUs). SEAL uniquely addresses 21 these issues through the encapsulation and messaging mechanisms 22 specified in this document. 24 Status of this Memo 26 This Internet-Draft is submitted in full conformance with the 27 provisions of BCP 78 and BCP 79. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF). Note that other groups may also distribute 31 working documents as Internet-Drafts. The list of current Internet- 32 Drafts is at http://datatracker.ietf.org/drafts/current/. 
34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 This Internet-Draft will expire on October 21, 2013. 41 Copyright Notice 43 Copyright (c) 2013 IETF Trust and the persons identified as the 44 document authors. All rights reserved. 46 This document is subject to BCP 78 and the IETF Trust's Legal 47 Provisions Relating to IETF Documents 48 (http://trustee.ietf.org/license-info) in effect on the date of 49 publication of this document. Please review these documents 50 carefully, as they describe your rights and restrictions with respect 51 to this document. Code Components extracted from this document must 52 include Simplified BSD License text as described in Section 4.e of 53 the Trust Legal Provisions and are provided without warranty as 54 described in the Simplified BSD License. 56 Table of Contents 58 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 59 1.1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . 4 60 1.2. Approach . . . . . . . . . . . . . . . . . . . . . . . . . 6 61 1.3. Differences with RFC5320 . . . . . . . . . . . . . . . . . 7 62 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 8 63 3. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 10 64 4. Applicability Statement . . . . . . . . . . . . . . . . . . . 10 65 5. SEAL Specification . . . . . . . . . . . . . . . . . . . . . . 11 66 5.1. SEAL Tunnel Model . . . . . . . . . . . . . . . . . . . . 11 67 5.2. SEAL Model of Operation . . . . . . . . . . . . . . . . . 12 68 5.3. SEAL Header and Trailer Format . . . . . . . . . . . . . . 13 69 5.4. ITE Specification . . . . . . . . . . . . . . . . . . . . 15 70 5.4.1. Tunnel Interface MTU . . . . . . . . . . . . . . . . . 15 71 5.4.2. Tunnel Neighbor Soft State . . . . . . . . 
. . . . . . 16 72 5.4.3. SEAL Layer Pre-Processing . . . . . . . . . . . . . . 17 73 5.4.4. SEAL Encapsulation and Segmentation . . . . . . . . . 18 74 5.4.5. Outer Encapsulation . . . . . . . . . . . . . . . . . 20 75 5.4.6. Path Probing and ETE Reachability Verification . . . . 21 76 5.4.7. Processing ICMP Messages . . . . . . . . . . . . . . . 21 77 5.4.8. IPv4 Middlebox Reassembly Testing . . . . . . . . . . 22 78 5.4.9. Stateful MTU Determination . . . . . . . . . . . . . . 23 79 5.4.10. Detecting Path MTU Changes . . . . . . . . . . . . . . 24 80 5.5. ETE Specification . . . . . . . . . . . . . . . . . . . . 24 81 5.5.1. Minimum Reassembly Buffer Requirements . . . . . . . . 24 82 5.5.2. Tunnel Neighbor Soft State . . . . . . . . . . . . . . 24 83 5.5.3. IP-Layer Reassembly . . . . . . . . . . . . . . . . . 25 84 5.5.4. Decapsulation and Re-Encapsulation . . . . . . . . . . 25 85 5.6. The SEAL Control Message Protocol (SCMP) . . . . . . . . . 27 86 5.6.1. Generating SCMP Error Messages . . . . . . . . . . . . 27 87 5.6.2. Processing SCMP Error Messages . . . . . . . . . . . . 29 88 6. Link Requirements . . . . . . . . . . . . . . . . . . . . . . 31 89 7. End System Requirements . . . . . . . . . . . . . . . . . . . 32 90 8. Router Requirements . . . . . . . . . . . . . . . . . . . . . 32 91 9. Nested Encapsulation Considerations . . . . . . . . . . . . . 32 92 10. Reliability Considerations . . . . . . . . . . . . . . . . . . 33 93 11. Integrity Considerations . . . . . . . . . . . . . . . . . . . 33 94 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 34 95 13. Security Considerations . . . . . . . . . . . . . . . . . . . 34 96 14. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 34 97 15. Implementation Status . . . . . . . . . . . . . . . . . . . . 35 98 16. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 35 99 17. References . . . . . . . . . . . . . . . . . . . . . . . . . . 36 100 17.1. Normative References . 
. . . . . . . . . . . . . . . . . . 36 101 17.2. Informative References . . . . . . . . . . . . . . . . . . 36 102 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 40 104 1. Introduction 106 As Internet technology and communication has grown and matured, many 107 techniques have developed that use virtual topologies (manifested by 108 tunnels of one form or another) over an actual network that supports 109 the Internet Protocol (IP) [RFC0791][RFC2460]. Those virtual 110 topologies have elements that appear as one hop in the virtual 111 topology, but are actually multiple IP or sub-IP layer hops. These 112 multiple hops often have quite diverse properties that are often not 113 even visible to the endpoints of the virtual hop. This introduces 114 failure modes that are not dealt with well in current approaches. 116 The use of IP encapsulation (also known as "tunneling") has long been 117 considered as the means for creating such virtual topologies (e.g., 118 see [RFC2003][RFC2473]). However, the encapsulation headers often 119 include insufficiently provisioned per-packet identification values. 120 IP encapsulation also allows an attacker to produce encapsulated 121 packets with spoofed source addresses even if the source address in 122 the encapsulating header cannot be spoofed. A denial-of-service 123 vector that is not possible in non-tunneled subnetworks is therefore 124 presented. 126 Additionally, the insertion of an outer IP header reduces the 127 effective path MTU visible to the inner network layer. When IPv6 is 128 used as the encapsulation protocol, original sources expect to be 129 informed of the MTU limitation through IPv6 Path MTU discovery 130 (PMTUD) [RFC1981]. 
When IPv4 is used, this reduced MTU can be 131 accommodated through the use of IPv4 fragmentation, but unmitigated 132 in-the-network fragmentation has been found to be harmful through 133 operational experience and studies conducted over the course of many 134 years [FRAG][FOLK][RFC4963]. Additionally, classical IPv4 PMTUD 135 [RFC1191] has known operational issues that are exacerbated by in- 136 the-network tunnels [RFC2923][RFC4459]. 138 The following subsections present further details on the motivation 139 and approach for addressing these issues. 141 1.1. Motivation 143 Before discussing the approach, it is necessary to first understand 144 the problems. In both the Internet and private-use networks today, 145 IP is ubiquitously deployed as the Layer 3 protocol. The primary 146 functions of IP are to provide for routing, addressing, and a 147 fragmentation and reassembly capability used to accommodate links 148 with diverse MTUs. While it is well known that the IP address space 149 is rapidly becoming depleted, there is a lesser-known but growing 150 consensus that other IP protocol limitations have already or may soon 151 become problematic. 153 First, the Internet historically provided no means for discerning 154 whether the source addresses of IP packets are authentic. This 155 shortcoming is being addressed more and more through the deployment 156 of site border router ingress filters [RFC2827], however the use of 157 encapsulation provides a vector for an attacker to circumvent 158 filtering for the encapsulated packet even if filtering is correctly 159 applied to the encapsulation header. Secondly, the IP header does 160 not include a well-behaved identification value unless the source has 161 included a fragment header for IPv6 or unless the source permits 162 fragmentation for IPv4. These limitations preclude an efficient 163 means for routers to detect duplicate packets and packets that have 164 been re-ordered within the subnetwork. 
Additionally, recent studies 165 have shown that the arrival of fragments at high data rates can cause 166 denial-of-service (DoS) attacks on performance-sensitive networking 167 gear, prompting some administrators to configure their equipment to 168 drop fragments unconditionally [I-D.taylor-v6ops-fragdrop]. 170 For IPv4 encapsulation, when fragmentation is permitted the header 171 includes a 16-bit Identification field, meaning that at most 2^16 172 unique packets with the same (source, destination, protocol)-tuple 173 can be active in the network at the same time [RFC6864]. (When 174 middleboxes such as Network Address Translators (NATs) re-write the 175 Identification field to random values, the number of unique packets 176 is even further reduced.) Due to the escalating deployment of high- 177 speed links, however, these numbers have become too small by several 178 orders of magnitude for high data rate packet sources such as tunnel 179 endpoints [RFC4963]. 181 Furthermore, there are many well-known limitations pertaining to IPv4 182 fragmentation and reassembly - even to the point that it has been 183 deemed "harmful" in both classic and modern-day studies (see above). 184 In particular, IPv4 fragmentation raises issues ranging from minor 185 annoyances (e.g., in-the-network router fragmentation [RFC1981]) to 186 the potential for major integrity issues (e.g., mis-association of 187 the fragments of multiple IP packets during reassembly [RFC4963]). 189 As a result of these perceived limitations, a fragmentation-avoiding 190 technique for discovering the MTU of the forward path from a source 191 to a destination node was devised through the deliberations of the 192 Path MTU Discovery Working Group (PMTUDWG) during the late 1980's 193 through early 1990's which resulted in the publication of [RFC1191]. 
194 In this negative feedback-based method, the source node provides 195 explicit instructions to routers in the path to discard the packet 196 and return an ICMP error message if an MTU restriction is 197 encountered. However, this approach has several serious shortcomings 198 that lead to an overall "brittleness" [RFC2923]. 200 In particular, site border routers in the Internet have been known to 201 discard ICMP error messages coming from the outside world. This is 202 due in large part to the fact that malicious spoofing of error 203 messages in the Internet is trivial since there is no way to 204 authenticate the source of the messages [RFC5927]. Furthermore, when 205 a source node that requires ICMP error message feedback when a packet 206 is dropped due to an MTU restriction does not receive the messages, a 207 path MTU-related black hole occurs. This means that the source will 208 continue to send packets that are too large and never receive an 209 indication from the network that they are being discarded. This 210 behavior has been confirmed through documented studies showing clear 211 evidence of PMTUD failures for both IPv4 and IPv6 in the Internet 212 today [TBIT][WAND][SIGCOMM][RIPE]. 214 The issues with both IP fragmentation and this "classical" PMTUD 215 method are exacerbated further when IP tunneling is used [RFC4459]. 216 For example, an ingress tunnel endpoint (ITE) may be required to 217 forward encapsulated packets into the subnetwork on behalf of 218 hundreds, thousands, or even more original sources. If the ITE 219 allows IP fragmentation on the encapsulated packets, persistent 220 fragmentation could lead to undetected data corruption due to 221 Identification field wrapping and/or reassembly congestion at the 222 ETE. 
If the ITE instead uses classical IP PMTUD it must rely on ICMP 223 error messages coming from the subnetwork that may be suspect, 224 subject to loss due to filtering middleboxes, or insufficiently 225 provisioned for translation into error messages to be returned to the 226 original sources. 228 Although recent works have led to the development of a positive 229 feedback-based end-to-end MTU determination scheme [RFC4821], they do 230 not excuse tunnels from accounting for the encapsulation overhead 231 they add to packets. Moreover, in current practice existing 232 tunneling protocols mask the MTU issues by selecting a "lowest common 233 denominator" MTU that may be much smaller than necessary for most 234 paths and difficult to change at a later date. Therefore, a new 235 approach to accommodate tunnels over links with diverse MTUs is 236 necessary. 238 1.2. Approach 240 This document concerns subnetworks manifested through a virtual 241 topology configured over a connected network routing region and 242 bounded by encapsulating border nodes. Example connected network 243 routing regions include Mobile Ad hoc Networks (MANETs), enterprise 244 networks and the global public Internet itself. Subnetwork border 245 nodes forward unicast and multicast packets over the virtual topology 246 across multiple IP and/or sub-IP layer forwarding hops that may 247 introduce packet duplication and/or traverse links with diverse 248 Maximum Transmission Units (MTUs). 250 This document introduces a Subnetwork Encapsulation and Adaptation 251 Layer (SEAL) developed by Boeing for tunneling inner network layer 252 protocol packets over IP subnetworks that connect Ingress and Egress 253 Tunnel Endpoints (ITEs/ETEs) of border nodes. It provides a modular 254 specification designed to be tailored to specific associated 255 tunneling protocols. (A transport-mode of operation is also 256 possible, but out of scope for this document.) 

SEAL provides a mid-layer encapsulation that accommodates links with
diverse MTUs, and allows routers in the subnetwork to perform
efficient duplicate packet and packet reordering detection. The
encapsulation further ensures message origin authentication, packet
header integrity and anti-replay in environments in which these
functions are necessary.

SEAL treats tunnels that traverse the subnetwork as ordinary links
that must support network layer services. Moreover, SEAL provides
dynamic mechanisms to ensure a maximal path MTU over the tunnel.
This is in contrast to static approaches which avoid MTU issues by
selecting a lowest common denominator MTU value that may be overly
conservative for the vast majority of tunnel paths and difficult to
change even when larger MTUs become available.

The following sections provide the SEAL normative specifications,
while the appendices present non-normative additional considerations.

1.3. Differences with RFC5320

This specification of SEAL is descended from an experimental
independent RFC publication of the same name [RFC5320]. However,
this specification introduces a number of important differences from
the earlier publication.

First, this specification includes a protocol version field in the
SEAL header whereas [RFC5320] does not, and therefore cannot be
updated by future revisions. This specification therefore obsoletes
(rather than updates) [RFC5320].

Secondly, [RFC5320] forms a 32-bit Identification value by
concatenating the 16-bit IPv4 Identification field with a 16-bit
Identification "extension" field in the SEAL header. This means that
[RFC5320] can only operate over IPv4 networks (since IPv6 headers do
not include a 16-bit Identification field) and that the SEAL
Identification value can be corrupted if the Identification in the
outer IPv4 header is rewritten.
In contrast, this specification includes a 32-bit 295 Identification value that is independent of any identification fields 296 found in the inner or outer IP headers, and is therefore compatible 297 with any inner and outer IP protocol version combinations. 299 Additionally, the SEAL segmentation and reassembly procedures defined 300 in [RFC5320] differ significantly from those found in this 301 specification. In particular, this specification defines a 6-bit 302 Offset field that allows for smaller segment sizes when SEAL 303 segmentation is necessary (e.g., in order to observe the IPv4 minimum 304 MTU of 68 bytes). In contrast, [RFC5320] includes a 3-bit Segment 305 field and performs reassembly through concatenation of consecutive 306 segments. 308 The SEAL header in this specification also includes an optional 309 Integrity Check Vector (ICV) that can be used to digitally sign the 310 SEAL header and the leading portion of the encapsulated inner packet. 311 This allows for a lightweight integrity check and a loose message 312 origin authentication capability. The header further includes new 313 control bits as well as a link identification and encapsulation level 314 field for additional control capabilities. 316 Finally, this version of SEAL includes a new messaging protocol known 317 as the SEAL Control Message Protocol (SCMP), whereas [RFC5320] 318 performs signalling through the use of SEAL-encapsulated ICMP 319 messages. The use of SCMP allows SEAL-specific departures from ICMP, 320 as well as a control messaging capability that extends to other 321 specifications, including Virtual Enterprise Traversal (VET) 322 [I-D.templin-intarea-vet]. 324 2. Terminology 326 The following terms are defined within the scope of this document: 328 subnetwork 329 a virtual topology configured over a connected network routing 330 region and bounded by encapsulating border nodes. 
332 IP 333 used to generically refer to either Internet Protocol (IP) 334 version, i.e., IPv4 or IPv6. 336 Ingress Tunnel Endpoint (ITE) 337 a virtual interface over which an encapsulating border node (host 338 or router) sends encapsulated packets into the subnetwork. 340 Egress Tunnel Endpoint (ETE) 341 a virtual interface over which an encapsulating border node (host 342 or router) receives encapsulated packets from the subnetwork. 344 SEAL Path 345 a subnetwork path from an ITE to an ETE beginning with an 346 underlying link of the ITE as the first hop. Note that, if the 347 ITE's interface connection to the underlying link assigns multiple 348 IP addresses, each address represents a separate SEAL path. 350 inner packet 351 an unencapsulated network layer protocol packet (e.g., IPv4 352 [RFC0791], OSI/CLNP [RFC0994], IPv6 [RFC2460], etc.) before any 353 outer encapsulations are added. Internet protocol numbers that 354 identify inner packets are found in the IANA Internet Protocol 355 registry [RFC3232]. SEAL protocol packets that incur an 356 additional layer of SEAL encapsulation are also considered inner 357 packets. 359 outer IP packet 360 a packet resulting from adding an outer IP header (and possibly 361 other outer headers) to a SEAL-encapsulated inner packet. 363 packet-in-error 364 the leading portion of an invoking data packet encapsulated in the 365 body of an error control message (e.g., an ICMPv4 [RFC0792] error 366 message, an ICMPv6 [RFC4443] error message, etc.). 368 Packet Too Big (PTB) message 369 a control plane message indicating an MTU restriction (e.g., an 370 ICMPv6 "Packet Too Big" message [RFC4443], an ICMPv4 371 "Fragmentation Needed" message [RFC0792], etc.). 373 Don't Fragment (DF) bit 374 a bit that indicates whether the packet may be fragmented by the 375 network. The DF bit is explicitly included in the IPv4 header 376 [RFC0791] and may be set to '0' to allow fragmentation or '1' to 377 disallow further in-network fragmentation. 
The bit is absent from the IPv6 header [RFC2460], but implicitly set
to '1' because fragmentation can occur only at IPv6 sources.

The following abbreviations correspond to terms used within this
document and/or elsewhere in common Internetworking nomenclature:

   HLEN - the length of the SEAL header plus outer headers

   ICV - Integrity Check Vector

   MAC - Message Authentication Code

   MTU - Maximum Transmission Unit

   SCMP - the SEAL Control Message Protocol

   SDU - SCMP Destination Unreachable message

   SPP - SCMP Parameter Problem message

   SPTB - SCMP Packet Too Big message

   SEAL - Subnetwork Encapsulation and Adaptation Layer

   TE - Tunnel Endpoint (i.e., either ingress or egress)

   VET - Virtual Enterprise Traversal

3. Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. When used
in lower case (e.g., must, must not, etc.), these words MUST NOT be
interpreted as described in [RFC2119], but are rather interpreted as
they would be in common English.

4. Applicability Statement

SEAL was originally motivated by the specific case of subnetwork
abstraction for Mobile Ad hoc Networks (MANETs); however, the domain
of applicability also extends to subnetwork abstractions over
enterprise networks, ISP networks, SOHO networks, the global public
Internet itself, and any other connected network routing region.

SEAL provides a network sublayer for encapsulation of an inner
network layer packet within outer encapsulating headers.
SEAL can also be used as a sublayer within a transport layer protocol
data payload, where transport layer encapsulation is typically used
for Network Address Translator (NAT) traversal as well as operation
over subnetworks that give preferential treatment to certain "core"
Internet protocols, e.g., TCP, UDP, etc. (However, note that TCP
encapsulation may not be appropriate for all use cases, particularly
those that require low delay and/or delay variance.) The SEAL header
is processed in a similar manner as for IPv6 extension headers, i.e.,
it is not part of the outer IP header but rather allows for the
creation of an arbitrarily extensible chain of headers in the same
way that IPv6 does.

To accommodate MTU diversity, the Ingress Tunnel Endpoint (ITE) may
need to perform segmentation, which the Egress Tunnel Endpoint (ETE)
must then reassemble. The ETE further acts as a passive observer
that informs the ITE of any packet size limitations. This allows the
ITE to return appropriate PMTUD feedback even if the network path
between the ITE and ETE filters ICMP messages.

SEAL further provides mechanisms to ensure message origin
authentication, packet header integrity, and anti-replay. The SEAL
framework is therefore similar to the IP Security (IPsec)
Authentication Header (AH) [RFC4301][RFC4302]; however, it provides
only minimal hop-by-hop authenticating services while leaving full
data integrity, authentication and confidentiality services as an
end-to-end consideration.

In many aspects, SEAL also very closely resembles the Generic Routing
Encapsulation (GRE) framework [RFC1701]. SEAL can therefore be
applied in the same use cases that are traditionally addressed by
GRE, but goes beyond GRE to also provide additional capabilities
(e.g., path MTU accommodation, message origin authentication, etc.)
as described in this document.

5.
SEAL Specification

The following sections specify the operation of SEAL:

5.1. SEAL Tunnel Model

SEAL is an encapsulation sublayer used within point-to-point and
non-broadcast, multiple access (NBMA) tunnels. Each SEAL path is
configured over one or more underlying interfaces attached to
subnetwork links. The SEAL tunnel connects an ITE to one or more ETE
"neighbors" via encapsulation across an underlying subnetwork, where
the tunnel neighbor relationship may be either unidirectional or
bidirectional.

A unidirectional tunnel neighbor relationship allows the near end ITE
to send data packets forward to the far end ETE, while the ETE only
returns control messages when necessary. A bidirectional tunnel
neighbor relationship is one over which both TEs can exchange both
data and control messages.

Implications of the SEAL unidirectional and bidirectional models are
the same as discussed in [I-D.templin-intarea-vet].

5.2. SEAL Model of Operation

SEAL-enabled ITEs encapsulate each inner packet in a SEAL header and
any outer header encapsulations as shown in Figure 1:

                                +--------------------+
                                ~   outer IP header  ~
                                +--------------------+
                                ~  other outer hdrs  ~
                                +--------------------+
                                ~    SEAL Header     ~
   +--------------------+       +--------------------+
   |                    |  -->  |                    |
   ~        Inner       ~  -->  ~        Inner       ~
   ~       Packet       ~  -->  ~       Packet       ~
   |                    |  -->  |                    |
   +--------------------+       +----------+---------+

                    Figure 1: SEAL Encapsulation

The ITE inserts the SEAL header according to the specific tunneling
protocol. For simple encapsulation of an inner network layer packet
within an outer IP header, the ITE inserts the SEAL header following
the outer IP header and before the inner packet as: IP/SEAL/{inner
packet}.

For encapsulations over transports such as UDP, the ITE inserts the
SEAL header following the outer transport layer header and before the
inner packet, e.g., as IP/UDP/SEAL/{inner packet}. In that case, the
UDP header is seen as an "other outer header" as depicted in Figure 1
and the outer IP and transport layer headers are together seen as the
outer encapsulation headers.

SEAL supports both "nested" tunneling and "re-encapsulating"
tunneling. Nested tunneling occurs when a first tunnel is
encapsulated within a second tunnel, which may then further be
encapsulated within additional tunnels. Nested tunneling can be
useful, and stands in contrast to "recursive" tunneling which is an
anomalous condition incurred due to misconfiguration or a routing
loop. Considerations for nested tunneling and avoiding recursive
tunneling are discussed in Section 4 of [RFC2473].

Re-encapsulating tunneling occurs when a packet arrives at a first
ETE, which then acts as an ITE to re-encapsulate and forward the
packet to a second ETE connected to the same subnetwork. In that
case each ITE/ETE transition represents a segment of a bridged path
between the ITE nearest the source and the ETE nearest the
destination. Considerations for re-encapsulating tunneling are
discussed in [I-D.templin-ironbis]. Combinations of nested and
re-encapsulating tunneling are also naturally supported by SEAL.

The SEAL ITE considers each underlying interface as the ingress
attachment point to a SEAL path to the ETE. The ITE therefore may
experience different path MTUs on different SEAL paths.

Finally, the SEAL ITE ensures that the inner network layer protocol
will see a minimum MTU of 1500 bytes over each SEAL path regardless
of the outer network layer protocol version, i.e., even if a small
amount of fragmentation and reassembly is necessary.
This is necessary to avoid path MTU "black holes" for the minimum MTU
configured by the vast majority of links in the Internet.

5.3. SEAL Header and Trailer Format

The SEAL header is formatted as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |VER|C|A|I|V|R|RES|M|   Offset  |    NEXTHDR    | LINK_ID |LEVEL|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Identification (optional)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Integrity Check Vector (optional)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ...

                    Figure 2: SEAL Header Format

VER (2)
   a 2-bit version field. This document specifies Version 0 of the
   SEAL protocol, i.e., the VER field encodes the value 0.

C (1)
   the "Control/Data" bit. Set to 1 by the ITE in SEAL Control
   Message Protocol (SCMP) control messages, and set to 0 in ordinary
   data packets.

A (1)
   the "Acknowledgement Requested" bit. Set to 1 by the ITE in SEAL
   data packets for which it wishes to receive an explicit
   acknowledgement from the ETE.

I (1)
   the "Identification Included" bit.

V (1)
   the "Integrity Check Vector included" bit.

R (1)
   the "Redirects Permitted" bit (reserved for use by VET:
   [I-D.templin-intarea-vet]).

RES (2)
   a 2-bit reserved field.

M (1)
   the "More Segments" bit. Set to 1 in a non-final segment and
   set to 0 in the final segment of the SEAL packet.

Offset (6)
   a 6-bit Offset field. Set to 0 in the first segment of a
   segmented SEAL packet. Set to an integral number of 32-byte
   blocks in subsequent segments (e.g., an Offset of 10 indicates a
   block that begins at byte offset 320 in the packet).
593 NEXTHDR (8) an 8-bit field that encodes the next header Internet 594 Protocol number the same as for the IPv4 protocol and IPv6 next 595 header fields.
597 LINK_ID (5) 598 a 5-bit link identification value, set to a unique value by the 599 ITE for each SEAL path over which it will send encapsulated 600 packets to the ETE (up to 32 SEAL paths per ETE are therefore 601 supported). Note that, if the ITE's interface connection to the 602 underlying link assigns multiple IP addresses, each address 603 represents a separate SEAL path that must be assigned a separate 604 LINK_ID.
606 LEVEL (3) 607 a 3-bit nesting level; used to limit the number of tunnel nesting 608 levels. Set to an integer value up to 7 in the innermost SEAL 609 encapsulation, and decremented by 1 for each successive additional 610 SEAL encapsulation nesting level. Up to 8 levels of nesting are 611 therefore supported.
613 Identification (32) 614 an optional 32-bit per-packet identification field; present when 615 I==1. Set to a 32-bit value (beginning with 0) that is 616 monotonically incremented for each SEAL packet transmitted to this 617 ETE.
619 Integrity Check Vector (ICV) (variable) 620 an optional variable-length integrity check vector field; present 621 when V==1.
623 5.4. ITE Specification
625 5.4.1. Tunnel Interface MTU
627 The tunnel interface must present a constant MTU value to the inner 628 network layer as the size for admission of inner packets into the 629 interface. Since NBMA tunnel virtual interfaces may support a large 630 set of SEAL paths that accept widely varying maximum packet sizes, 631 however, a number of factors should be taken into consideration when 632 selecting a tunnel interface MTU.
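As an illustration of the header layout in Figure 2 (Section 5.3), the following Python sketch packs and unpacks the first 32-bit word of the SEAL header using the field widths given in the field list. The names SEAL_FIELDS, pack_seal_word and unpack_seal_word are hypothetical and are not defined by this document; the optional Identification and ICV fields are not handled here.

```python
import struct

# (name, width in bits), in the order the fields appear in Figure 2.
SEAL_FIELDS = (
    ("ver", 2), ("c", 1), ("a", 1), ("i", 1), ("v", 1), ("r", 1),
    ("res", 2), ("m", 1), ("offset", 6), ("nexthdr", 8),
    ("link_id", 5), ("level", 3),
)

def pack_seal_word(**values):
    """Pack the named fields (default 0) into 4 bytes, network byte order."""
    word = 0
    for name, width in SEAL_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return struct.pack("!I", word)

def unpack_seal_word(data):
    """Decode the first 4 bytes of data into a dict of field values."""
    (word,) = struct.unpack("!I", data[:4])
    fields, shift = {}, 32
    for name, width in SEAL_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

For example, a data packet carrying an inner IPv6 packet (NEXTHDR 41) with A=1, I=1, LINK_ID 3 and LEVEL 7 would be built with pack_seal_word(a=1, i=1, nexthdr=41, link_id=3, level=7).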
634 Due to the ubiquitous deployment of standard Ethernet and similar 635 networking gear, the nominal Internet cell size has become 1500 636 bytes; this is the de facto size that end systems have come to expect 637 will either be delivered by the network without loss due to an MTU 638 restriction on the path or will elicit a suitable ICMP Packet Too Big (PTB) 639 message. When large packets sent by end systems incur 640 additional encapsulation at an ITE, however, they may be dropped 641 silently within the tunnel since the network may not always deliver 642 the necessary PTBs [RFC2923]. The ITE SHOULD therefore set a tunnel 643 interface MTU of at least 1500 bytes. 645 The inner network layer protocol consults the tunnel interface MTU 646 when admitting a packet into the interface. For non-SEAL inner IPv4 647 packets with the IPv4 Don't Fragment (DF) bit cleared (i.e., DF==0), if 648 the packet is larger than the tunnel interface MTU, the inner IPv4 649 layer uses IPv4 fragmentation to break the packet into fragments no 650 larger than the tunnel interface MTU. The ITE then admits each 651 fragment into the interface as an independent packet. 653 For all other inner packets, the inner network layer admits the 654 packet if it is no larger than the tunnel interface MTU; otherwise, 655 it drops the packet and sends a PTB error message to the source with 656 the MTU value set to the tunnel interface MTU. The message contains 657 as much of the invoking packet as possible without the entire message 658 exceeding the network layer minimum MTU size. 660 The ITE can alternatively set an indefinite MTU on the tunnel 661 interface such that all inner packets are admitted into the interface 662 regardless of their size.
For ITEs that host applications that use 663 the tunnel interface directly, this option must be carefully 664 coordinated with protocol stack upper layers since some upper layer 665 protocols (e.g., TCP) derive their packet sizing parameters from the 666 MTU of the outgoing interface and as such may select too large an 667 initial size. This is not a problem for upper layers that use 668 conservative initial maximum segment size estimates and/or when the 669 tunnel interface can reduce the upper layer's maximum segment size, 670 e.g., by reducing the size advertised in the MSS option of outgoing 671 TCP messages (sometimes known as "MSS clamping"). 673 In light of the above considerations, the ITE should configure an 674 indefinite MTU on tunnel *router* interfaces so that SEAL performs 675 all subnetwork adaptation from within the interface as specified in 676 Section 5.4.3. The ITE can instead set a smaller MTU on tunnel 677 *host* interfaces (e.g., the smallest MTU among all of the underlying 678 links minus the size of the encapsulation headers) but SHOULD NOT set 679 an MTU smaller than 1500 bytes. 681 5.4.2. Tunnel Neighbor Soft State 683 The tunnel virtual interface maintains a number of soft state 684 variables for each ETE and for each SEAL path. 686 When per-packet identification is required, the ITE maintains a per 687 ETE window of Identification values for the packets it has recently 688 sent to this ETE. The ITE then sets a variable "USE_ID" to TRUE, and 689 includes an Identification in each packet it sends to this ETE; 690 otherwise, it sets USE_ID to FALSE. 692 When message origin authentication and integrity checking is 693 required, the ITE also includes an ICV in the packets it sends to the 694 ETE. The ICV format is shown in Figure 3:
696    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
697    |F|Key|Algorithm|       Message Authentication Code (MAC)       |
698    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ...
700 Figure 3: Integrity Check Vector (ICV) Format 702 As shown in the figure, the ICV begins with a 1-octet control field 703 with a 1-bit (F)lag, a 2-bit Key identifier and a 5-bit Algorithm 704 identifier. The control octet is followed by a variable-length 705 Message Authentication Code (MAC). The ITE maintains a per ETE 706 algorithm and secret key to calculate the MAC in each packet it will 707 send to this ETE. (By default, the ITE sets the F bit and Algorithm 708 fields to 0 to indicate use of the HMAC-SHA-1 algorithm with a 160 709 bit shared secret key to calculate an 80 bit MAC per [RFC2104] over 710 the leading 128 bytes of the packet. Other values for F and 711 Algorithm are out of scope.) The ITE then sets a variable "USE_ICV" 712 to TRUE, and includes an ICV in each packet it sends to this ETE; 713 otherwise, it sets USE_ICV to FALSE. 715 For each SEAL path, the ITE must also account for encapsulation 716 header lengths. The ITE therefore maintains the per SEAL path 717 constant values "SHLEN" set to the length of the SEAL header, "THLEN" 718 set to the length of the outer encapsulating transport layer headers 719 (or 0 if outer transport layer encapsulation is not used), "IHLEN" 720 set to the length of the outer IP layer header, and "HLEN" set to 721 (SHLEN+THLEN+IHLEN). (The ITE must include the length of the 722 uncompressed headers even if header compression is enabled when 723 calculating these lengths.) In addition, the ITE maintains a per 724 SEAL path variable "MAXMTU" initialized to the maximum of 1500 bytes 725 and the MTU of the underlying link minus HLEN. (Thereafter, the ITE 726 must not reduce MAXMTU to a value smaller than 1500 bytes.) 728 The ITE further sets a variable 'MINMTU' to the minimum MTU for the 729 SEAL path over which encapsulated packets will travel. 
For IPv6 730 paths the ITE sets MINMTU=1280 (see: [RFC2460]) and for IPv4 paths 731 the ITE sets MINMTU=576 even though the true MINMTU for IPv4 is only 732 68 bytes (see: [RFC0791]). 734 The ITE can also set MINMTU to a larger value if there is reason to 735 believe that the minimum path MTU is larger, or to a smaller value if 736 there is reason to believe there may be additional encapsulations on 737 the path. If this value proves too large, the ITE will receive PTB 738 message feedback either from the ETE or from a router on the path and 739 will be able to reduce its MINMTU to a smaller value. 741 The ITE may instead maintain the packet sizing variables and 742 constants as per ETE (rather than per SEAL path) values. In that 743 case, the values reflect the lowest-common-denominator size across 744 all of the SEAL paths associated with this ETE. 746 5.4.3. SEAL Layer Pre-Processing 748 The SEAL layer is logically positioned between the inner and outer 749 network protocol layers, where the inner layer is seen as the (true) 750 network layer and the outer layer is seen as the (virtual) data link 751 layer. Each packet to be processed by the SEAL layer is either 752 admitted into the tunnel interface by the inner network layer 753 protocol as described in Section 5.4.1 or is undergoing re- 754 encapsulation from within the tunnel interface. The SEAL layer sees 755 the former class of packets as inner packets that include inner 756 network and transport layer headers, and sees the latter class of 757 packets as transitional SEAL packets that include the outer and SEAL 758 layer headers that were inserted by the previous hop SEAL ITE. For 759 these transitional packets, the SEAL layer re-encapsulates the packet 760 with new outer and SEAL layer headers when it forwards the packet to 761 the next hop SEAL ITE. 763 We now discuss the SEAL layer pre-processing actions for these two 764 classes of packets. 766 5.4.3.1. 
Inner Packet Pre-Processing 768 For each inner packet admitted into the tunnel interface, if the 769 packet is itself a SEAL packet (i.e., one with the port number for 770 SEAL in the transport layer header or one with the protocol number 771 for SEAL in the IP layer header) and the LEVEL field of the SEAL 772 header contains the value 0, the ITE silently discards the packet. 774 Otherwise, for non-SEAL IPv4 inner packets with DF==0 in the IP 775 header and IPv6 inner packets with a fragment header and with (MF=0; 776 Offset=0), if the packet is larger than (MINMTU-HLEN) the ITE uses IP 777 fragmentation to fragment the packet into N roughly equal-length 778 pieces, where N is minimized and each fragment is significantly 779 smaller than (MINMTU-HLEN) to allow for additional encapsulations in 780 the path. The ITE then submits each fragment for SEAL encapsulation 781 as specified in Section 5.4.4. 783 For all other inner packets, if the packet is no larger than MAXMTU 784 for the corresponding SEAL path the ITE submits it for SEAL 785 encapsulation as specified in Section 5.4.4. Otherwise, the ITE 786 drops the packet and sends an ordinary ICMP PTB message appropriate 787 to the inner protocol version with the MTU field set to MAXMTU. (For 788 IPv4 SEAL packets with DF==0, the ITE should set DF=1 and re- 789 calculate the IPv4 header checksum before generating the PTB message 790 in order to avoid bogon filters.) After sending the PTB message, the 791 ITE discards the inner packet. 793 5.4.3.2. Transitional SEAL Packet Pre-Processing 795 For each transitional packet that is to be processed by the SEAL 796 layer from within the tunnel interface, the ITE sets aside the SEAL 797 encapsulation headers that were received from the previous hop. 798 Next, if the packet is no larger than MAXMTU for the next hop SEAL 799 path the ITE submits it for SEAL encapsulation as specified in 800 Section 5.4.4. 
Otherwise, the ITE drops the packet and sends an SCMP 801 Packet Too Big (SPTB) message to the previous hop subject to rate 802 limiting (see: Section 5.6.1.1) with the MTU field set to MAXMTU. 803 After sending the SPTB message, the ITE discards the packet. 805 5.4.4. SEAL Encapsulation and Segmentation 807 For each inner packet/fragment submitted for SEAL encapsulation, the 808 ITE next encapsulates the packet in a SEAL header formatted as 809 specified in Section 5.3. The SEAL header includes an Identification 810 field when USE_ID is TRUE, followed by an ICV field when USE_ICV is 811 TRUE. 813 The ITE next sets C=0 and RES=0 in the SEAL header. The ITE also 814 sets A=1 if necessary for ETE reachability determination (see: 815 Section 5.4.6) or for stateful MTU determination (see Section 5.4.9). 816 Otherwise, the ITE sets A=0. 818 The ITE then sets LINK_ID to the value assigned to the underlying 819 SEAL path, and sets NEXTHDR to the protocol number corresponding to 820 the address family of the encapsulated inner packet. For example, 821 the ITE sets NEXTHDR to the value '4' for encapsulated IPv4 packets 822 [RFC2003], '41' for encapsulated IPv6 packets [RFC2473][RFC4213], 823 '80' for encapsulated OSI/CLNP packets [RFC1070], etc. 825 Next, if the inner packet is not itself a SEAL packet the ITE sets 826 LEVEL to an integer value between 0 and 7 as a specification of the 827 number of additional layers of nested SEAL encapsulations permitted. 828 If the inner packet is a SEAL packet that is undergoing nested 829 encapsulation, the ITE instead sets LEVEL to the value that appears 830 in the inner packet's SEAL header minus 1. If the inner packet is 831 undergoing SEAL re-encapsulation, the ITE instead copies the LEVEL 832 value from the SEAL header of the packet to be re-encapsulated. 834 Next, if the inner packet is no larger than (MINMTU-HLEN) or larger 835 than 1500, the ITE sets (M=0; Offset=0). 
Otherwise, the ITE breaks 836 the inner packet into N roughly equal-length non-overlapping 837 segments (where N is minimized and each segment is significantly 838 smaller than (MINMTU-HLEN) to allow for additional encapsulations in 839 the path) then prepends a clone of the SEAL header from the first 840 segment onto the head of each additional segment. The ITE then sets 841 (M=1; Offset=0) in the first segment, sets (M=0/1; Offset=i) in the 842 second segment, sets (M=0/1; Offset=j) in the third segment (if 843 needed), etc., then finally sets (M=0; Offset=k) in the final segment 844 (where i, j, k, etc. are the number of 32-byte blocks that preceded 845 this segment). 847 When USE_ID is FALSE, the ITE next sets I=0. Otherwise, the ITE sets 848 I=1 and writes a monotonically-incrementing integer value for this 849 ETE in the Identification field beginning with 0 in the first packet 850 transmitted. (For SEAL packets that have been split into multiple 851 pieces, the ITE writes the same Identification value in each piece.) 852 The monotonically-incrementing requirement is to satisfy ETEs that 853 use this value for anti-replay purposes. The value is incremented 854 modulo 2^32, i.e., it wraps back to 0 when the previous value was 855 (2^32 - 1). 857 When USE_ICV is FALSE, the ITE next sets V=0. Otherwise, the ITE 858 sets V=1, includes an ICV and calculates the MAC using HMAC-SHA-1 859 with a 160 bit secret key and 80 bit MAC field. Beginning with the 860 SEAL header, the ITE sets the ICV field to 0, calculates the MAC over 861 the leading 128 bytes of the packet (or up to the end of the packet 862 if there are fewer than 128 bytes) and places the result in the MAC 863 field. (For SEAL packets that have been split into multiple pieces, 864 the ITE calculates a separate MAC for each piece.)
The ITE then writes the value 0 865 in the F flag and 0x00 in the Algorithm field of the ICV control 866 octet (other values for these fields, and other MAC calculation 867 disciplines, are outside the scope of this document and may be 868 specified in future documents.) 870 The ITE then adds the outer encapsulating headers as specified in 871 Section 5.4.5. 873 5.4.5. Outer Encapsulation 875 Following SEAL encapsulation, the ITE next encapsulates each segment 876 in the requisite outer transport (when necessary) and IP layer 877 headers. When a transport layer header such as UDP or TCP is 878 included, the ITE writes the port number for SEAL in the transport 879 destination service port field. 881 When UDP encapsulation is used, the ITE sets the UDP checksum field 882 to zero for IPv4 packets and also sets the UDP checksum field to zero 883 for IPv6 packets even though IPv6 generally requires UDP checksums. 884 Further considerations for setting the UDP checksum field for IPv6 885 packets are discussed in 886 [I-D.ietf-6man-udpzero][I-D.ietf-6man-udpchecksums]. 888 The ITE then sets the outer IP layer headers the same as specified 889 for ordinary IP encapsulation (e.g., [RFC1070][RFC2003], [RFC2473], 890 [RFC4213], etc.) except that for ordinary SEAL packets the ITE copies 891 the "TTL/Hop Limit", "Type of Service/Traffic Class" and "Congestion 892 Experienced" values in the inner network layer header into the 893 corresponding fields in the outer IP header. For transitional SEAL 894 packets undergoing re-encapsulation, the ITE instead copies the "TTL/ 895 Hop Limit", "Type of Service/Traffic Class" and "Congestion 896 Experienced" values in the outer IP header of the received packet 897 into the corresponding fields in the outer IP header of the packet to 898 be forwarded (i.e., the values are transferred between outer headers 899 and *not* copied from the inner network layer header). 
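The default ICV calculation described in Section 5.4.4 can be sketched in Python as follows. This is a minimal illustration, not a normative procedure: the function name seal_icv and its parameters are hypothetical, the packet buffer is assumed to begin at the SEAL header, and the whole 11-byte ICV (control octet plus 80-bit MAC) is assumed to be zeroed while the MAC is computed. The value written in the 2-bit Key field is also an assumption, since the text above does not say how it is chosen.

```python
import hashlib
import hmac

MAC_LEN = 10          # 80-bit MAC, i.e., a truncated HMAC-SHA-1 output
ICV_LEN = 1 + MAC_LEN # one control octet followed by the MAC
COVERAGE = 128        # the MAC covers at most the leading 128 bytes

def seal_icv(packet, icv_offset, key, key_id=0):
    """Return the 11-byte ICV for a packet that begins at the SEAL header.

    icv_offset is where the ICV sits inside the packet (8 when the
    Identification field is present, 4 when it is not).
    """
    # Zero the ICV field before computing the MAC.
    zeroed = packet[:icv_offset] + bytes(ICV_LEN) + packet[icv_offset + ICV_LEN:]
    mac = hmac.new(key, zeroed[:COVERAGE], hashlib.sha1).digest()[:MAC_LEN]
    # Control octet: F=0 (1 bit), Key (2 bits), Algorithm=0 (5 bits).
    control = ((key_id & 0x3) << 5)
    return bytes([control]) + mac
```

A receiver can verify a packet by calling the same function on the received buffer and comparing the result with the ICV it carried, since the field is zeroed again before the MAC is recomputed.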
901 The ITE also sets the IP protocol number to the appropriate value for 902 the first protocol layer within the encapsulation (e.g., UDP, TCP, 903 SEAL, etc.). When IPv6 is used as the outer IP protocol, the ITE 904 then sets the flow label value in the outer IPv6 header the same as 905 described in [RFC6438]. When IPv4 is used as the outer IP protocol, 906 the ITE instead sets DF=0 in the IPv4 header to allow the packet to 907 be fragmented if it encounters a restricting link (for IPv6 SEAL 908 paths, the DF bit is implicitly set to 1). 910 The ITE finally sends each outer packet via the underlying link 911 corresponding to LINK_ID. 913 5.4.6. Path Probing and ETE Reachability Verification 915 All SEAL data packets sent by the ITE are considered implicit probes. 916 SEAL data packets will elicit an SCMP message from the ETE if it 917 needs to acknowledge a probe and/or report an error condition. SEAL 918 data packets may also be dropped by either the ETE or a router on the 919 path, which may or may not result in an ICMP message being returned 920 to the ITE. 922 The ITE processes ICMP messages as specified in Section 5.4.7. 924 The ITE processes SCMP messages as specified in Section 5.6.2. 926 5.4.7. Processing ICMP Messages 928 When the ITE sends SEAL packets, it may receive ICMP error messages 929 [RFC0792][RFC4443] from an ordinary router within the subnetwork. 930 Each ICMP message includes an outer IP header, followed by an ICMP 931 header, followed by a portion of the SEAL data packet that generated 932 the error (also known as the "packet-in-error") beginning with the 933 outer IP header. 935 The ITE should process ICMPv4 Protocol Unreachable messages and 936 ICMPv6 Parameter Problem messages with Code "Unrecognized Next Header 937 type encountered" as a hint that the IP destination address does not 938 implement SEAL. 
The ITE can optionally ignore ICMP messages that do 939 not include sufficient information in the packet-in-error, or process 940 them as a hint that the SEAL path may be failing. 942 For other ICMP messages, the ITE should use any outer header 943 information available as a first-pass authentication filter (e.g., to 944 determine if the source of the message is within the same 945 administrative domain as the ITE) and discards the message if first 946 pass filtering fails. 948 Next, the ITE examines the packet-in-error beginning with the SEAL 949 header. If the value in the Identification field (if present) is not 950 within the window of packets the ITE has recently sent to this ETE, 951 or if the MAC value in the SEAL header ICV field (if present) is 952 incorrect, the ITE discards the message. 954 Next, if the received ICMP message is a PTB the ITE sets the 955 temporary variable "PMTU" for this SEAL path to the MTU value in the 956 PTB message. If PMTU==0, the ITE consults a plateau table (e.g., as 957 described in [RFC1191]) to determine PMTU based on the length field 958 in the outer IP header of the packet-in-error. For example, if the 959 ITE receives a PTB message with MTU==0 and length 4KB, it can set 960 PMTU=2KB. If the ITE subsequently receives a PTB message with MTU==0 961 and length 2KB, it can set PMTU=1792, etc. to a minimum value of 962 PMTU=(1500+HLEN). If the ITE is performing stateful MTU 963 determination for this SEAL path (see Section 5.4.9), the ITE next 964 sets MAXMTU=MAX((PMTU-HLEN), 1500). 966 If the ICMP message was not discarded, the ITE then transcribes it 967 into a message to return to the previous hop. If the inner packet 968 was a SEAL data packet, the ITE transcribes the ICMP message into an 969 SCMP message. Otherwise, the ITE transcribes the ICMP message into a 970 message appropriate for the inner protocol version. 
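The PTB handling described above (taking PMTU from the message, falling back to a plateau table when MTU==0, then deriving MAXMTU) can be sketched in Python as follows. The plateau values are only the illustrative sizes used in the example in the text (4KB, 2KB, 1792), not a complete table, and the function names and the HLEN value used in the usage note are hypothetical.

```python
# Illustrative plateau sizes taken from the example in the text; a real
# implementation would use a fuller table such as the one in [RFC1191].
PLATEAUS = (4096, 2048, 1792)

def pmtu_from_ptb(mtu, length, hlen):
    """Return PMTU for a received PTB message.

    mtu is the MTU field of the PTB, length is the length field of the
    outer IP header in the packet-in-error, hlen is HLEN for this path.
    """
    if mtu != 0:
        return mtu
    floor = 1500 + hlen  # minimum PMTU chosen from the plateau table
    lower = [p for p in PLATEAUS if p < length]
    return max(max(lower, default=floor), floor)

def maxmtu_from_ptb(mtu, length, hlen):
    """Return MAXMTU=MAX((PMTU-HLEN), 1500) for stateful MTU determination."""
    return max(pmtu_from_ptb(mtu, length, hlen) - hlen, 1500)
```

With an assumed HLEN of 48 bytes, a PTB with MTU==0 for a 4096-byte packet gives PMTU=2048 and MAXMTU=2000, and a following PTB with MTU==0 for a 2048-byte packet gives PMTU=1792 and MAXMTU=1744.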
972 To transcribe the message, the ITE extracts the inner packet from 973 within the ICMP message packet-in-error field and uses it to generate 974 a new message corresponding to the type of the received ICMP message. 975 For SCMP messages, the ITE generates the message the same as 976 described for ETE generation of SCMP messages in Section 5.6.1. For 977 (S)PTB messages, the ITE writes (PMTU-HLEN) in the MTU field. 979 The ITE finally forwards the transcribed message to the previous hop 980 toward the inner source address. 982 5.4.8. IPv4 Middlebox Reassembly Testing 984 The ITE can perform a qualification exchange to ensure that the 985 subnetwork correctly delivers fragments to the ETE. This procedure 986 can be used, e.g., to determine whether there are middleboxes on the 987 path that violate the [RFC1812], Section 5.2.6 requirement that: "A 988 router MUST NOT reassemble any datagram before forwarding it". 990 The ITE should use knowledge of its topological arrangement as an aid 991 in determining when middlebox reassembly testing is necessary. For 992 example, if the ITE is aware that the ETE is located somewhere in the 993 public Internet, middlebox reassembly testing should not be 994 necessary. If the ITE is aware that the ETE is located behind a NAT 995 or a firewall, however, then reassembly testing can be used to detect 996 middleboxes that do not conform to specifications. 998 The ITE can perform a middlebox reassembly test by selecting a data 999 packet to be used as a probe. While performing the test with real 1000 data packets, the ITE should select only inner packets that are no 1001 larger than (1500-HLEN) bytes for testing purposes. The ITE can also 1002 construct a dummy probe packet instead of using ordinary SEAL data 1003 packets. 
1005 To generate a dummy probe packet, the ITE creates a packet buffer 1006 beginning with the same outer headers, SEAL header and inner network 1007 layer header that would appear in an ordinary data packet, then pads 1008 the packet with random data to a length that is at least 128 bytes 1009 but no longer than (1500-HLEN) bytes. The ITE then writes the value 1010 '0' in the inner network layer TTL (for IPv4) or Hop Limit (for IPv6) 1011 field. 1013 The ITE then sets C=0 in the SEAL header of the probe packet and sets 1014 the NEXTHDR field to the inner network layer protocol type. (The ITE 1015 may also set A=1 if it requires a positive acknowledgement; 1016 otherwise, it sets A=0.) Next, the ITE sets LINK_ID and LEVEL to the 1017 appropriate values for this SEAL path, sets Identification and I=1 1018 (when USE_ID is TRUE), then finally calculates the ICV and sets V=1 1019 (when USE_ICV is TRUE). 1021 The ITE then encapsulates the probe packet in the appropriate outer 1022 headers, splits it into two outer IPv4 fragments, then sends both 1023 fragments over the same SEAL path. 1025 The ITE should send a series of probe packets (e.g., 3-5 probes with 1026 1sec intervals between tests) instead of a single isolated probe in 1027 case of packet loss. If the ETE returns an SCMP PTB message with MTU 1028 != 0, then the SEAL path correctly supports fragmentation; otherwise, 1029 the ITE enables stateful MTU determination for this SEAL path as 1030 specified in Section 5.4.9. 1032 (Examples of middleboxes that may perform reassembly include stateful 1033 NATs and firewalls. Such devices could still allow for stateless MTU 1034 determination if they gather the fragments of a fragmented IPv4 SEAL 1035 data packet for packet analysis purposes but then forward the 1036 fragments on to the final destination rather than forwarding the 1037 reassembled packet.) 1039 5.4.9. 
Stateful MTU Determination 1041 SEAL supports a stateless MTU determination capability, however the 1042 ITE may in some instances wish to impose a stateful MTU limit on a 1043 particular SEAL path. For example, when the ETE is situated behind a 1044 middlebox that performs IPv4 reassembly (see: Section 5.4.8) it is 1045 imperative that fragmentation be avoided. In other instances (e.g., 1046 when the SEAL path includes performance-constrained links), the ITE 1047 may deem it necessary to cache a conservative static MTU in order to 1048 avoid sending large packets that would only be dropped due to an MTU 1049 restriction somewhere on the path. 1051 To determine a static MTU value, the ITE sends a series of dummy 1052 probe packets of various sizes to the ETE with A=1 in the SEAL header 1053 and DF=1 in the outer IP header. The ITE then caches the size 'S' of 1054 the largest packet for which it receives a probe reply from the ETE 1055 by setting MAXMTU=MAX((S-HLEN), 1500) for this SEAL path. 1057 For example, the ITE could send probe packets of 4KB, followed by 1058 2KB, followed by 1792 bytes, etc. While probing, the ITE processes 1059 any ICMP PTB message it receives as a potential indication of probe 1060 failure then discards the message. 1062 5.4.10. Detecting Path MTU Changes 1064 When stateful MTU determination is used, the ITE SHOULD periodically 1065 reset MAXMTU and/or re-probe the path to determine whether MAXMTU has 1066 increased. If the path still has a too-small MTU, the ITE will 1067 receive a PTB message that reports a smaller size. 1069 5.5. ETE Specification 1071 5.5.1. Minimum Reassembly Buffer Requirements 1073 For IPv6, the ETE must configure a minimum reassembly buffer size of 1074 (1500 + HLEN) bytes for the reassembly of outer IPv6 packets, i.e., 1075 even though the true minimum reassembly size for IPv6 is only 1500 1076 bytes [RFC2460]. 
For IPv4, the ETE must also configure a minimum 1077 reassembly buffer size of (1500 + HLEN) bytes for the reassembly of 1078 outer IPv4 packets, i.e., even though the true minimum reassembly 1079 size for IPv4 is only 576 bytes [RFC1122]. 1081 In addition to this outer reassembly buffer requirement, the ETE must 1082 further configure a minimum SEAL reassembly buffer size of (1500 + 1083 HLEN) bytes for the reassembly of segmented SEAL packets (see: 1084 Section 5.5.4). 1086 5.5.2. Tunnel Neighbor Soft State 1088 When message origin authentication and integrity checking is 1089 required, the ETE maintains a per-ITE MAC calculation algorithm and a 1090 symmetric secret key to verify the MAC. When per-packet 1091 identification is required, the ETE also maintains a window of 1092 Identification values for the packets it has recently received from 1093 this ITE. 1095 When the tunnel neighbor relationship is bidirectional, the ETE 1096 further maintains a per SEAL path mapping of outer IP and transport 1097 layer addresses to the LINK_ID that appears in packets received from 1098 the ITE. 1100 5.5.3. IP-Layer Reassembly 1102 The ETE reassembles fragmented IP packets that are explicitly 1103 addressed to itself. For IP fragments that are received via a SEAL 1104 tunnel, the ETE SHOULD maintain conservative reassembly cache high- 1105 and low-water marks. When the size of the reassembly cache exceeds 1106 the high-water mark, the ETE SHOULD actively discard stale 1107 incomplete reassemblies (e.g., using an Active Queue Management (AQM) 1108 strategy) until the size falls below the low-water mark. The ETE 1109 SHOULD also actively discard any pending reassemblies that clearly 1110 have no opportunity for completion, e.g., when a considerable number 1111 of new fragments have arrived before a fragment that completes a 1112 pending reassembly arrives.
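The high- and low-water mark behavior described above can be sketched in Python as follows. This is one possible discard policy under stated assumptions, not the AQM strategy the document has in mind: "stale" is taken to mean "oldest pending reassembly first", sizes are counted in bytes, and the class and method names are hypothetical.

```python
from collections import OrderedDict

class ReassemblyCache:
    """Pending reassemblies bounded by high- and low-water marks (bytes)."""

    def __init__(self, high_water, low_water):
        self.high_water = high_water
        self.low_water = low_water
        self.pending = OrderedDict()  # key -> list of fragments, oldest first
        self.size = 0                 # total bytes currently cached

    def add_fragment(self, key, fragment):
        """Cache a fragment; return the keys of any reassemblies discarded."""
        self.pending.setdefault(key, []).append(fragment)
        self.size += len(fragment)
        evicted = []
        if self.size > self.high_water:
            # Discard oldest reassemblies until below the low-water mark.
            while self.pending and self.size >= self.low_water:
                old_key, fragments = self.pending.popitem(last=False)
                self.size -= sum(len(f) for f in fragments)
                evicted.append(old_key)
        return evicted

    def remove(self, key):
        """Remove and return the fragments of a completed reassembly."""
        fragments = self.pending.pop(key, [])
        self.size -= sum(len(f) for f in fragments)
        return fragments
```

The key would typically combine the outer source address, destination address, protocol and IP Identification; that choice is left to the caller in this sketch.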
1114 The ETE processes non-SEAL IP packets as specified in the normative 1115 references, i.e., it performs any necessary IP reassembly then 1116 discards the packet if it is larger than the reassembly buffer size 1117 or delivers the (fully-reassembled) packet to the appropriate upper 1118 layer protocol module. 1120 For SEAL packets, the ETE performs any necessary IP reassembly then 1121 submits the packet for SEAL decapsulation as specified in Section 1122 5.5.4. (Note that if the packet is larger than the reassembly buffer 1123 size, the ETE still examines the leading portion of the (partially) 1124 reassembled packet during decapsulation as specified in the next 1125 section.) 1127 5.5.4. Decapsulation and Re-Encapsulation 1129 For each SEAL packet accepted for decapsulation, when I==1 the ETE 1130 first examines the Identification field. If the Identification is 1131 not within the window of acceptable values for this ITE, the ETE 1132 silently discards the packet. 1134 Next, if V==1 the ETE SHOULD verify the MAC value (with the MAC field 1135 itself reset to 0) and silently discard the packet if the value is 1136 incorrect. 1138 Next, if the packet arrived as multiple IP fragments, the ETE sends 1139 an SPTB message back to the ITE with MTU set to the size of the 1140 largest fragment received minus HLEN (see: Section 5.6.1.1). 1142 Next, if the packet arrived as multiple IP fragments and the inner 1143 packet is larger than 1500 bytes, the ETE silently discards the 1144 packet; otherwise, it continues to process the packet. 1146 Next, if there is an incorrect value in a SEAL header field (e.g., an 1147 incorrect "VER" field value), the ETE discards the packet. If the 1148 SEAL header has C==0, the ETE also returns an SCMP "Parameter 1149 Problem" (SPP) message (see Section 5.6.1.2). 1151 Next, if the SEAL header has C==1, the ETE processes the packet as an 1152 SCMP packet as specified in Section 5.6.2. 
Otherwise, the ETE 1153 continues to process the packet as a SEAL data packet. 1155 Next, if the SEAL header has (M==1 || Offset!=0) the ETE checks to 1156 see if the other segments of this already-segmented SEAL packet have 1157 arrived, i.e., by looking for additional segments that have the same 1158 outer IP source address, destination address, source transport port 1159 number (if present) and SEAL Identification value. If the other 1160 segments have already arrived, the ETE discards the SEAL header and 1161 other outer headers from the non-initial segments and appends the remaining segment data 1162 onto the end of the first segment according to their offset value. 1163 Otherwise, the ETE caches the segment for at most 60 seconds while 1164 awaiting the arrival of its partners. During this process, the ETE 1165 discards any segments that are overlapping with respect to segments 1166 that have already been received. The ETE further SHOULD manage the 1167 SEAL reassembly cache the same as described for the IP-Layer 1168 Reassembly cache in Section 5.5.3, i.e., it SHOULD perform an early 1169 discard for any pending reassemblies that have low probability of 1170 completion. 1172 Next, if the SEAL header in the (reassembled) packet has A==1, the 1173 ETE sends an SPTB message back to the ITE with MTU=0 (see: Section 1174 5.6.1.1). 1176 Finally, the ETE discards the outer headers and processes the inner 1177 packet according to the header type indicated in the SEAL NEXTHDR 1178 field. If the inner (TTL / Hop Limit) field encodes the value 0, the 1179 ETE silently discards the packet. Otherwise, if the next hop toward 1180 the inner destination address is via a different interface than the 1181 SEAL packet arrived on, the ETE discards the SEAL header and delivers 1182 the inner packet either to the local host or to the next hop 1183 interface if the packet is not destined to the local host.
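The SEAL segment reassembly step described earlier in this section (for packets that arrive with M==1 or a nonzero Offset) can be sketched in Python as follows. The function name and the (more, offset, payload) tuple layout are hypothetical; each payload is assumed to be the segment data with the SEAL and outer headers already removed, all segments are assumed to belong to the same packet, and the 60 second timer is not modeled.

```python
BLOCK = 32  # the Offset field counts 32-byte blocks

def reassemble_segments(segments):
    """Reassemble SEAL segments given as (more, offset, payload) tuples.

    Returns the reassembled inner packet, or None if segments are missing.
    """
    placed = {}       # starting byte offset -> payload
    final_end = None  # end offset of the segment that carried M=0
    for more, offset, payload in segments:
        start = offset * BLOCK
        end = start + len(payload)
        # Discard any segment that overlaps one already received.
        if any(start < s + len(p) and s < end for s, p in placed.items()):
            continue
        placed[start] = payload
        if not more:
            final_end = end
    if final_end is None:
        return None  # final segment has not arrived yet
    packet = b""
    for start in sorted(placed):
        if start != len(packet):
            return None  # a gap remains; keep waiting
        packet += placed[start]
    return packet if len(packet) == final_end else None
```

Segments may be passed in any arrival order; a None result corresponds to the case where the ETE continues to cache the segments it holds.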
1185 If the next hop is on the same interface the SEAL packet arrived on, 1186 however, the ETE submits the packet for SEAL re-encapsulation 1187 beginning with the specification in Section 5.4.3 above and without 1188 decrementing the value in the inner (TTL / Hop Limit) field. In this 1189 process, the packet remains within the tunnel (i.e., it does not exit 1190 and then re-enter the tunnel); hence, the packet is not discarded if 1191 the LEVEL field in the SEAL header contains the value 0. 1193 5.6. The SEAL Control Message Protocol (SCMP) 1195 SEAL provides a companion SEAL Control Message Protocol (SCMP) that 1196 uses the same message types and formats as for the Internet Control 1197 Message Protocol for IPv6 (ICMPv6) [RFC4443]. As for ICMPv6, each 1198 SCMP message includes a 32-bit header and a variable-length body. 1199 The ITE encapsulates the SCMP message in a SEAL header and outer 1200 headers as shown in Figure 4:
1202                                    +--------------------+
1203                                    ~   outer IP header  ~
1204                                    +--------------------+
1205                                    ~  other outer hdrs  ~
1206                                    +--------------------+
1207                                    ~    SEAL Header     ~
1208   +--------------------+           +--------------------+
1209   | SCMP message header|    -->    | SCMP message header|
1210   +--------------------+           +--------------------+
1211   |                    |    -->    |                    |
1212   ~  SCMP message body ~    -->    ~  SCMP message body ~
1213   |                    |    -->    |                    |
1214   +--------------------+           +--------------------+
1216        SCMP Message                     SCMP Packet
1217    before encapsulation             after encapsulation
1219 Figure 4: SCMP Message Encapsulation
1221 The following sections specify the generation, processing and 1222 relaying of SCMP messages. 1224 5.6.1.
Generating SCMP Error Messages 1226 ETEs generate SCMP error messages in response to receiving certain 1227 SEAL data packets using the format shown in Figure 5: 1229 0 1 2 3 1230 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1231 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1232 | Type | Code | Checksum | 1233 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1234 | Type-Specific Data | 1235 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1236 | As much of the invoking SEAL data packet as possible | 1237 ~ (beginning with the SEAL header) without the SCMP ~ 1238 | packet exceeding MINMTU bytes (*) | 1240 (*) also known as the "packet-in-error" 1242 Figure 5: SCMP Error Message Format 1244 The error message includes the 32-bit SCMP message header, followed 1245 by a 32-bit Type-Specific Data field, followed by the leading portion 1246 of the invoking SEAL data packet beginning with the SEAL header as 1247 the "packet-in-error". The packet-in-error includes as much of the 1248 invoking packet as possible extending to a length that would not 1249 cause the entire SCMP packet following outer encapsulation to exceed 1250 MINMTU bytes. 1252 When the ETE processes a SEAL data packet for which the 1253 Identification and ICV values are correct but an error must be 1254 returned, it prepares an SCMP error message as shown in Figure 5. 1255 The ETE sets the Type and Code fields to the same values that would 1256 appear in the corresponding ICMPv6 message [RFC4443], but calculates 1257 the Checksum beginning with the SCMP message header using the 1258 algorithm specified for ICMPv4 in [RFC0792]. 1260 The ETE next encapsulates the SCMP message in the requisite SEAL and 1261 outer headers as shown in Figure 4. 
During encapsulation, the ETE 1262 sets the outer destination address/port numbers of the SCMP packet to 1263 the values associated with the ITE and sets the outer source address/ 1264 port numbers to its own outer address/port numbers. 1266 The ETE then sets (C=1; A=0; RES=0; M=0; Offset=0) in the SEAL 1267 header, then sets I, V, NEXTHDR and LEVEL to the same values that 1268 appeared in the SEAL header of the data packet. If the neighbor 1269 relationship between the ITE and ETE is unidirectional, the ETE next 1270 sets the LINK_ID field to the same value that appeared in the SEAL 1271 header of the data packet. Otherwise, the ETE sets the LINK_ID field 1272 to the value it would use in sending a SEAL packet to this ITE. 1274 When I==1, the ETE next sets the Identification field to an 1275 appropriate value for the ITE. If the neighbor relationship between 1276 the ITE and ETE is unidirectional, the ETE sets the Identification 1277 field to the same value that appeared in the SEAL header of the data 1278 packet. Otherwise, the ETE sets the Identification field to the 1279 value it would use in sending the next SEAL packet to this ITE. 1281 When V==1, the ETE then prepares the ICV field the same as specified 1282 for SEAL data packet encapsulation in Section 5.4.4. 1284 Finally, the ETE sends the resulting SCMP packet to the ITE the same 1285 as specified for SEAL data packets in Section 5.4.5. 1287 The following sections describe additional considerations for various 1288 SCMP error messages: 1290 5.6.1.1. Generating SCMP Packet Too Big (SPTB) Messages 1292 An ETE generates an SPTB message when it receives a SEAL data packet 1293 that arrived as multiple outer IP fragments. The ETE prepares the 1294 SPTB message the same as for the corresponding ICMPv6 PTB message, 1295 and writes the length of the largest outer IP fragment received minus 1296 HLEN in the MTU field of the message. 
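As a non-normative illustration of the SCMP error message format in Figure 5 and the SPTB MTU rule above, the following sketch builds the message header, truncates the packet-in-error, and computes the RFC 792 style checksum. The function names and the encaps_len parameter (the combined length of the SEAL header and outer headers) are assumptions made for this example; MINMTU and HLEN are passed in as parameters because their values are defined elsewhere in the document. Type 2 and Code 0 are the ICMPv6 Packet Too Big values from [RFC4443].

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum (as used by ICMPv4)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_scmp_error(msg_type, code, type_specific, invoking_packet,
                     minmtu, encaps_len):
    """Return the SCMP message (header, Type-Specific Data, packet-in-error).

    invoking_packet starts at the SEAL header of the invoking data packet.
    """
    room = minmtu - encaps_len - 8  # 8 = SCMP header + Type-Specific Data
    if room < 0:
        raise ValueError("MINMTU too small for the SCMP header")
    packet_in_error = invoking_packet[:room]
    # Checksum is computed with the Checksum field set to zero.
    zeroed = struct.pack("!BBHI", msg_type, code, 0, type_specific)
    csum = internet_checksum(zeroed + packet_in_error)
    header = struct.pack("!BBHI", msg_type, code, csum, type_specific)
    return header + packet_in_error


def build_sptb(largest_fragment_len, hlen, invoking_packet, minmtu, encaps_len):
    """SPTB: MTU field = largest outer IP fragment received minus HLEN."""
    mtu = largest_fragment_len - hlen
    return build_scmp_error(2, 0, mtu, invoking_packet, minmtu, encaps_len)
```

Passing largest_fragment_len equal to hlen yields MTU=0, which corresponds to the A==1 acknowledgment case rather than a size report.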
1298 The ETE also generates an SPTB message when it accepts a SEAL 1299 protocol data packet with A==1 in the SEAL header. The ETE prepares 1300 the SPTB message the same as above, except that it writes the value 0 1301 in the MTU field. 1303 5.6.1.2. Generating Other SCMP Error Messages 1305 An ETE generates an SCMP "Destination Unreachable" (SDU) message 1306 under the same circumstances that an IPv6 system would generate an 1307 ICMPv6 Destination Unreachable message. 1309 An ETE generates an SCMP "Parameter Problem" (SPP) message when it 1310 receives a SEAL packet with an incorrect value in the SEAL header. 1312 TEs generate other SCMP message types using methods and procedures 1313 specified in other documents. For example, SCMP message types used 1314 for tunnel neighbor coordinations are specified in VET 1315 [I-D.templin-intarea-vet]. 1317 5.6.2. Processing SCMP Error Messages 1319 An ITE may receive SCMP messages with C==1 in the SEAL header after 1320 sending packets to an ETE. The ITE first verifies that the outer 1321 addresses of the SCMP packet are correct, and (when I==1) that the 1322 Identification field contains an acceptable value. The ITE next 1323 verifies that the SEAL header fields are set correctly as specified 1324 in Section 5.6.1. When V==1, the ITE then verifies the ICV. The ITE 1325 next verifies the Checksum value in the SCMP message header. If any 1326 of these values are incorrect, the ITE silently discards the message; 1327 otherwise, it processes the message as follows: 1329 5.6.2.1. Processing SCMP PTB Messages 1331 After an ITE sends a SEAL data packet to an ETE, it may receive an 1332 SPTB message with a packet-in-error containing the leading portion of 1333 the packet (see: Section 5.6.1.1). For IP SPTB messages with MTU==0, 1334 the ITE processes the message as confirmation that the ETE received a 1335 SEAL data packet with A==1 in the SEAL header. The ITE then discards 1336 the message. 
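The receive-side checks in Section 5.6.2 and the MTU==0 case above can be sketched as follows. This is a non-normative illustration: the function name, the returned labels, and the seal_checks_passed flag (a stand-in for the outer address, Identification, SEAL header field and ICV checks, which depend on neighbor state not modeled here) are all assumptions.

```python
import struct

PTB_TYPE = 2  # ICMPv6 Packet Too Big type value [RFC4443]


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum (as used by ICMPv4)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def classify_sptb(scmp_msg: bytes, seal_checks_passed: bool):
    """Return (label, mtu) for a received SCMP message.

    Labels: "discard", "not-sptb", "probe-confirmed", "size-limit".
    """
    # Any failed check results in a silent discard.
    if not seal_checks_passed or len(scmp_msg) < 8:
        return ("discard", None)
    # A message with a valid checksum sums to zero when verified.
    if internet_checksum(scmp_msg) != 0:
        return ("discard", None)
    msg_type, _code, _csum, mtu = struct.unpack("!BBHI", scmp_msg[:8])
    if msg_type != PTB_TYPE:
        return ("not-sptb", None)
    if mtu == 0:
        # Confirmation that the ETE received a data packet with A==1.
        return ("probe-confirmed", 0)
    return ("size-limit", mtu)
```

A caller would discard the message after handling the "probe-confirmed" case and continue with the packet size limitation procedure for "size-limit".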
1338 For SPTB messages with MTU != 0, the ITE processes the message as an 1339 indication of a packet size limitation as follows. If the inner 1340 packet is no larger than 1500 bytes, the ITE reduces its MINMTU value 1341 for this ETE. If the inner packet length is larger than 1500 and the 1342 MTU value is not substantially less than MINMTU bytes, the value is 1343 likely to reflect the true MTU of the restricting link on the path to 1344 the ETE; otherwise, a router on the path may be generating runt 1345 fragments. 1347 In that case, the ITE can consult a plateau table (e.g., as described 1348 in [RFC1191]) to rewrite the MTU value to a reduced size. For 1349 example, if the ITE receives an IPv4 SPTB message with MTU==256 and 1350 inner packet length 4KB, it can rewrite the MTU to 2KB. If the ITE 1351 subsequently receives an IPv4 SPTB message with MTU==256 and inner 1352 packet length 2KB, it can rewrite the MTU to 1792, etc., to a minimum 1353 of 1500 bytes. If the ITE is performing stateful MTU determination 1354 for this SEAL path, it then writes the new MTU value minus HLEN in 1355 MAXMTU. 1357 The ITE then checks its forwarding tables to discover the previous 1358 hop toward the source address of the inner packet. If the previous 1359 hop is reached via the same tunnel interface the SPTB message arrived 1360 on, the ITE relays the message to the previous hop. In order to 1361 relay the message, the ITE first writes zero in the Identification and 1362 ICV fields of the SEAL header within the packet-in-error. The ITE 1363 next rewrites the outer SEAL header fields with values corresponding 1364 to the previous hop and recalculates the MAC using the MAC 1365 calculation parameters associated with the previous hop.
Next, the 1366 ITE replaces the SPTB's outer headers with headers of the appropriate 1367 protocol version and fills in the header fields as specified in 1368 Section 5.4.5, where the destination address/port correspond to the 1369 previous hop and the source address/port correspond to the ITE. The 1370 ITE then sends the message to the previous hop the same as if it were 1371 issuing a new SPTB message. (Note that, in this process, the values 1372 within the SEAL header of the packet-in-error are meaningless to the 1373 previous hop and therefore cannot be used by the previous hop for 1374 authentication purposes.) 1376 If the previous hop is not reached via the same tunnel interface, the 1377 ITE instead transcribes the message into a format appropriate for the 1378 inner packet (i.e., the same as described for transcribing ICMP 1379 messages in Section 5.4.7) and sends the resulting transcribed 1380 message to the original source. (NB: if the inner packet within the 1381 SPTB message is an IPv4 SEAL packet with DF==0, the ITE should set 1382 DF=1 and re-calculate the IPv4 header checksum while transcribing the 1383 message in order to avoid bogon filters.) The ITE then discards the 1384 SPTB message. 1386 Note that the ITE may receive an SPTB message from another ITE that 1387 is at the head end of a nested level of encapsulation. The ITE has 1388 no security associations with this nested ITE, hence it should 1389 consider this SPTB message the same as if it had received an ICMP PTB 1390 message from an ordinary router on the path to the ETE. That is, the 1391 ITE should examine the packet-in-error field of the SPTB message and 1392 only process the message if it is able to recognize the packet as one 1393 it had previously sent. 1395 5.6.2.2. Processing Other SCMP Error Messages 1397 An ITE may receive an SDU message with an appropriate code under the 1398 same circumstances that an IPv6 node would receive an ICMPv6 1399 Destination Unreachable message. 
The ITE either transcribes or 1400 relays the message toward the source address of the inner packet 1401 within the packet-in-error the same as specified for SPTB messages in 1402 Section 5.6.2.1. 1404 An ITE may receive an SPP message when the ETE receives a SEAL packet 1405 with an incorrect value in the SEAL header. The ITE should examine 1406 the SEAL header within the packet-in-error to determine whether a 1407 different setting should be used in subsequent packets, but does not 1408 relay the message further. 1410 TEs process other SCMP message types using methods and procedures 1411 specified in other documents. For example, SCMP message types used 1412 for tunnel neighbor coordinations are specified in VET 1413 [I-D.templin-intarea-vet]. 1415 6. Link Requirements 1417 Subnetwork designers are expected to follow the recommendations in 1418 Section 2 of [RFC3819] when configuring link MTUs. 1420 7. End System Requirements 1422 End systems are encouraged to implement end-to-end MTU assurance 1423 (e.g., using Packetization Layer Path MTU Discovery (PLPMTUD) per 1424 [RFC4821]) even if the subnetwork is using SEAL. 1426 When end systems use PLPMTUD, SEAL will ensure that the tunnel 1427 behaves as a link in the path that assures an MTU of at least 1500 1428 bytes while not precluding discovery of larger MTUs. The PLPMTUD 1429 mechanism will therefore be able to function as designed in order to 1430 discover and utilize larger MTUs. 1432 8. Router Requirements 1434 Routers within the subnetwork are expected to observe the standard IP 1435 router requirements, including the implementation of IP fragmentation 1436 and reassembly as well as the generation of ICMP messages 1437 [RFC0792][RFC1122][RFC1812][RFC2460][RFC4443][RFC6434].
1439 Note that, even when routers support existing requirements for the 1440 generation of ICMP messages, these messages are often filtered and 1441 discarded by middleboxes on the path to the original source of the 1442 message that triggered the ICMP. It is therefore not possible to 1443 assume delivery of ICMP messages even when routers are correctly 1444 implemented. 1446 9. Nested Encapsulation Considerations 1448 SEAL supports nested tunneling for up to 8 layers of encapsulation. 1449 In this model, the SEAL ITE has a tunnel neighbor relationship only 1450 with ETEs at its own nesting level, i.e., it does not have a tunnel 1451 neighbor relationship with other ITEs, nor with ETEs at other nesting 1452 levels. 1454 Therefore, when an ITE 'A' within an outer nesting level needs to 1455 return an error message to an ITE 'B' within an inner nesting level, 1456 it generates an ordinary ICMP error message the same as if it were an 1457 ordinary router within the subnetwork. 'B' can then perform message 1458 validation as specified in Section 5.4.7, but full message origin 1459 authentication is not possible. 1461 Since ordinary ICMP messages are used for coordinations between ITEs 1462 at different nesting levels, nested SEAL encapsulations should only 1463 be used when the ITEs are within a common administrative domain 1464 and/or when there is no ICMP filtering middlebox such as a firewall 1465 or NAT between them. An example would be a recursive nesting of 1466 mobile networks, where the first network receives service from an 1467 ISP, the second network receives service from the first network, the 1468 third network receives service from the second network, etc. 1470 NB: As an alternative, the SCMP protocol could be extended to allow 1471 ITE 'A' to return an SCMP message to ITE 'B' rather than return an 1472 ICMP message. 
This would conceptually allow the control messages to 1473 pass through firewalls and NATs, however it would give no more 1474 message origin authentication assurance than for ordinary ICMP 1475 messages. It was therefore determined that the complexity of 1476 extending the SCMP protocol was of little value within the context of 1477 the anticipated use cases for nested encapsulations. 1479 10. Reliability Considerations 1481 Although a SEAL tunnel may span an arbitrarily-large subnetwork 1482 expanse, the IP layer sees the tunnel as a simple link that supports 1483 the IP service model. Links with high bit error rates (BERs) (e.g., 1484 IEEE 802.11) use Automatic Repeat-ReQuest (ARQ) mechanisms [RFC3366] 1485 to increase packet delivery ratios, while links with much lower BERs 1486 typically omit such mechanisms. Since SEAL tunnels may traverse 1487 arbitrarily-long paths over links of various types that are already 1488 either performing or omitting ARQ as appropriate, it would therefore 1489 be inefficient to require the tunnel endpoints to also perform ARQ. 1491 11. Integrity Considerations 1493 The SEAL header includes an integrity check field that covers the 1494 SEAL header and at least the inner packet headers. This provides for 1495 header integrity verification on a segment-by-segment basis for a 1496 segmented re-encapsulating tunnel path. 1498 Fragmentation and reassembly schemes must also consider packet- 1499 splicing errors, e.g., when two fragments from the same packet are 1500 concatenated incorrectly, when a fragment from packet X is 1501 reassembled with fragments from packet Y, etc. The primary sources 1502 of such errors include implementation bugs and wrapping IPv4 ID 1503 fields. 1505 In particular, the IPv4 16-bit ID field can wrap with only 64K 1506 packets with the same (src, dst, protocol)-tuple alive in the system 1507 at a given time [RFC4963]. 
When the IPv4 ID field is re-written by a 1508 middlebox such as a NAT or Firewall, ID field wrapping can occur with 1509 even fewer packets alive in the system. It is therefore essential 1510 that IPv4 fragmentation and reassembly be avoided. 1512 12. IANA Considerations 1514 The IANA is requested to allocate a User Port number for "SEAL" in 1515 the 'port-numbers' registry. The Service Name is "SEAL", and the 1516 Transport Protocols are TCP and UDP. The Assignee is the IESG 1517 (iesg@ietf.org) and the Contact is the IETF Chair (chair@ietf.org). 1518 The Description is "Subnetwork Encapsulation and Adaptation Layer 1519 (SEAL)", and the Reference is the RFC-to-be currently known as 1520 'draft-templin-intarea-seal'. 1522 13. Security Considerations 1524 SEAL provides a segment-by-segment message origin authentication, 1525 integrity and anti-replay service. The SEAL header is sent in-the- 1526 clear the same as for the outer IP and other outer headers. In this 1527 respect, the threat model is no different than for IPv6 extension 1528 headers. Unlike IPv6 extension headers, however, the SEAL header can 1529 be protected by an integrity check that also covers the inner packet 1530 headers. 1532 An amplification/reflection/buffer overflow attack is possible when 1533 an attacker sends IP fragments with spoofed source addresses to an 1534 ETE in an attempt to clog the ETE's reassembly buffer and/or cause 1535 the ETE to generate a stream of SCMP messages returned to a victim 1536 ITE. The SCMP message ICV, Identification, as well as the inner 1537 headers of the packet-in-error, provide mitigation for the ETE to 1538 detect and discard SEAL segments with spoofed source addresses. 1540 Security issues that apply to tunneling in general are discussed in 1541 [RFC6169]. 1543 14.
Related Work 1545 Section 3.1.7 of [RFC2764] provides a high-level sketch for 1546 supporting large tunnel MTUs via a tunnel-level segmentation and 1547 reassembly capability to avoid IP level fragmentation. 1549 Section 3 of [RFC4459] describes inner and outer fragmentation at the 1550 tunnel endpoints as alternatives for accommodating the tunnel MTU. 1552 Section 4 of [RFC2460] specifies a method for inserting and 1553 processing extension headers between the base IPv6 header and 1554 transport layer protocol data. The SEAL header is inserted and 1555 processed in exactly the same manner. 1557 IPsec/AH [RFC4301][RFC4302] is used for full message integrity 1558 verification between tunnel endpoints, whereas SEAL only ensures 1559 integrity for the inner packet headers. The AYIYA proposal 1560 [I-D.massar-v6ops-ayiya] uses similar means for providing message 1561 authentication and integrity. 1563 SEAL, along with the Virtual Enterprise Traversal (VET) 1564 [I-D.templin-intarea-vet] tunnel virtual interface abstraction, are 1565 the functional building blocks for the Interior Routing Overlay 1566 Network (IRON) [I-D.templin-ironbis] and Routing and Addressing in 1567 Networks with Global Enterprise Recursion (RANGER) [RFC5720][RFC6139] 1568 architectures. 1570 The concepts of path MTU determination through the report of 1571 fragmentation and extending the IPv4 Identification field were first 1572 proposed in deliberations of the TCP-IP mailing list and the Path MTU 1573 Discovery Working Group (MTUDWG) during the late 1980's and early 1574 1990's. An historical analysis of the evolution of these concepts, 1575 as well as the development of the eventual PMTUD mechanism, appears 1576 in [RFC5320]. 1578 15. Implementation Status 1580 An early implementation of the first revision of SEAL [RFC5320] is 1581 available at: http://isatap.com/seal. 1583 16.
Acknowledgments 1585 The following individuals are acknowledged for helpful comments and 1586 suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Olivier 1587 Bonaventure, Teco Boot, Bob Braden, Brian Carpenter, Steve Casner, 1588 Ian Chakeres, Noel Chiappa, Remi Denis-Courmont, Remi Despres, Ralph 1589 Droms, Arnaud Ebalard, Gorry Fairhurst, Washam Fan, Dino Farinacci, 1590 Joel Halpern, Sam Hartman, John Heffner, Thomas Henderson, Bob 1591 Hinden, Christian Huitema, Eliot Lear, Darrel Lewis, Joe Macker, Matt 1592 Mathis, Erik Nordmark, Dan Romascanu, Dave Thaler, Joe Touch, Mark 1593 Townsley, Ole Troan, Margaret Wasserman, Magnus Westerlund, Robin 1594 Whittle, James Woodyatt, and members of the Boeing Research & 1595 Technology NST DC&NT group. 1597 Discussions with colleagues following the publication of [RFC5320] 1598 have provided useful insights that have resulted in significant 1599 improvements to this, the Second Edition of SEAL. 1601 This document received substantial review input from the IESG and 1602 IETF area directorates in the February 2013 timeframe. IESG members 1603 and IETF area directorate representatives who contributed helpful 1604 comments and suggestions are gratefully acknowledged. 1606 Path MTU determination through the report of fragmentation was first 1607 proposed by Charles Lynn on the TCP-IP mailing list in 1987. 1608 Extending the IP identification field was first proposed by Steve 1609 Deering on the MTUDWG mailing list in 1989. 1611 17. References 1613 17.1. Normative References 1615 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1616 September 1981. 1618 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, 1619 RFC 792, September 1981. 1621 [RFC1122] Braden, R., "Requirements for Internet Hosts - 1622 Communication Layers", STD 3, RFC 1122, October 1989. 1624 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1625 Requirement Levels", BCP 14, RFC 2119, March 1997.
1627 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1628 (IPv6) Specification", RFC 2460, December 1998. 1630 [RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure 1631 Neighbor Discovery (SEND)", RFC 3971, March 2005. 1633 [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control 1634 Message Protocol (ICMPv6) for the Internet Protocol 1635 Version 6 (IPv6) Specification", RFC 4443, March 2006. 1637 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 1638 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 1639 September 2007. 1641 17.2. Informative References 1643 [FOLK] Shannon, C., Moore, D., and k. claffy, "Beyond Folklore: 1644 Observations on Fragmented Traffic", December 2002. 1646 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 1647 October 1987. 1649 [I-D.ietf-6man-udpchecksums] 1650 Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and 1651 UDP Checksums for Tunneled Packets", 1652 draft-ietf-6man-udpchecksums-08 (work in progress), 1653 February 2013. 1655 [I-D.ietf-6man-udpzero] 1656 Fairhurst, G. and M. Westerlund, "Applicability Statement 1657 for the use of IPv6 UDP Datagrams with Zero Checksums", 1658 draft-ietf-6man-udpzero-12 (work in progress), 1659 February 2013. 1661 [I-D.massar-v6ops-ayiya] 1662 Massar, J., "AYIYA: Anything In Anything", 1663 draft-massar-v6ops-ayiya-02 (work in progress), July 2004. 1665 [I-D.taylor-v6ops-fragdrop] 1666 Jaeggli, J., Colitti, L., Kumari, W., Vyncke, E., Kaeo, 1667 M., and T. Taylor, "Why Operators Filter Fragments and 1668 What It Implies", draft-taylor-v6ops-fragdrop-00 (work in 1669 progress), October 2012. 1671 [I-D.templin-intarea-vet] 1672 Templin, F., "Virtual Enterprise Traversal (VET)", 1673 draft-templin-intarea-vet-38 (work in progress), 1674 April 2013. 1676 [I-D.templin-ironbis] 1677 Templin, F., "The Interior Routing Overlay Network 1678 (IRON)", draft-templin-ironbis-13 (work in progress), 1679 March 2013. 
1681 [RFC0994] International Organization for Standardization (ISO) and 1682 American National Standards Institute (ANSI), "Final text 1683 of DIS 8473, Protocol for Providing the Connectionless- 1684 mode Network Service", RFC 994, March 1986. 1686 [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP 1687 MTU discovery options", RFC 1063, July 1988. 1689 [RFC1070] Hagens, R., Hall, N., and M. Rose, "Use of the Internet as 1690 a subnetwork for experimentation with the OSI network 1691 layer", RFC 1070, February 1989. 1693 [RFC1146] Zweig, J. and C. Partridge, "TCP alternate checksum 1694 options", RFC 1146, March 1990. 1696 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 1697 November 1990. 1699 [RFC1701] Hanks, S., Li, T., Farinacci, D., and P. Traina, "Generic 1700 Routing Encapsulation (GRE)", RFC 1701, October 1994. 1702 [RFC1812] Baker, F., "Requirements for IP Version 4 Routers", 1703 RFC 1812, June 1995. 1705 [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery 1706 for IP version 6", RFC 1981, August 1996. 1708 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, 1709 October 1996. 1711 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 1712 Hashing for Message Authentication", RFC 2104, 1713 February 1997. 1715 [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in 1716 IPv6 Specification", RFC 2473, December 1998. 1718 [RFC2675] Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms", 1719 RFC 2675, August 1999. 1721 [RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A. 1722 Malis, "A Framework for IP Based Virtual Private 1723 Networks", RFC 2764, February 2000. 1725 [RFC2780] Bradner, S. and V. Paxson, "IANA Allocation Guidelines For 1726 Values In the Internet Protocol and Related Headers", 1727 BCP 37, RFC 2780, March 2000. 1729 [RFC2827] Ferguson, P. and D. 
Senie, "Network Ingress Filtering: 1730 Defeating Denial of Service Attacks which employ IP Source 1731 Address Spoofing", BCP 38, RFC 2827, May 2000. 1733 [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", 1734 RFC 2923, September 2000. 1736 [RFC3232] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by 1737 an On-line Database", RFC 3232, January 2002. 1739 [RFC3366] Fairhurst, G. and L. Wood, "Advice to link designers on 1740 link Automatic Repeat reQuest (ARQ)", BCP 62, RFC 3366, 1741 August 2002. 1743 [RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D., 1744 Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. 1745 Wood, "Advice for Internet Subnetwork Designers", BCP 89, 1746 RFC 3819, July 2004. 1748 [RFC4191] Draves, R. and D. Thaler, "Default Router Preferences and 1749 More-Specific Routes", RFC 4191, November 2005. 1751 [RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms 1752 for IPv6 Hosts and Routers", RFC 4213, October 2005. 1754 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 1755 Internet Protocol", RFC 4301, December 2005. 1757 [RFC4302] Kent, S., "IP Authentication Header", RFC 4302, 1758 December 2005. 1760 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1761 Network Tunneling", RFC 4459, April 2006. 1763 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 1764 Discovery", RFC 4821, March 2007. 1766 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 1767 Errors at High Data Rates", RFC 4963, July 2007. 1769 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 1770 Mitigations", RFC 4987, August 2007. 1772 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an 1773 IANA Considerations Section in RFCs", BCP 26, RFC 5226, 1774 May 2008. 1776 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 1777 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 
1779 [RFC5320] Templin, F., "The Subnetwork Encapsulation and Adaptation 1780 Layer (SEAL)", RFC 5320, February 2010. 1782 [RFC5445] Watson, M., "Basic Forward Error Correction (FEC) 1783 Schemes", RFC 5445, March 2009. 1785 [RFC5720] Templin, F., "Routing and Addressing in Networks with 1786 Global Enterprise Recursion (RANGER)", RFC 5720, 1787 February 2010. 1789 [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. 1791 [RFC6139] Russert, S., Fleischman, E., and F. Templin, "Routing and 1792 Addressing in Networks with Global Enterprise Recursion 1793 (RANGER) Scenarios", RFC 6139, February 2011. 1795 [RFC6169] Krishnan, S., Thaler, D., and J. Hoagland, "Security 1796 Concerns with IP Tunneling", RFC 6169, April 2011. 1798 [RFC6335] Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S. 1799 Cheshire, "Internet Assigned Numbers Authority (IANA) 1800 Procedures for the Management of the Service Name and 1801 Transport Protocol Port Number Registry", BCP 165, 1802 RFC 6335, August 2011. 1804 [RFC6434] Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node 1805 Requirements", RFC 6434, December 2011. 1807 [RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label 1808 for Equal Cost Multipath Routing and Link Aggregation in 1809 Tunnels", RFC 6438, November 2011. 1811 [RFC6864] Touch, J., "Updated Specification of the IPv4 ID Field", 1812 RFC 6864, February 2013. 1814 [RIPE] De Boer, M. and J. Bosma, "Discovering Path MTU Black 1815 Holes on the Internet using RIPE Atlas", July 2012. 1817 [SIGCOMM] Luckie, M. and B. Stasiewicz, "Measuring Path MTU 1818 Discovery Behavior", November 2010. 1820 [TBIT] Medina, A., Allman, M., and S. Floyd, "Measuring 1821 Interactions Between Transport Protocols and Middleboxes", 1822 October 2004. 1824 [WAND] Luckie, M., Cho, K., and B. Owens, "Inferring and 1825 Debugging Path MTU Discovery Failures", October 2005. 1827 Author's Address 1829 Fred L. 
Templin (editor) 1830 Boeing Research & Technology 1831 P.O. Box 3707 1832 Seattle, WA 98124 1833 USA 1835 Email: fltemplin@acm.org