idnits 2.17.1

draft-templin-intarea-seal-58.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document. If these are example addresses, they should be changed.

  == The 'Obsoletes: ' line in the draft header should list only the _numbers_
     of the RFCs which will be obsoleted by this document (if approved); it
     should not include the word 'RFC' in the list.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (June 26, 2013) is 3929 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC3971' is defined on line 1631, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4861' is defined on line 1638, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC1063' is defined on line 1674, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC1146' is defined on line 1681, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2675' is defined on line 1706, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2780' is defined on line 1714, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4191' is defined on line 1737, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4987' is defined on line 1758, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5226' is defined on line 1761, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5246' is defined on line 1765, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5445' is defined on line 1771, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC6335' is defined on line 1787, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  == Outdated reference: A later version (-02) exists of
     draft-taylor-v6ops-fragdrop-01

  == Outdated reference: A later version (-16) exists of
     draft-templin-ironbis-15

  -- Obsolete informational reference (is this intentional?): RFC 1063
     (Obsoleted by RFC 1191)

  -- Obsolete informational reference (is this intentional?): RFC 1146
     (Obsoleted by RFC 6247)

  -- Obsolete informational reference (is this intentional?): RFC 1981
     (Obsoleted by RFC 8201)

  -- Obsolete informational reference (is this intentional?): RFC 5226
     (Obsoleted by RFC 8126)

  -- Obsolete informational reference (is this intentional?): RFC 5246
     (Obsoleted by RFC 8446)

  -- Obsolete informational reference (is this intentional?): RFC 6434
     (Obsoleted by RFC 8504)

     Summary: 1 error (**), 0 flaws (~~), 17 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                    F. Templin, Ed.
3    Internet-Draft                              Boeing Research & Technology
4    Obsoletes: rfc5320 (if approved)                           June 26, 2013
5    Intended status: Informational
6    Expires: December 28, 2013

8            The Subnetwork Encapsulation and Adaptation Layer (SEAL)
9                       draft-templin-intarea-seal-58.txt

11   Abstract

13      This document specifies a Subnetwork Encapsulation and Adaptation
14      Layer (SEAL). SEAL operates over virtual topologies configured over
15      connected IP network routing regions bounded by encapsulating border
16      nodes. These virtual topologies are manifested by tunnels that may
17      span multiple IP and/or sub-IP layer forwarding hops, where they may
18      incur packet duplication, packet reordering, source address spoofing
19      and traversal of links with diverse Maximum Transmission Units
20      (MTUs). SEAL addresses these issues through the encapsulation and
21      messaging mechanisms specified in this document.

23   Status of this Memo

25      This Internet-Draft is submitted in full conformance with the
26      provisions of BCP 78 and BCP 79.

28      Internet-Drafts are working documents of the Internet Engineering
29      Task Force (IETF). Note that other groups may also distribute
30      working documents as Internet-Drafts. The list of current Internet-
31      Drafts is at http://datatracker.ietf.org/drafts/current/.

33      Internet-Drafts are draft documents valid for a maximum of six months
34      and may be updated, replaced, or obsoleted by other documents at any
35      time. It is inappropriate to use Internet-Drafts as reference
36      material or to cite them other than as "work in progress."

38      This Internet-Draft will expire on December 28, 2013.

40   Copyright Notice

42      Copyright (c) 2013 IETF Trust and the persons identified as the
43      document authors. All rights reserved.

45      This document is subject to BCP 78 and the IETF Trust's Legal
46      Provisions Relating to IETF Documents
47      (http://trustee.ietf.org/license-info) in effect on the date of
48      publication of this document. Please review these documents
49      carefully, as they describe your rights and restrictions with respect
50      to this document. Code Components extracted from this document must
51      include Simplified BSD License text as described in Section 4.e of
52      the Trust Legal Provisions and are provided without warranty as
53      described in the Simplified BSD License.

55   Table of Contents

57      1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
58        1.1.  Motivation . . . . . . . . . . . . . . . . . . . . . . . .  4
59        1.2.  Approach . . . . . . . . . . . . . . . . . . . . . . . . .  6
60        1.3.  Differences with RFC5320 . . . . . . . . . . . . . . . . .  7
61      2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .  8
62      3.  Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 10
63      4.  Applicability Statement . . . . . . . . . . . . . . . . . . . 10
64      5.  SEAL Specification . . . . . . . . . . . . . . . . . . . . . . 11
65        5.1.  SEAL Tunnel Model . . . . . . . . . . . . . . . . . . . . 11
66        5.2.  SEAL Model of Operation . . . . . . . . . . . . . . . . . 11
67        5.3.  SEAL Header and Trailer Format . . . . . . . . . . . . . . 13
68        5.4.  ITE Specification . . . . . . . . . . . . . . . . . . . . 15
69          5.4.1.  Tunnel Interface MTU . . . . . . . . . . . . . . . . . 15
70          5.4.2.  Tunnel Neighbor Soft State . . . . . . . . . . . . . . 16
71          5.4.3.  SEAL Layer Pre-Processing . . . . . . . . . . . . . . 17
72          5.4.4.  SEAL Encapsulation and Segmentation . . . . . . . . . 18
73          5.4.5.  Outer Encapsulation . . . . . . . . . . . . . . . . . 20
74          5.4.6.  Path Probing and ETE Reachability Verification . . . . 21
75          5.4.7.  Processing ICMP Messages . . . . . . . . . . . . . . . 21
76          5.4.8.  IPv4 Middlebox Reassembly Testing . . . . . . . . . . 22
77          5.4.9.  Stateful MTU Determination . . . . . . . . . . . . . . 23
78          5.4.10. Detecting Path MTU Changes . . . . . . . . . . . . . . 24
79        5.5.  ETE Specification . . . . . . . . . . . . . . . . . . . . 24
80          5.5.1.  Reassembly Buffer Requirements . . . . . . . . . . . . 24
81          5.5.2.  Tunnel Neighbor Soft State . . . . . . . . . . . . . . 24
82          5.5.3.  IP-Layer Reassembly . . . . . . . . . . . . . . . . . 25
83          5.5.4.  Decapsulation, SEAL-Layer Reassembly, and
84                  Re-Encapsulation . . . . . . . . . . . . . . . . . . . 25
85        5.6.  The SEAL Control Message Protocol (SCMP) . . . . . . . . . 26
86          5.6.1.  Generating SCMP Error Messages . . . . . . . . . . . . 27
87          5.6.2.  Processing SCMP Error Messages . . . . . . . . . . . . 29
88      6.  Link Requirements . . . . . . . . . . . . . . . . . . . . . . 31
89      7.  End System Requirements . . . . . . . . . . . . . . . . . . . 31
90      8.  Router Requirements . . . . . . . . . . . . . . . . . . . . . 32
91      9.  Nested Encapsulation Considerations . . . . . . . . . . . . . 32
92      10. Reliability Considerations . . . . . . . . . . . . . . . . . . 33
93      11. Integrity Considerations . . . . . . . . . . . . . . . . . . . 33
94      12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 33
95      13. Security Considerations . . . . . . . . . . . . . . . . . . . 34
96      14. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 34
97      15. Implementation Status . . . . . . . . . . . . . . . . . . . . 35
98      16. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 35
99      17. References . . . . . . . . . . . . . . . . . . . . . . . . . . 36
100       17.1. Normative References . . . . . . . . . . . . . . . . . . . 36
101       17.2. Informative References . . . . . . . . . . . . . . . . . . 36
102     Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 40

104  1.  Introduction

106     As Internet technology and communication has grown and matured, many
107     techniques have developed that use virtual topologies (manifested by
108     tunnels of one form or another) over an actual network that supports
109     the Internet Protocol (IP) [RFC0791][RFC2460]. Those virtual
110     topologies have elements that appear as one network layer hop, but
111     are actually multiple IP or sub-IP layer hops. These multiple hops
112     often have quite diverse properties that are often not even visible
113     to the endpoints of the virtual hop. This introduces failure modes
114     that are not dealt with well in current approaches.

116     The use of IP encapsulation (also known as "tunneling") has long been
117     considered as the means for creating such virtual topologies (e.g.,
118     see [RFC2003][RFC2473]). However, the encapsulation headers often
119     include insufficiently provisioned per-packet identification values.
120     IP encapsulation also allows an attacker to produce encapsulated
121     packets with spoofed source addresses even if the source address in
122     the encapsulating header cannot be spoofed. A denial-of-service
123     vector that is not possible in non-tunneled subnetworks is therefore
124     presented.

126     Additionally, the insertion of an outer IP header reduces the
127     effective path MTU visible to the inner network layer. When IPv6 is
128     used as the encapsulation protocol, original sources expect to be
129     informed of the MTU limitation through IPv6 Path MTU discovery
130     (PMTUD) [RFC1981]. When IPv4 is used, this reduced MTU can be
131     accommodated through the use of IPv4 fragmentation, but unmitigated
132     in-the-network fragmentation has been found to be harmful through
133     operational experience and studies conducted over the course of many
134     years [FRAG][FOLK][RFC4963]. Additionally, classical IPv4 PMTUD
135     [RFC1191] has known operational issues that are exacerbated by in-
136     the-network tunnels [RFC2923][RFC4459].
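
The MTU reduction described in the paragraph above is simple arithmetic,
and a minimal sketch may make it concrete. The function name inner_mtu
and the constant names are illustrative only and do not come from this
document. The header sizes assume an IPv4 header without options (20
bytes), an IPv6 header without extension headers (40 bytes), and a UDP
header (8 bytes); any SEAL header would add further overhead on top.

```python
# Illustrative sketch: how outer encapsulation headers shrink the MTU
# seen by the inner network layer. All names here are hypothetical.

IPV4_HEADER = 20  # bytes, assuming no IPv4 options
IPV6_HEADER = 40  # bytes, assuming no IPv6 extension headers
UDP_HEADER = 8    # bytes


def inner_mtu(path_mtu: int, *outer_headers: int) -> int:
    """Return the largest inner packet that fits in one outer packet."""
    overhead = sum(outer_headers)
    if overhead >= path_mtu:
        raise ValueError("encapsulation overhead exceeds the path MTU")
    return path_mtu - overhead


# A 1500-byte path leaves less than 1500 bytes for the inner packet.
print(inner_mtu(1500, IPV4_HEADER))              # 1480
print(inner_mtu(1500, IPV6_HEADER, UDP_HEADER))  # 1452
```

An original source that sends 1500-byte packets with fragmentation
disallowed would therefore exceed the effective MTU of either tunnel
unless it is told of the restriction or the tunnel compensates.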

138     The following subsections present further details on the motivation
139     and approach for addressing these issues.

141  1.1.  Motivation

143     Before discussing the approach, it is necessary to first understand
144     the problems. In both the Internet and private-use networks today,
145     IP is ubiquitously deployed as the Layer 3 protocol. The primary
146     functions of IP are to provide for routing, addressing, and a
147     fragmentation and reassembly capability used to accommodate links
148     with diverse MTUs. While it is well known that the IP address space
149     is rapidly becoming depleted, there is also a growing awareness that
150     other IP protocol limitations have already or may soon become
151     problematic.

153     First, the Internet historically provided no means for discerning
154     whether the source addresses of IP packets are authentic. This
155     shortcoming is being addressed more and more through the deployment
156     of site border router ingress filters [RFC2827], however the use of
157     encapsulation provides a vector for an attacker to circumvent
158     filtering for the encapsulated packet even if filtering is correctly
159     applied to the encapsulation header. Secondly, the IP header does
160     not include a well-behaved identification value unless the source has
161     included a fragment header for IPv6 or unless the source permits
162     fragmentation for IPv4. These limitations preclude an efficient
163     means for routers to detect duplicate packets and packets that have
164     been re-ordered within the subnetwork. Additionally, recent studies
165     have shown that the arrival of fragments at high data rates can cause
166     denial-of-service (DoS) attacks on performance-sensitive networking
167     gear, prompting some administrators to configure their equipment to
168     drop fragments unconditionally [I-D.taylor-v6ops-fragdrop].

170     For IPv4 encapsulation, when fragmentation is permitted the header
171     includes a 16-bit Identification field, meaning that at most 2^16
172     unique packets with the same (source, destination, protocol)-tuple
173     can be active in the network at the same time [RFC6864]. (When
174     middleboxes such as Network Address Translators (NATs) re-write the
175     Identification field to random values, the number of unique packets
176     is even further reduced.) Due to the escalating deployment of high-
177     speed links, however, these numbers have become too small by several
178     orders of magnitude for high data rate packet sources such as tunnel
179     endpoints [RFC4963].

181     Furthermore, there are many well-known limitations pertaining to IPv4
182     fragmentation and reassembly - even to the point that it has been
183     deemed "harmful" in both classic and modern-day studies (see above).
184     In particular, IPv4 fragmentation raises issues ranging from minor
185     annoyances (e.g., in-the-network router fragmentation [RFC1981]) to
186     the potential for major integrity issues (e.g., mis-association of
187     the fragments of multiple IP packets during reassembly [RFC4963]).

189     As a result of these perceived limitations, a fragmentation-avoiding
190     technique for discovering the MTU of the forward path from a source
191     to a destination node was devised through the deliberations of the
192     Path MTU Discovery Working Group (PMTUDWG) during the late 1980's
193     through early 1990's which resulted in the publication of [RFC1191].
194     In this negative feedback-based method, the source node provides
195     explicit instructions to routers in the path to discard the packet
196     and return an ICMP error message if an MTU restriction is
197     encountered. However, this approach has several serious shortcomings
198     that lead to an overall "brittleness" [RFC2923].

200     In particular, site border routers in the Internet have been known to
201     discard ICMP error messages coming from the outside world. This is
202     due in large part to the fact that malicious spoofing of error
203     messages in the Internet is trivial since there is no way to
204     authenticate the source of the messages [RFC5927]. Furthermore, when
205     a source node that requires ICMP error message feedback when a packet
206     is dropped due to an MTU restriction does not receive the messages, a
207     path MTU-related black hole occurs. This means that the source will
208     continue to send packets that are too large and never receive an
209     indication from the network that they are being discarded. This
210     behavior has been confirmed through documented studies showing clear
211     evidence of PMTUD failures for both IPv4 and IPv6 in the Internet
212     today [TBIT][WAND][SIGCOMM][RIPE].

214     The issues with both IP fragmentation and this "classical" PMTUD
215     method are exacerbated further when IP tunneling is used [RFC4459].
216     For example, an ingress tunnel endpoint (ITE) may be required to
217     forward encapsulated packets into the subnetwork on behalf of
218     hundreds, thousands, or even more original sources. If the ITE
219     allows IP fragmentation on the encapsulated packets, persistent
220     fragmentation could lead to undetected data corruption due to
221     Identification field wrapping and/or reassembly congestion at the
222     ETE. If the ITE instead uses classical IP PMTUD it must rely on ICMP
223     error messages coming from the subnetwork that may be suspect,
224     subject to loss due to filtering middleboxes, or insufficiently
225     provisioned for translation into error messages to be returned to the
226     original sources.

228     Although recent works have led to the development of a positive
229     feedback-based end-to-end MTU determination scheme [RFC4821], they do
230     not excuse tunnels from accounting for the encapsulation overhead
231     they add to packets. Moreover, in current practice existing
232     tunneling protocols mask the MTU issues by selecting a "lowest common
233     denominator" MTU that may be much smaller than necessary for most
234     paths and difficult to change at a later date. Therefore, a new
235     approach to accommodate tunnels over links with diverse MTUs is
236     necessary.

238  1.2.  Approach

240     This document concerns subnetworks manifested through a virtual
241     topology configured over a connected network routing region and
242     bounded by encapsulating border nodes. Example connected network
243     routing regions include Mobile Ad hoc Networks (MANETs), enterprise
244     networks and the global public Internet itself. Subnetwork border
245     nodes forward unicast and multicast packets over the virtual topology
246     across multiple IP and/or sub-IP layer forwarding hops that may
247     introduce packet duplication and/or traverse links with diverse
248     Maximum Transmission Units (MTUs).

250     This document introduces a Subnetwork Encapsulation and Adaptation
251     Layer (SEAL) for tunneling inner network layer protocol packets over
252     IP subnetworks that connect Ingress and Egress Tunnel Endpoints
253     (ITEs/ETEs) of border nodes. It provides a modular specification
254     designed to be tailored to specific associated tunneling protocols.
255     (A transport-mode of operation is also possible, but out of scope for
256     this document.)

258     SEAL provides a mid-layer encapsulation that accommodates links with
259     diverse MTUs, and allows routers in the subnetwork to perform
260     efficient duplicate packet and packet reordering detection. The
261     encapsulation further ensures message origin authentication, packet
262     header integrity and anti-replay in environments in which these
263     functions are necessary.

265     SEAL treats tunnels that traverse the subnetwork as ordinary links
266     that must support network layer services. Moreover, SEAL provides
267     dynamic mechanisms (including limited segmentation and reassembly) to
268     ensure a maximal path MTU over the tunnel. This is in contrast to
269     static approaches which avoid MTU issues by selecting a lowest common
270     denominator MTU value that may be overly conservative for the vast
271     majority of tunnel paths and difficult to change even when larger
272     MTUs become available.

274  1.3.  Differences with RFC5320

276     This specification of SEAL is descended from an experimental
277     independent RFC publication of the same name [RFC5320]. However,
278     this specification introduces a number of important differences from
279     the earlier publication.

281     First, this specification includes a protocol version field in the
282     SEAL header whereas [RFC5320] does not, and therefore cannot be
283     updated by future revisions. This specification therefore obsoletes
284     (i.e., and does not update) [RFC5320].

286     Secondly, [RFC5320] forms a 32-bit Identification value by
287     concatenating the 16-bit IPv4 Identification field with a 16-bit
288     Identification "extension" field in the SEAL header. This means that
289     [RFC5320] can only operate over IPv4 networks (since IPv6 headers do
290     not include a 16-bit Identification field) and that the SEAL Identification
291     value can be corrupted if the Identification in the outer IPv4 header
292     is rewritten. In contrast, this specification includes a 32-bit
293     Identification value that is independent of any identification fields
294     found in the inner or outer IP headers, and is therefore compatible
295     with any inner and outer IP protocol version combinations.

297     Additionally, the SEAL segmentation and reassembly procedures defined
298     in [RFC5320] differ significantly from those found in this
299     specification. In particular, this specification defines an 8-bit
300     Offset field that allows for smaller segment sizes when SEAL
301     segmentation is necessary. In contrast, [RFC5320] includes a 3-bit
302     Segment field and performs reassembly through concatenation of
303     consecutive segments.

305     The SEAL header in this specification also includes an optional
306     Integrity Check Vector (ICV) that can be used to digitally sign the
307     SEAL header and the leading portion of the encapsulated inner packet.
308     This allows for a lightweight integrity check and a loose message
309     origin authentication capability. The header further includes new
310     control bits as well as a link identification and encapsulation level
311     field for additional control capabilities.

313     Finally, this version of SEAL includes a new messaging protocol known
314     as the SEAL Control Message Protocol (SCMP), whereas [RFC5320]
315     performs signalling through the use of SEAL-encapsulated ICMP
316     messages. The use of SCMP allows SEAL-specific departures from ICMP,
317     as well as a control messaging capability that extends to other
318     specifications, including Virtual Enterprise Traversal (VET)
319     [I-D.templin-intarea-vet].

321  2.  Terminology

323     The following terms are defined within the scope of this document:

325     subnetwork
326        a virtual topology configured over a connected network routing
327        region and bounded by encapsulating border nodes.

329     IP
330        used to generically refer to either Internet Protocol (IP)
331        version, i.e., IPv4 or IPv6.

333     Ingress Tunnel Endpoint (ITE)
334        a virtual interface over which an encapsulating border node (host
335        or router) sends encapsulated packets into the subnetwork.

337     Egress Tunnel Endpoint (ETE)
338        a virtual interface over which an encapsulating border node (host
339        or router) receives encapsulated packets from the subnetwork.

341     SEAL Path
342        a subnetwork path from an ITE to an ETE beginning with an
343        underlying link of the ITE as the first hop. Note that, if the
344        ITE's interface connection to the underlying link assigns multiple
345        IP addresses, each address represents a separate SEAL path.

347     inner packet
348        an unencapsulated network layer protocol packet (e.g., IPv4
349        [RFC0791], OSI/CLNP [RFC0994], IPv6 [RFC2460], etc.) before any
350        outer encapsulations are added. Internet protocol numbers that
351        identify inner packets are found in the IANA Internet Protocol
352        registry [RFC3232]. SEAL protocol packets that incur an
353        additional layer of SEAL encapsulation are also considered inner
354        packets.

356     outer IP packet
357        a packet resulting from adding an outer IP header (and possibly
358        other outer headers) to a SEAL-encapsulated inner packet.

360     packet-in-error
361        the leading portion of an invoking data packet encapsulated in the
362        body of an error control message (e.g., an ICMPv4 [RFC0792] error
363        message, an ICMPv6 [RFC4443] error message, etc.).

365     Packet Too Big (PTB) message
366        a control plane message indicating an MTU restriction (e.g., an
367        ICMPv6 "Packet Too Big" message [RFC4443], an ICMPv4
368        "Fragmentation Needed" message [RFC0792], etc.).

370     Don't Fragment (DF) bit
371        a bit that indicates whether the packet may be fragmented by the
372        network. The DF bit is explicitly included in the IPv4 header
373        [RFC0791] and may be set to '0' to allow fragmentation or '1' to
374        disallow further in-network fragmentation. The bit is absent from
375        the IPv6 header [RFC2460], but implicitly set to '1' because
376        fragmentation can occur only at IPv6 sources.
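
As an illustration of the DF bit definition above, the following minimal
sketch tests the bit in a raw IPv4 header. The function name ipv4_df_set
and the constant DF_MASK are hypothetical and are not part of SEAL. The
sketch relies only on the IPv4 header layout of [RFC0791], in which the
16-bit word at byte offset 6 carries the three flag bits followed by the
13-bit fragment offset, so that DF corresponds to the mask 0x4000.

```python
import struct

# Mask for the "Don't Fragment" flag within the 16-bit flags/fragment
# offset word of the IPv4 header. The name is hypothetical.
DF_MASK = 0x4000


def ipv4_df_set(header: bytes) -> bool:
    """Return True if the DF bit is set in a raw IPv4 header."""
    if len(header) < 20:
        raise ValueError("truncated IPv4 header")
    if header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # Bytes 6-7 hold the flags and fragment offset in network byte order.
    (flags_and_offset,) = struct.unpack_from("!H", header, 6)
    return bool(flags_and_offset & DF_MASK)


# Two 20-byte example headers that differ only in the flags byte.
# Addresses are from the 192.0.2.0/24 documentation range.
with_df = bytes([0x45, 0, 0, 20, 0, 0, 0x40, 0, 64, 17, 0, 0,
                 192, 0, 2, 1, 192, 0, 2, 2])
without_df = bytes([0x45, 0, 0, 20, 0, 0, 0x00, 0, 64, 17, 0, 0,
                    192, 0, 2, 1, 192, 0, 2, 2])
print(ipv4_df_set(with_df), ipv4_df_set(without_df))
```

No equivalent test exists for IPv6, since the bit is absent from the
IPv6 header and is treated as implicitly set.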

378     The following abbreviations correspond to terms used within this
379     document and/or elsewhere in common Internetworking nomenclature:

381        HLEN - the length of the SEAL header plus outer headers

383        ICV - Integrity Check Vector

385        MAC - Message Authentication Code

387        MTU - Maximum Transmission Unit

389        SCMP - the SEAL Control Message Protocol

391        SDU - SCMP Destination Unreachable message
392        SPP - SCMP Parameter Problem message

394        SPTB - SCMP Packet Too Big message

396        SEAL - Subnetwork Encapsulation and Adaptation Layer

398        TE - Tunnel Endpoint (i.e., either ingress or egress)

400        VET - Virtual Enterprise Traversal

402  3.  Requirements

404     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
405     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
406     document are to be interpreted as described in [RFC2119]. When used
407     in lower case (e.g., must, must not, etc.), these words MUST NOT be
408     interpreted as described in [RFC2119], but are rather interpreted as
409     they would be in common English.

411  4.  Applicability Statement

413     SEAL was originally motivated by the specific case of subnetwork
414     abstraction for Mobile Ad hoc Networks (MANETs), however the domain
415     of applicability also extends to subnetwork abstractions over
416     enterprise networks, ISP networks, SO/HO networks, the global public
417     Internet itself, and any other connected network routing region.

419     SEAL provides a network sublayer for encapsulation of an inner
420     network layer packet within outer encapsulating headers. SEAL can
421     also be used as a sublayer within a transport layer protocol data
422     payload, where transport layer encapsulation is typically used for
423     Network Address Translator (NAT) traversal as well as operation over
424     subnetworks that give preferential treatment to certain "core"
425     Internet protocols, e.g., TCP, UDP, etc. (However, note that TCP
426     encapsulation may not be appropriate for all use cases; particularly
427     those that require low delay and/or delay variance.) The SEAL header
428     is processed in a similar manner as for IPv6 extension headers, i.e.,
429     it is not part of the outer IP header but rather allows for the
430     creation of an arbitrarily extensible chain of headers in the same
431     way that IPv6 does.

433     To accommodate MTU diversity, the Ingress Tunnel Endpoint (ITE) may
434     need to perform limited segmentation which the Egress Tunnel Endpoint
435     (ETE) reassembles. The ETE further acts as a passive observer that
436     informs the ITE of any packet size limitations. This allows the ITE
437     to return appropriate PMTUD feedback even if the network path between
438     the ITE and ETE filters ICMP messages.

440     SEAL further provides mechanisms to ensure message origin
441     authentication, packet header integrity, and anti-replay. The SEAL
442     framework is therefore similar to the IP Security (IPsec)
443     Authentication Header (AH) [RFC4301][RFC4302], however it provides
444     only minimal hop-by-hop authenticating services while leaving full
445     data integrity, authentication and confidentiality services as an
446     end-to-end consideration.

448     In many aspects, SEAL also very closely resembles the Generic Routing
449     Encapsulation (GRE) framework [RFC1701]. SEAL can therefore be
450     applied in the same use cases that are traditionally addressed by
451     GRE, but goes beyond GRE to also provide additional capabilities
452     (e.g., path MTU accommodation, message origin authentication, etc.)
453     as described in this document.

455  5.  SEAL Specification

457     The following sections specify the operation of SEAL:

459  5.1.  SEAL Tunnel Model

461     SEAL is an encapsulation sublayer used within point-to-point, point-
462     to-multipoint, and non-broadcast, multiple access (NBMA) tunnels.
463     Each SEAL path is configured over one or more underlying interfaces
464     attached to subnetwork links. The SEAL tunnel connects an ITE to one
465     or more ETE "neighbors" via encapsulation across an underlying
466     subnetwork, where the tunnel neighbor relationship may be either
467     unidirectional or bidirectional.

469     A unidirectional tunnel neighbor relationship allows the near end ITE
470     to send data packets forward to the far end ETE, while the ETE only
471     returns control messages when necessary. A bidirectional tunnel
472     neighbor relationship is one over which both TEs can exchange both
473     data and control messages.

475     Implications of the SEAL unidirectional and bidirectional models are
476     the same as discussed in [I-D.templin-intarea-vet].

478  5.2.  SEAL Model of Operation

480     SEAL-enabled ITEs encapsulate each inner packet in a SEAL header and
481     any outer header encapsulations as shown in Figure 1:

483                                  +--------------------+
484                                  ~   outer IP header  ~
485                                  +--------------------+
486                                  ~  other outer hdrs  ~
487                                  +--------------------+
488                                  ~    SEAL Header     ~
489     +--------------------+       +--------------------+
490     |                    |  -->  |                    |
491     ~        Inner       ~  -->  ~        Inner       ~
492     ~       Packet       ~  -->  ~       Packet       ~
493     |                    |  -->  |                    |
494     +--------------------+       +----------+---------+

496                       Figure 1: SEAL Encapsulation

498     The ITE inserts the SEAL header according to the specific tunneling
499     protocol. For simple encapsulation of an inner network layer packet
500     within an outer IP header, the ITE inserts the SEAL header following
501     the outer IP header and before the inner packet as: IP/SEAL/{inner
502     packet}.

504     For encapsulations over transports such as UDP, the ITE inserts the
505     SEAL header following the outer transport layer header and before the
506     inner packet, e.g., as IP/UDP/SEAL/{inner packet}. In that case, the
507     UDP header is seen as an "other outer header" as depicted in Figure 1
508     and the outer IP and transport layer headers are together seen as the
509     outer encapsulation headers.

511     SEAL supports both "nested" tunneling and "re-encapsulating"
512     tunneling. Nested tunneling occurs when a first tunnel is
513     encapsulated within a second tunnel, which may then further be
514     encapsulated within additional tunnels. Nested tunneling can be
515     useful, and stands in contrast to "recursive" tunneling which is an
516     anomalous condition incurred due to misconfiguration or a routing
517     loop. Considerations for nested tunneling and avoiding recursive
518     tunneling are discussed in Section 4 of [RFC2473].

520     Re-encapsulating tunneling occurs when a packet arrives at a first
521     ETE, which then acts as an ITE to re-encapsulate and forward the
522     packet to a second ETE connected to the same subnetwork. In that
523     case each ITE/ETE transition represents a segment of a bridged path
524     between the ITE nearest the source and the ETE nearest the
525     destination. Considerations for re-encapsulating tunneling are
526     discussed in [I-D.templin-ironbis]. Combinations of nested and re-
527     encapsulating tunneling are also naturally supported by SEAL.

529     The SEAL ITE considers each underlying interface as the ingress
530     attachment point to a SEAL path to the ETE. The ITE therefore may
531     experience different path MTUs on different SEAL paths.

533     Finally, the SEAL ITE ensures that the inner network layer protocol
534     will see a minimum MTU of 1500 bytes over each SEAL path regardless
535     of the outer network layer protocol version, i.e., even if a small
536     amount of segmentation and reassembly are necessary. This is to
537     avoid path MTU "black holes" for the minimum MTU configured by the
538     vast majority of links in the Internet. Note that in some scenarios,
539     however, reassembly may place a heavy burden on the ETE. In that
540     case, the ITE should avoid invoking segmentation and instead report
541     an MTU smaller than 1500 bytes to the original source.

543  5.3.  SEAL Header and Trailer Format

545     The SEAL header is formatted as follows:

547      0                   1                   2                   3
548      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
549     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
550     |VER|C|A|I|V|R|M|    Offset     |    NEXTHDR    | LINK_ID |LEVEL|
551     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
552     |                   Identification (optional)                   |
553     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
554     |               Integrity Check Vector (optional)               |
555     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ...

557                       Figure 2: SEAL Header Format

559     VER (2)
560        a 2-bit version field. This document specifies Version 0 of the
561        SEAL protocol, i.e., the VER field encodes the value 0.

563     C (1)
564        the "Control/Data" bit. Set to 1 by the ITE in SEAL Control
565        Message Protocol (SCMP) control messages, and set to 0 in ordinary
566        data packets.

568     A (1)
569        the "Acknowledgement Requested" bit. Set to 1 by the ITE in SEAL
570        data packets for which it wishes to receive an explicit
571        acknowledgement from the ETE.

573     I (1)
574        the "Identification Included" bit.

576     V (1)
577        the "Integrity Check Vector included" bit.

579     R (1)
580        the "Redirects Permitted" bit when used by VET (see:
581        [I-D.templin-intarea-vet]); reserved for future use in other
582        contexts.

584     M (1) the "More Segments" bit. Set to 1 in a non-final segment and
585        set to 0 in the final segment of the SEAL packet.

587     Offset (8) an 8-bit Offset field. Set to 0 in the first segment of
588        a segmented SEAL packet. Set to an integral number of 256 byte
589        blocks in subsequent segments (e.g., an Offset of 4 indicates a
590        block that begins at the 1024th byte in the packet).

592     NEXTHDR (8) an 8-bit field that encodes the next header Internet
593        Protocol number the same as for the IPv4 protocol and IPv6 next
594        header fields.

   LINK_ID (5)
      a 5-bit link identification value, set to a unique value by the
      ITE for each SEAL path over which it will send encapsulated
      packets to the ETE (up to 32 SEAL paths per ETE are therefore
      supported). Note that, if the ITE's interface connection to the
      underlying link assigns multiple IP addresses, each address
      represents a separate SEAL path that must be assigned a separate
      LINK_ID.

   LEVEL (3)
      a 3-bit nesting level; used to limit the number of tunnel nesting
      levels. Set to an integer value up to 7 in the innermost SEAL
      encapsulation, and decremented by 1 for each successive additional
      SEAL encapsulation nesting level. Up to 8 levels of nesting are
      therefore supported.

   Identification (32)
      an optional 32-bit per-packet identification field; present when
      I==1. Set to a 32-bit value (beginning with 0) that is
      monotonically-incremented for each SEAL packet transmitted to this
      ETE.

   Integrity Check Vector (ICV) (variable)
      an optional variable-length integrity check vector field; present
      when V==1.

5.4.  ITE Specification

5.4.1.  Tunnel Interface MTU

   The tunnel interface must present a constant MTU value to the inner
   network layer as the size for admission of inner packets into the
   interface. Since NBMA tunnel virtual interfaces may support a large
   set of SEAL paths that accept widely varying maximum packet sizes,
   however, a number of factors should be taken into consideration when
   selecting a tunnel interface MTU.

   Due to the ubiquitous deployment of standard Ethernet and similar
   networking gear, the nominal Internet cell size has become 1500
   bytes; this is the de facto size that end systems have come to expect
   will either be delivered by the network without loss due to an MTU
   restriction on the path or a suitable ICMP Packet Too Big (PTB)
   message returned.
   When large packets sent by end systems incur additional
   encapsulation at an ITE, however, they may be dropped silently within
   the tunnel since the network may not always deliver the necessary
   PTBs [RFC2923]. The ITE SHOULD therefore set a tunnel interface MTU
   of at least 1500 bytes.

   The inner network layer protocol consults the tunnel interface MTU
   when admitting a packet into the interface. For non-SEAL inner IPv4
   packets with the IPv4 Don't Fragment (DF) bit cleared (i.e., DF==0),
   if the packet is larger than the tunnel interface MTU the inner IPv4
   layer uses IPv4 fragmentation to break the packet into fragments no
   larger than the tunnel interface MTU. The ITE then admits each
   fragment into the interface as an independent packet.

   For all other inner packets, the inner network layer admits the
   packet if it is no larger than the tunnel interface MTU; otherwise,
   it drops the packet and sends a PTB error message to the source with
   the MTU value set to the tunnel interface MTU. The message contains
   as much of the invoking packet as possible without the entire message
   exceeding the network layer minimum MTU size.

   The ITE can alternatively set an indefinite MTU on the tunnel
   interface such that all inner packets are admitted into the interface
   regardless of their size. For ITEs that host applications that use
   the tunnel interface directly, this option must be carefully
   coordinated with protocol stack upper layers since some upper layer
   protocols (e.g., TCP) derive their packet sizing parameters from the
   MTU of the outgoing interface and as such may select too large an
   initial size.
   This is not a problem for upper layers that use conservative initial
   maximum segment size estimates and/or when the tunnel interface can
   reduce the upper layer's maximum segment size, e.g., by reducing the
   size advertised in the MSS option of outgoing TCP messages (sometimes
   known as "MSS clamping").

   In light of the above considerations, the ITE SHOULD configure an
   indefinite MTU on tunnel *router* interfaces so that SEAL performs
   all subnetwork adaptation from within the interface as specified in
   Section 5.4.3. The ITE can instead set a smaller MTU on tunnel
   *host* interfaces, e.g., the maximum of 1500 bytes and the smallest
   MTU among all of the underlying links minus the size of the
   encapsulation headers.

5.4.2.  Tunnel Neighbor Soft State

   The tunnel virtual interface maintains a number of soft state
   variables for each ETE and for each SEAL path.

   When per-packet identification is required, the ITE maintains a per
   ETE window of Identification values for the packets it has recently
   sent to this ETE. The ITE then sets a variable "USE_ID" to TRUE, and
   includes an Identification in each packet it sends to this ETE;
   otherwise, it sets USE_ID to FALSE.

   When message origin authentication and integrity checking is
   required, the ITE also includes an ICV in the packets it sends to the
   ETE. The ICV format is shown in Figure 3:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|Key|Algorithm|       Message Authentication Code (MAC)       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ...

              Figure 3: Integrity Check Vector (ICV) Format

   As shown in the figure, the ICV begins with a 1-octet control field
   with a 1-bit (F)lag, a 2-bit Key identifier and a 5-bit Algorithm
   identifier. The control octet is followed by a variable-length
   Message Authentication Code (MAC).
   The ITE maintains a per ETE algorithm and secret key to calculate the
   MAC in each packet it will send to this ETE. (By default, the ITE
   sets the F bit and Algorithm fields to 0 to indicate use of the
   HMAC-SHA-1 algorithm with a 160 bit shared secret key to calculate an
   80 bit MAC per [RFC2104] over the leading 128 bytes of the packet.
   Other values for F and Algorithm are out of scope.) The ITE then
   sets a variable "USE_ICV" to TRUE, and includes an ICV in each packet
   it sends to this ETE; otherwise, it sets USE_ICV to FALSE.

   For each SEAL path, the ITE must also account for encapsulation
   header lengths. The ITE therefore maintains the per SEAL path
   constant values "SHLEN" set to the length of the SEAL header, "THLEN"
   set to the length of the outer encapsulating transport layer headers
   (or 0 if outer transport layer encapsulation is not used), "IHLEN"
   set to the length of the outer IP layer header, and "HLEN" set to
   (SHLEN+THLEN+IHLEN). (The ITE must include the length of the
   uncompressed headers even if header compression is enabled when
   calculating these lengths.) In addition, the ITE maintains a per
   SEAL path variable "MAXMTU" initialized to the maximum of 1500 bytes
   and the MTU of the underlying link minus HLEN.

   The ITE further sets a variable "MINMTU" to the minimum MTU for the
   SEAL path over which encapsulated packets will travel. For IPv6
   paths, the ITE sets MINMTU=1280 (see: [RFC2460]). For IPv4 paths,
   the ITE sets MINMTU=576 even though the true MINMTU for IPv4 is only
   68 bytes [RFC0791] (based on community consensus that the minimum
   IPv4 MTU is for all practical purposes now 576 bytes).

   The ITE can also set MINMTU to a larger value if there is reason to
   believe that the minimum path MTU is larger, or to a smaller value if
   there is reason to believe the MTU is smaller, e.g., if there may be
   additional encapsulations on the path.
   If this value proves too large, the ITE will receive PTB message
   feedback either from the ETE or from a router on the path and will be
   able to reduce its MINMTU to a smaller value.

   The ITE may instead maintain the packet sizing variables and
   constants as per ETE (rather than per SEAL path) values. In that
   case, the values reflect the lowest-common-denominator size across
   all of the SEAL paths associated with this ETE.

5.4.3.  SEAL Layer Pre-Processing

   The SEAL layer is logically positioned between the inner and outer
   network protocol layers, where the inner layer is seen as the (true)
   network layer and the outer layer is seen as the (virtual) data link
   layer. Each packet to be processed by the SEAL layer is either
   admitted into the tunnel interface by the inner network layer
   protocol as described in Section 5.4.1 or is undergoing
   re-encapsulation from within the tunnel interface. The SEAL layer
   sees the former class of packets as inner packets that include inner
   network and transport layer headers, and sees the latter class of
   packets as transitional SEAL packets that include the outer and SEAL
   layer headers that were inserted by the previous hop SEAL ITE. For
   these transitional packets, the SEAL layer re-encapsulates the packet
   with new outer and SEAL layer headers when it forwards the packet to
   the next hop SEAL ITE.

   We now discuss the SEAL layer pre-processing actions for these two
   classes of packets.

5.4.3.1.  Inner Packet Pre-Processing

   For each inner packet admitted into the tunnel interface, if the
   packet is itself a SEAL packet (i.e., one with the port number for
   SEAL in the transport layer header or one with the protocol number
   for SEAL in the IP layer header) and the LEVEL field of the SEAL
   header contains the value 0, the ITE silently discards the packet.
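
   As an illustration, the LEVEL check above can be sketched in Python
   using the first 32-bit word of the SEAL header shown in Figure 2.
   This is a minimal sketch and not part of the specification: it
   assumes network byte order with VER in the most significant bits,
   and the names SealWord, pack_first_word, unpack_first_word and
   admit_inner_seal_packet are hypothetical.

```python
import struct
from typing import NamedTuple


class SealWord(NamedTuple):
    """First 32-bit word of the SEAL header (field widths per Figure 2)."""
    ver: int      # 2 bits
    c: int        # 1 bit
    a: int        # 1 bit
    i: int        # 1 bit
    v: int        # 1 bit
    r: int        # 1 bit
    m: int        # 1 bit
    offset: int   # 8 bits, in 256-byte blocks
    nexthdr: int  # 8 bits
    link_id: int  # 5 bits
    level: int    # 3 bits


def pack_first_word(h: SealWord) -> bytes:
    # Assumption: VER occupies the two most significant bits of octet 0.
    flags = ((h.ver << 6) | (h.c << 5) | (h.a << 4) | (h.i << 3)
             | (h.v << 2) | (h.r << 1) | h.m)
    last = (h.link_id << 3) | h.level
    return struct.pack("!BBBB", flags, h.offset, h.nexthdr, last)


def unpack_first_word(data: bytes) -> SealWord:
    flags, offset, nexthdr, last = struct.unpack("!BBBB", data[:4])
    return SealWord(
        ver=flags >> 6, c=(flags >> 5) & 1, a=(flags >> 4) & 1,
        i=(flags >> 3) & 1, v=(flags >> 2) & 1, r=(flags >> 1) & 1,
        m=flags & 1, offset=offset, nexthdr=nexthdr,
        link_id=last >> 3, level=last & 0x7,
    )


def admit_inner_seal_packet(seal_header: bytes) -> bool:
    """Return False when an inner SEAL packet must be silently discarded
    because its LEVEL field is 0 (no further nesting permitted)."""
    return unpack_first_word(seal_header).level != 0
```

   A caller would apply admit_inner_seal_packet only to inner packets
   already identified as SEAL packets by port or protocol number.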

   Otherwise, for non-SEAL IPv4 inner packets with DF==0 in the IP
   header and IPv6 inner packets with a fragment header and with (MF=0;
   Offset=0), if the packet is larger than (MINMTU-HLEN) the ITE uses IP
   fragmentation to fragment the packet into N roughly equal-length
   pieces, where N is minimized and each fragment is significantly
   smaller than (MINMTU-HLEN) to allow for additional encapsulations in
   the path. The ITE then submits each fragment for SEAL encapsulation
   as specified in Section 5.4.4.

   For all other inner packets, if the packet is no larger than MAXMTU
   for the corresponding SEAL path the ITE submits it for SEAL
   encapsulation as specified in Section 5.4.4. Otherwise, the ITE
   drops the packet and sends an ordinary PTB message appropriate to the
   inner protocol version (subject to rate limiting) with the MTU field
   set to MAXMTU. (For IPv4 SEAL packets with DF==0, the ITE should set
   DF=1 and re-calculate the IPv4 header checksum before generating the
   PTB message in order to avoid bogon filters.) After sending the PTB
   message, the ITE discards the inner packet.

5.4.3.2.  Transitional SEAL Packet Pre-Processing

   For each transitional packet that is to be processed by the SEAL
   layer from within the tunnel interface, the ITE sets aside the SEAL
   encapsulation headers that were received from the previous hop.
   Next, if the packet is no larger than MAXMTU for the next hop SEAL
   path the ITE submits it for SEAL encapsulation as specified in
   Section 5.4.4. Otherwise, the ITE drops the packet and sends an SCMP
   Packet Too Big (SPTB) message to the previous hop subject to rate
   limiting (see: Section 5.6.1.1) with the MTU field set to MAXMTU.
   After sending the SPTB message, the ITE discards the packet.

5.4.4.  SEAL Encapsulation and Segmentation

   For each inner packet/fragment submitted for SEAL encapsulation, the
   ITE next encapsulates the packet in a SEAL header formatted as
   specified in Section 5.3. The SEAL header includes an Identification
   field when USE_ID is TRUE, followed by an ICV field when USE_ICV is
   TRUE.

   The ITE next sets C=0 in the SEAL header. The ITE also sets A=1 if
   ETE reachability determination is necessary (see: Section 5.4.6) or
   for stateful MTU determination (see Section 5.4.9). Otherwise, the
   ITE sets A=0.

   The ITE then sets LINK_ID to the value assigned to the underlying
   SEAL path, and sets NEXTHDR to the protocol number corresponding to
   the address family of the encapsulated inner packet. For example,
   the ITE sets NEXTHDR to the value '4' for encapsulated IPv4 packets
   [RFC2003], '41' for encapsulated IPv6 packets [RFC2473][RFC4213],
   '80' for encapsulated OSI/CLNP packets [RFC1070], etc.

   Next, if the inner packet is not itself a SEAL packet the ITE sets
   LEVEL to an integer value between 0 and 7 as a specification of the
   number of additional layers of nested SEAL encapsulations permitted.
   If the inner packet is a SEAL packet that is undergoing nested
   encapsulation, the ITE instead sets LEVEL to the value that appears
   in the inner packet's SEAL header minus 1. If the inner packet is
   undergoing SEAL re-encapsulation, the ITE instead copies the LEVEL
   value from the SEAL header of the packet to be re-encapsulated.

   Next, if the inner packet is no larger than (MINMTU-HLEN) or larger
   than 1500, the ITE sets (M=0; Offset=0). Otherwise, the ITE breaks
   the inner packet into N non-overlapping segments (where N is
   minimized and each segment is significantly smaller than
   (MINMTU-HLEN) to allow for additional encapsulations in the path)
   then appends a clone of the SEAL header from the first segment onto
   the head of each additional segment.
   In this process, the ITE MUST ensure that the inner packet's network
   and transport layer headers are included in the first segment if they
   can be accommodated within the available MINMTU. The ITE MUST also
   include an Identification field and set USE_ID=TRUE for each segment.
   The ITE then sets (M=1; Offset=0) in the first segment, sets (M=0/1;
   Offset=O(1)) in the second segment, sets (M=0/1; Offset=O(2)) in the
   third segment (if needed), etc., then finally sets (M=0; Offset=O(n))
   in the final segment (where O(i) is the number of 256 byte blocks
   that preceded this segment).

   When USE_ID is FALSE, the ITE next sets I=0. Otherwise, the ITE sets
   I=1 and writes a monotonically-incrementing integer value for this
   ETE in the Identification field beginning with 0 in the first packet
   transmitted. (For SEAL packets that have been split into multiple
   pieces, the ITE writes the same Identification value in each piece.)
   The monotonically-incrementing requirement is to satisfy ETEs that
   use this value for anti-replay purposes. The value is incremented
   modulo 2^32, i.e., it wraps back to 0 when the previous value was
   (2^32 - 1).

   When USE_ICV is FALSE, the ITE next sets V=0. Otherwise, the ITE
   sets V=1, includes an ICV and calculates the MAC using HMAC-SHA-1
   with a 160 bit secret key and 80 bit MAC field. Beginning with the
   SEAL header, the ITE sets the ICV field to 0, calculates the MAC over
   the leading 128 bytes of the packet (or up to the end of the packet
   if there are fewer than 128 bytes) and places the result in the MAC
   field. (For SEAL packets that have been split into multiple pieces,
   the ITE calculates a separate MAC for each piece.)
   The ITE then writes the value 0 in the F flag and 0x00 in the
   Algorithm field of the ICV control octet (other values for these
   fields, and other MAC calculation disciplines, are outside the scope
   of this document and may be specified in future documents).

   The ITE then adds the outer encapsulating headers as specified in
   Section 5.4.5.

5.4.5.  Outer Encapsulation

   Following SEAL encapsulation, the ITE next encapsulates each segment
   in the requisite outer transport (when necessary) and IP layer
   headers. When a transport layer header such as UDP or TCP is
   included, the ITE writes the port number for SEAL in the transport
   destination service port field.

   When UDP encapsulation is used, the ITE sets the UDP checksum field
   to zero for IPv4 packets and also sets the UDP checksum field to zero
   for IPv6 packets even though IPv6 generally requires UDP checksums.
   Further considerations for setting the UDP checksum field for IPv6
   packets are discussed in [RFC6935][RFC6936].

   The ITE then sets the outer IP layer headers the same as specified
   for ordinary IP encapsulation (e.g., [RFC1070][RFC2003], [RFC2473],
   [RFC4213], etc.) except that for ordinary SEAL packets the ITE copies
   the "TTL/Hop Limit", "Type of Service/Traffic Class" and "Congestion
   Experienced" values in the inner network layer header into the
   corresponding fields in the outer IP header. For transitional SEAL
   packets undergoing re-encapsulation, the ITE instead copies the
   "TTL/Hop Limit", "Type of Service/Traffic Class" and "Congestion
   Experienced" values in the outer IP header of the received packet
   into the corresponding fields in the outer IP header of the packet to
   be forwarded (i.e., the values are transferred between outer headers
   and *not* copied from the inner network layer header).
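
   As an illustration of the default ICV calculation specified in
   Section 5.4.4 (HMAC-SHA-1, 80-bit MAC, leading 128 bytes), the
   following Python sketch uses the standard hmac and hashlib modules.
   It is a sketch under stated assumptions rather than a definitive
   implementation: the caller is assumed to pass the packet starting at
   the SEAL header with the ICV field already zeroed, the truncation
   keeps the leftmost 80 bits as described in [RFC2104], and the
   function names are hypothetical.

```python
import hashlib
import hmac

MAC_COVERAGE = 128  # leading bytes of the packet covered by the MAC
MAC_LEN = 10        # 80-bit MAC, i.e., 10 octets


def seal_mac(key: bytes, packet_from_seal_header: bytes) -> bytes:
    """Compute the truncated HMAC-SHA-1 over the leading 128 bytes.

    Assumption: the ICV field inside the packet was set to 0 by the
    caller before this function is invoked.
    """
    covered = packet_from_seal_header[:MAC_COVERAGE]
    return hmac.new(key, covered, hashlib.sha1).digest()[:MAC_LEN]


def verify_seal_mac(key: bytes, packet_with_zeroed_icv: bytes,
                    received_mac: bytes) -> bool:
    """Constant-time comparison of the expected and received MAC."""
    expected = seal_mac(key, packet_with_zeroed_icv)
    return hmac.compare_digest(expected, received_mac)
```

   Bytes beyond the first 128 do not influence the result, which is why
   each segment of a segmented SEAL packet carries its own MAC.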

   The ITE also sets the IP protocol number to the appropriate value for
   the first protocol layer within the encapsulation (e.g., UDP, TCP,
   SEAL, etc.). When IPv6 is used as the outer IP protocol, the ITE
   then sets the flow label value in the outer IPv6 header the same as
   described in [RFC6438]. When IPv4 is used as the outer IP protocol,
   the ITE instead sets DF=0 in the IPv4 header to allow the packet to
   be fragmented if it encounters a restricting link (for IPv6 SEAL
   paths, the DF bit is implicitly set to 1).

   The ITE finally sends each outer packet via the underlying link
   corresponding to LINK_ID.

5.4.6.  Path Probing and ETE Reachability Verification

   All SEAL data packets sent by the ITE are considered implicit probes.
   SEAL data packets will elicit an SCMP message from the ETE if it
   needs to acknowledge a probe and/or report an error condition. SEAL
   data packets may also be dropped by either the ETE or a router on the
   path, which may or may not result in an ICMP message being returned
   to the ITE.

   The ITE processes ICMP messages as specified in Section 5.4.7.

   The ITE processes SCMP messages as specified in Section 5.6.2.

5.4.7.  Processing ICMP Messages

   When the ITE sends SEAL packets, it may receive ICMP error messages
   [RFC0792][RFC4443] from an ordinary router within the subnetwork.
   Each ICMP message includes an outer IP header, followed by an ICMP
   header, followed by a portion of the SEAL data packet that generated
   the error (also known as the "packet-in-error") beginning with the
   outer IP header.

   The ITE should process ICMPv4 Protocol Unreachable messages and
   ICMPv6 Parameter Problem messages with Code "Unrecognized Next Header
   type encountered" as a hint that the IP destination address does not
   implement SEAL.
   The ITE can optionally ignore ICMP messages that do not include
   sufficient information in the packet-in-error, or process them as a
   hint that the SEAL path may be failing.

   For other ICMP messages, the ITE should use any outer header
   information available as a first-pass authentication filter (e.g., to
   determine if the source of the message is within the same
   administrative domain as the ITE) and discard the message if
   first-pass filtering fails.

   Next, the ITE examines the packet-in-error beginning with the SEAL
   header. If the value in the Identification field (if present) is not
   within the window of packets the ITE has recently sent to this ETE,
   or if the MAC value in the SEAL header ICV field (if present) is
   incorrect, the ITE discards the message.

   Next, if the received ICMP message is a PTB the ITE sets the
   temporary variable "PMTU" for this SEAL path to the MTU value in the
   PTB message. If PMTU==0, the ITE consults a plateau table (e.g., as
   described in [RFC1191]) to determine PMTU based on the length field
   in the outer IP header of the packet-in-error. For example, if the
   ITE receives a PTB message with MTU==0 and length 4KB, it can set
   PMTU=2KB. If the ITE subsequently receives a PTB message with MTU==0
   and length 2KB, it can set PMTU=1792, etc. to a minimum value of
   PMTU=(1500+HLEN). If the ITE is performing stateful MTU
   determination for this SEAL path (see Section 5.4.9), the ITE next
   sets MAXMTU=MAX((PMTU-HLEN), 1500).

   If the ICMP message was not discarded, the ITE then transcribes it
   into a message to return to the previous hop. If the inner packet
   was a SEAL data packet, the ITE transcribes the ICMP message into an
   SCMP message. Otherwise, the ITE transcribes the ICMP message into a
   message appropriate for the inner protocol version.
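
   The PMTU and MAXMTU arithmetic for PTB messages described in this
   section can be sketched in Python as follows. This is illustrative
   only: the PLATEAUS list is a hypothetical table chosen to reproduce
   the 4KB, 2KB, 1792 example in the text (it is not the table from
   [RFC1191]), and the function names are assumptions.

```python
# Hypothetical plateau table, largest first; chosen only so that the
# 4KB -> 2KB -> 1792 example from the text is reproduced.
PLATEAUS = [65535, 32768, 16384, 8192, 4096, 2048, 1792]


def pmtu_from_ptb(ptb_mtu: int, packet_in_error_len: int, hlen: int) -> int:
    """Derive PMTU from a received PTB message.

    ptb_mtu is the MTU field of the PTB; packet_in_error_len is the
    length field of the outer IP header in the packet-in-error.
    """
    if ptb_mtu != 0:
        return ptb_mtu
    floor = 1500 + hlen  # minimum PMTU when a plateau must be guessed
    for plateau in PLATEAUS:
        if plateau < packet_in_error_len:
            return max(plateau, floor)
    return floor


def maxmtu_from_pmtu(pmtu: int, hlen: int) -> int:
    """MAXMTU=MAX((PMTU-HLEN), 1500), used with stateful MTU determination."""
    return max(pmtu - hlen, 1500)
```

   With HLEN=28, for example, a PTB reporting MTU==0 for a 4096-byte
   packet yields PMTU=2048 and therefore MAXMTU=2020.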

   To transcribe the message, the ITE extracts the inner packet from
   within the ICMP message packet-in-error field and uses it to generate
   a new message corresponding to the type of the received ICMP message.
   For SCMP messages, the ITE generates the message the same as
   described for ETE generation of SCMP messages in Section 5.6.1. For
   (S)PTB messages, the ITE writes (PMTU-HLEN) in the MTU field.

   The ITE finally forwards the transcribed message to the previous hop
   toward the inner source address.

5.4.8.  IPv4 Middlebox Reassembly Testing

   The ITE can perform a qualification exchange to ensure that the
   subnetwork correctly delivers fragments to the ETE. This procedure
   can be used, e.g., to determine whether there are middleboxes on the
   path that violate the [RFC1812], Section 5.2.6 requirement that: "A
   router MUST NOT reassemble any datagram before forwarding it".

   The ITE should use knowledge of its topological arrangement as an aid
   in determining when middlebox reassembly testing is necessary. For
   example, if the ITE is aware that the ETE is located somewhere in the
   public Internet, middlebox reassembly testing should not be
   necessary. If the ITE is aware that the ETE is located behind a NAT
   or a firewall, however, then reassembly testing can be used to detect
   middleboxes that do not conform to specifications.

   The ITE can perform a middlebox reassembly test by selecting a data
   packet to be used as a probe. While performing the test with real
   data packets, the ITE should select only inner packets that are no
   larger than (1500-HLEN) bytes for testing purposes. The ITE can also
   construct an explicit probe packet instead of using ordinary SEAL
   data packets.

   To generate an explicit probe packet, the ITE creates a packet buffer
   beginning with the same outer headers, SEAL header and inner network
   layer header that would appear in an ordinary data packet, then pads
   the packet with random data to a length that is at least 128 bytes
   but no longer than (1500-HLEN) bytes. The ITE then writes the value
   '0' in the inner network layer TTL (for IPv4) or Hop Limit (for IPv6)
   field.

   The ITE then sets C=0 in the SEAL header of the probe packet and sets
   the NEXTHDR field to the inner network layer protocol type. (The ITE
   may also set A=1 if it requires a positive acknowledgement;
   otherwise, it sets A=0.) Next, the ITE sets LINK_ID and LEVEL to the
   appropriate values for this SEAL path, sets Identification and I=1
   (when USE_ID is TRUE), then finally calculates the ICV and sets V=1
   (when USE_ICV is TRUE).

   The ITE then encapsulates the probe packet in the appropriate outer
   headers, splits it into two outer IPv4 fragments, then sends both
   fragments over the same SEAL path.

   The ITE should send a series of probe packets (e.g., 3-5 probes with
   1-second intervals between tests) instead of a single isolated probe
   in case of packet loss. If the ETE returns an SCMP PTB message with
   MTU != 0, then the SEAL path correctly supports fragmentation;
   otherwise, the ITE enables stateful MTU determination for this SEAL
   path as specified in Section 5.4.9.

   (Examples of middleboxes that may perform reassembly include stateful
   NATs and firewalls. Such devices could still allow for stateless MTU
   determination if they gather the fragments of a fragmented IPv4 SEAL
   data packet for packet analysis purposes but then forward the
   fragments on to the final destination rather than forwarding the
   reassembled packet.)

5.4.9.  Stateful MTU Determination

   SEAL supports a stateless MTU determination capability; however, the
   ITE may in some instances wish to impose a stateful MTU limit on a
   particular SEAL path. For example, when the ETE is situated behind a
   middlebox that performs IPv4 reassembly (see: Section 5.4.8) it is
   imperative that fragmentation be avoided. In other instances (e.g.,
   when the SEAL path includes performance-constrained links), the ITE
   may deem it necessary to cache a conservative static MTU in order to
   avoid sending large packets that would only be dropped due to an MTU
   restriction somewhere on the path.

   To determine a static MTU value, the ITE sends a series of probe
   packets of various sizes to the ETE with A=1 in the SEAL header and
   DF=1 in the outer IP header. The ITE then caches the size 'S' of the
   largest packet for which it receives a probe reply from the ETE by
   setting MAXMTU=MAX((S-HLEN), 1500) for this SEAL path.

   For example, the ITE could send probe packets of 4KB, followed by
   2KB, followed by 1792 bytes, etc. While probing, the ITE processes
   any ICMP PTB message it receives as a potential indication of probe
   failure then discards the message.

5.4.10.  Detecting Path MTU Changes

   When stateful MTU determination is used, the ITE SHOULD periodically
   reset MAXMTU and/or re-probe the path to determine whether MAXMTU has
   increased. If the path still has a too-small MTU, the ITE will
   receive a PTB message that reports a smaller size.

5.5.  ETE Specification

5.5.1.  Reassembly Buffer Requirements

   For IPv6, the ETE configures a minimum reassembly buffer size of
   (1500 + HLEN) bytes for the reassembly of outer IPv6 packets, i.e.,
   even though the true minimum reassembly size for IPv6 is only 1500
   bytes [RFC2460].
   For IPv4, the ETE also configures a minimum reassembly buffer size of
   (1500 + HLEN) bytes for the reassembly of outer IPv4 packets, i.e.,
   even though the true minimum reassembly size for IPv4 is only 576
   bytes [RFC1122].

   In addition to this outer reassembly buffer requirement, the ETE
   further configures a minimum SEAL reassembly buffer size of (1500 +
   HLEN) bytes for the reassembly of segmented SEAL packets (see:
   Section 5.5.4).

5.5.2.  Tunnel Neighbor Soft State

   When message origin authentication and integrity checking is
   required, the ETE maintains a per-ITE MAC calculation algorithm and a
   symmetric secret key to verify the MAC. When per-packet
   identification is required, the ETE also maintains a window of
   Identification values for the packets it has recently received from
   this ITE.

   When the tunnel neighbor relationship is bidirectional, the ETE
   further maintains a per SEAL path mapping of outer IP and transport
   layer addresses to the LINK_ID that appears in packets received from
   the ITE.

5.5.3.  IP-Layer Reassembly

   The ETE reassembles fragmented IP packets that are explicitly
   addressed to itself. For IP fragments that are received via a SEAL
   tunnel, the ETE SHOULD maintain conservative reassembly cache high-
   and low-water marks. When the size of the reassembly cache exceeds
   this high-water mark, the ETE SHOULD actively discard stale
   incomplete reassemblies (e.g., using an Active Queue Management (AQM)
   strategy) until the size falls below the low-water mark. The ETE
   SHOULD also actively discard any pending reassemblies that clearly
   have no opportunity for completion, e.g., when a considerable number
   of new fragments have arrived before a fragment that completes a
   pending reassembly arrives.
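
   The high- and low-water mark behavior described above can be sketched
   in Python as follows. This is a minimal sketch, not a recommended AQM
   strategy: it assumes that the "stale" reassemblies are simply the
   oldest ones, measures cache size in payload bytes, and uses the
   hypothetical class name ReassemblyCache.

```python
from collections import OrderedDict


class ReassemblyCache:
    """Pending reassemblies, trimmed oldest-first between water marks."""

    def __init__(self, high_water: int, low_water: int) -> None:
        if not 0 <= low_water < high_water:
            raise ValueError("need 0 <= low_water < high_water")
        self.high = high_water
        self.low = low_water
        self.size = 0                 # total cached payload bytes
        self.pending = OrderedDict()  # reassembly key -> list of fragments

    def add_fragment(self, key, payload: bytes) -> None:
        # A new key is appended at the end, so the first key in the
        # OrderedDict is always the oldest pending reassembly.
        self.pending.setdefault(key, []).append(payload)
        self.size += len(payload)
        if self.size > self.high:
            self._trim()

    def _trim(self) -> None:
        # Discard oldest reassemblies until the size falls below the
        # low-water mark.
        while self.pending and self.size >= self.low:
            _, fragments = self.pending.popitem(last=False)
            self.size -= sum(len(f) for f in fragments)
```

   Trimming all the way down to the low-water mark, rather than just
   below the high-water mark, avoids discarding on every subsequent
   fragment arrival while the cache is under pressure.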

   The ETE processes non-SEAL IP packets as specified in the normative
   references, i.e., it performs any necessary IP reassembly then
   discards the packet if it is larger than the reassembly buffer size
   or delivers the (fully-reassembled) packet to the appropriate upper
   layer protocol module.

   For SEAL packets, the ETE performs any necessary IP reassembly then
   submits the packet for SEAL decapsulation as specified in Section
   5.5.4. (Note that if the packet is larger than the reassembly buffer
   size, the ETE still examines the leading portion of the (partially)
   reassembled packet during decapsulation.)

5.5.4.  Decapsulation, SEAL-Layer Reassembly, and Re-Encapsulation

   For each SEAL packet accepted for decapsulation, when I==1 the ETE
   first examines the Identification field. If the Identification is
   not within the window of acceptable values for this ITE, the ETE
   silently discards the packet.

   Next, if V==1 the ETE SHOULD verify the MAC value (with the MAC field
   itself reset to 0) and silently discard the packet if the value is
   incorrect.

   Next, if the packet arrived as multiple IP fragments, the ETE sends
   an SPTB message back to the ITE with MTU set to the size of the
   largest fragment received minus HLEN (see: Section 5.6.1.1).

   Next, if the packet arrived as multiple IP fragments and the inner
   packet is larger than 1500 bytes, the ETE silently discards the
   packet; otherwise, it continues to process the packet.

   Next, if there is an incorrect value in a SEAL header field (e.g., an
   incorrect "VER" field value), the ETE discards the packet. If the
   SEAL header has C==0, the ETE also returns an SCMP "Parameter
   Problem" (SPP) message (see Section 5.6.1.2).

   Next, if the SEAL header has C==1, the ETE processes the packet as an
   SCMP packet as specified in Section 5.6.2.
   Otherwise, the ETE continues to process the packet as a SEAL data
   packet.

   Next, if the SEAL header has (M==1 || Offset!=0) the ETE checks to
   see if the other segments of this already-segmented SEAL packet have
   arrived, i.e., by looking for additional segments that have the same
   outer IP source address, destination address, source transport port
   number (if present) and SEAL Identification value. If the other
   segments have already arrived, the ETE discards the SEAL header and
   other outer headers from the non-initial segments and appends them
   onto the end of the first segment according to their offset value.
   Otherwise, the ETE caches the segment for at most 60 seconds while
   awaiting the arrival of its partners. During this process, the ETE
   discards any segments that are overlapping with respect to segments
   that have already been received. The ETE further SHOULD manage the
   SEAL reassembly cache the same as described for the IP-Layer
   Reassembly cache in Section 5.5.3, i.e., it SHOULD perform an early
   discard for any pending reassemblies that have low probability of
   completion.

   Next, if the SEAL header in the (reassembled) packet has A==1, the
   ETE sends an SPTB message back to the ITE with MTU=0 (see: Section
   5.6.1.1).

   Finally, the ETE discards the outer headers and processes the inner
   packet according to the header type indicated in the SEAL NEXTHDR
   field. If the inner (TTL / Hop Limit) field encodes the value 0, the
   ETE silently discards the packet. Otherwise, if the next hop toward
   the inner destination address is via a different interface than the
   SEAL packet arrived on, the ETE discards the SEAL header and delivers
   the inner packet either to the local host or to the next hop
   interface if the packet is not destined to the local host.
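
   The SEAL-layer reassembly step described in this section (placing
   segments by Offset, discarding overlaps, and completing when the
   M==0 segment and all earlier blocks are present) can be sketched in
   Python as follows. This is an illustrative sketch only: it assumes
   the segments passed in already share the same outer addresses, port
   and Identification, it omits the 60 second timer, and the function
   name reassemble_seal_segments is hypothetical.

```python
BLOCK = 256  # Offset is expressed in 256-byte blocks


def reassemble_seal_segments(segments):
    """Reassemble SEAL segments into the inner packet.

    segments: iterable of (offset_blocks, more_flag, payload) tuples with
    the SEAL and outer headers already removed from each payload.
    Returns the reassembled bytes, or None while segments are missing.
    """
    accepted = {}  # starting byte offset -> payload
    total = None   # packet length, known once the M==0 segment arrives
    for offset_blocks, more, payload in segments:
        start = offset_blocks * BLOCK
        end = start + len(payload)
        # Discard any segment that overlaps one already received.
        if any(start < s + len(p) and s < end for s, p in accepted.items()):
            continue
        accepted[start] = payload
        if not more:
            total = end
    if total is None:
        return None
    out = bytearray()
    for start in sorted(accepted):
        if start != len(out):
            return None  # a gap remains; keep waiting
        out += accepted[start]
    return bytes(out) if len(out) == total else None
```

   Because non-final segments must start on 256-byte block boundaries,
   a non-final segment whose length is not a multiple of 256 shows up
   here as a gap and the packet is never completed.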
1187 If the next hop is on the same interface the SEAL packet arrived on, 1188 however, the ETE submits the packet for SEAL re-encapsulation 1189 beginning with the specification in Section 5.4.3 above and without 1190 decrementing the value in the inner (TTL / Hop Limit) field. In this 1191 process, the packet remains within the tunnel (i.e., it does not exit 1192 and then re-enter the tunnel); hence, the packet is not discarded if 1193 the LEVEL field in the SEAL header contains the value 0. 1195 5.6. The SEAL Control Message Protocol (SCMP) 1197 SEAL provides a companion SEAL Control Message Protocol (SCMP) that 1198 uses the same message types and formats as for the Internet Control 1199 Message Protocol for IPv6 (ICMPv6) [RFC4443]. As for ICMPv6, each 1200 SCMP message includes a 32-bit header and a variable-length body. 1201 The ITE encapsulates the SCMP message in a SEAL header and outer 1202 headers as shown in Figure 4: 1204 +--------------------+ 1205 ~ outer IP header ~ 1206 +--------------------+ 1207 ~ other outer hdrs ~ 1208 +--------------------+ 1209 ~ SEAL Header ~ 1210 +--------------------+ +--------------------+ 1211 | SCMP message header| --> | SCMP message header| 1212 +--------------------+ +--------------------+ 1213 | | --> | | 1214 ~ SCMP message body ~ --> ~ SCMP message body ~ 1215 | | --> | | 1216 +--------------------+ +--------------------+ 1218 SCMP Message SCMP Packet 1219 before encapsulation after encapsulation 1221 Figure 4: SCMP Message Encapsulation 1223 The following sections specify the generation, processing and 1224 relaying of SCMP messages. 1226 5.6.1. 
Generating SCMP Error Messages 1228 ETEs generate SCMP error messages in response to receiving certain 1229 SEAL data packets using the format shown in Figure 5:
1231     0                   1                   2                   3
1232     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1233    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1234    |     Type      |     Code      |           Checksum            |
1235    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1236    |                      Type-Specific Data                       |
1237    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1238    |     As much of the invoking SEAL data packet as possible      |
1239    ~       (beginning with the SEAL header) without the SCMP       ~
1240    |               packet exceeding MINMTU bytes (*)               |
1242    (*) also known as the "packet-in-error"
1244                Figure 5: SCMP Error Message Format
1246 The error message includes the 32-bit SCMP message header, followed 1247 by a 32-bit Type-Specific Data field, followed by the leading portion 1248 of the invoking SEAL data packet beginning with the SEAL header as 1249 the "packet-in-error". The packet-in-error includes as much of the 1250 invoking packet as possible extending to a length that would not 1251 cause the entire SCMP packet following outer encapsulation to exceed 1252 MINMTU bytes. 1254 When the ETE processes a SEAL data packet for which the 1255 Identification and ICV values are correct but an error must be 1256 returned, it prepares an SCMP error message as shown in Figure 5. 1257 The ETE sets the Type and Code fields to the same values that would 1258 appear in the corresponding ICMPv6 message [RFC4443], but calculates 1259 the Checksum beginning with the SCMP message header using the 1260 algorithm specified for ICMPv4 in [RFC0792]. 1262 The ETE next encapsulates the SCMP message in the requisite SEAL and 1263 outer headers as shown in Figure 4.
During encapsulation, the ETE 1264 sets the outer destination address/port numbers of the SCMP packet to 1265 the values associated with the ITE and sets the outer source address/ 1266 port numbers to its own outer address/port numbers. 1268 The ETE then sets (C=1; A=0; M=0; Offset=0) in the SEAL header, then 1269 sets I, V, NEXTHDR and LEVEL to the same values that appeared in the 1270 SEAL header of the data packet. If the neighbor relationship between 1271 the ITE and ETE is unidirectional, the ETE next sets the LINK_ID 1272 field to the same value that appeared in the SEAL header of the data 1273 packet. Otherwise, the ETE sets the LINK_ID field to the value it 1274 would use in sending a SEAL packet to this ITE. 1276 When I==1, the ETE next sets the Identification field to an 1277 appropriate value for the ITE. If the neighbor relationship between 1278 the ITE and ETE is unidirectional, the ETE sets the Identification 1279 field to the same value that appeared in the SEAL header of the data 1280 packet. Otherwise, the ETE sets the Identification field to the 1281 value it would use in sending the next SEAL packet to this ITE. 1283 When V==1, the ETE then prepares the ICV field the same as specified 1284 for SEAL data packet encapsulation in Section 5.4.4. 1286 Finally, the ETE sends the resulting SCMP packet to the ITE the same 1287 as specified for SEAL data packets in Section 5.4.5. 1289 The following sections describe additional considerations for various 1290 SCMP error messages: 1292 5.6.1.1. Generating SCMP Packet Too Big (SPTB) Messages 1294 An ETE generates an SPTB message when it receives a SEAL data packet 1295 that arrived as multiple outer IP fragments. The ETE prepares the 1296 SPTB message the same as for the corresponding ICMPv6 PTB message, 1297 and writes the length of the largest outer IP fragment received minus 1298 HLEN in the MTU field of the message. 
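As a rough illustration of the message preparation described in Section 5.6.1 and of the MTU rule above, the following sketch builds an SPTB message body. It is a sketch under stated assumptions rather than a definitive implementation: the function and parameter names are hypothetical, `minmtu`, `hlen` and `encaps_len` (the bytes added by the SEAL header and outer headers) are supplied by the caller because their values are defined elsewhere in the specification, and the SEAL and outer header encapsulation itself is not shown. The Type value 2 is the ICMPv6 Packet Too Big type from [RFC4443], and the checksum covers only the SCMP message (no pseudo-header), following the ICMPv4 algorithm of [RFC0792].

```python
import struct

ICMPV6_PACKET_TOO_BIG = 2  # Type from RFC 4443; the Code is 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_sptb(largest_fragment_len: int, hlen: int, invoking_packet: bytes,
               minmtu: int, encaps_len: int) -> bytes:
    """Return the SCMP message (header, MTU field, packet-in-error).

    invoking_packet is assumed to begin with the SEAL header of the
    SEAL data packet that triggered the error.
    """
    mtu = largest_fragment_len - hlen
    # Leave room for the 8 bytes of SCMP header plus Type-Specific Data
    # and for the encapsulation headers added afterwards.
    room = max(minmtu - encaps_len - 8, 0)
    packet_in_error = invoking_packet[:room]
    unsummed = struct.pack("!BBHI", ICMPV6_PACKET_TOO_BIG, 0, 0, mtu)
    csum = internet_checksum(unsummed + packet_in_error)
    header = struct.pack("!BBHI", ICMPV6_PACKET_TOO_BIG, 0, csum, mtu)
    return header + packet_in_error
```

A receiver can check the result by summing the whole message with the same routine; a correctly formed message yields zero.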
1300 The ETE also generates an SPTB message when it accepts a SEAL 1301 protocol data packet with A==1 in the SEAL header. The ETE prepares 1302 the SPTB message the same as above, except that it writes the value 0 1303 in the MTU field. 1305 5.6.1.2. Generating Other SCMP Error Messages 1307 An ETE generates an SCMP "Destination Unreachable" (SDU) message 1308 under the same circumstances that an IPv6 system would generate an 1309 ICMPv6 Destination Unreachable message. 1311 An ETE generates an SCMP "Parameter Problem" (SPP) message when it 1312 receives a SEAL packet with an incorrect value in the SEAL header. 1314 TEs generate other SCMP message types using methods and procedures 1315 specified in other documents. For example, SCMP message types used 1316 for tunnel neighbor coordinations are specified in VET 1317 [I-D.templin-intarea-vet]. 1319 5.6.2. Processing SCMP Error Messages 1321 An ITE may receive SCMP messages with C==1 in the SEAL header after 1322 sending packets to an ETE. The ITE first verifies that the outer 1323 addresses of the SCMP packet are correct, and (when I==1) that the 1324 Identification field contains an acceptable value. The ITE next 1325 verifies that the SEAL header fields are set correctly as specified 1326 in Section 5.6.1. When V==1, the ITE then verifies the ICV. The ITE 1327 next verifies the Checksum value in the SCMP message header. If any 1328 of these values are incorrect, the ITE silently discards the message; 1329 otherwise, it processes the message as follows: 1331 5.6.2.1. Processing SCMP PTB Messages 1333 After an ITE sends a SEAL data packet to an ETE, it may receive an 1334 SPTB message with a packet-in-error containing the leading portion of 1335 the packet (see: Section 5.6.1.1). For SPTB messages with MTU==0, 1336 the ITE processes the message as confirmation that the ETE received a 1337 SEAL data packet with A==1 in the SEAL header. The ITE then discards 1338 the message. 
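The validation order of Section 5.6.2 and the MTU==0 case above might be sketched as follows. This is a simplified illustration with hypothetical names: the outer address, Identification, SEAL header field and ICV checks are collapsed into a single caller-supplied boolean because they depend on neighbor state defined elsewhere, and only the Checksum test and the SPTB classification are shown.

```python
import struct

ICMPV6_PACKET_TOO_BIG = 2  # Type from RFC 4443


def checksum_ok(message: bytes) -> bool:
    """True if the one's complement sum over the whole message is 0xFFFF."""
    if len(message) % 2:
        message += b"\x00"
    total = sum(struct.unpack(f"!{len(message) // 2}H", message))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


def classify_scmp(message: bytes, header_checks_ok: bool):
    """Return (action, mtu) for a received SCMP error message.

    header_checks_ok stands in for the outer address, Identification,
    SEAL header field and ICV checks (an assumption of this sketch).
    """
    if not header_checks_ok or len(message) < 8 or not checksum_ok(message):
        return ("discard", None)  # silently discard
    msg_type, _code, _csum, type_specific = struct.unpack("!BBHI", message[:8])
    if msg_type != ICMPV6_PACKET_TOO_BIG:
        return ("other", None)  # handled by other procedures
    if type_specific == 0:
        return ("probe-ack", None)  # ETE confirmed a packet with A==1
    return ("size-limit", type_specific)  # MTU!=0: packet size limitation
```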
1340 For SPTB messages with MTU!=0, the ITE processes the message as an 1341 indication of a packet size limitation as follows. If the inner 1342 packet is no larger than 1500 bytes, the ITE reduces its MINMTU value 1343 for this ETE. If the inner packet length is larger than 1500 bytes and the 1344 MTU value is not substantially less than MINMTU bytes, the value is 1345 likely to reflect the true MTU of the restricting link on the path to 1346 the ETE; otherwise, a router on the path may be generating runt 1347 fragments. 1349 In that case, the ITE can consult a plateau table (e.g., as described 1350 in [RFC1191]) to rewrite the MTU value to a reduced size. For 1351 example, if the ITE receives an IPv4 SPTB message with MTU==256 and 1352 inner packet length 4KB, it can rewrite the MTU to 2KB. If the ITE 1353 subsequently receives an IPv4 SPTB message with MTU==256 and inner 1354 packet length 2KB, it can rewrite the MTU to 1792, etc., to a minimum 1355 of 1500 bytes. If the ITE is performing stateful MTU determination 1356 for this SEAL path, it then writes the new MTU value minus HLEN in 1357 MAXMTU. 1359 The ITE then checks its forwarding tables to discover the previous 1360 hop toward the source address of the inner packet. If the previous 1361 hop is reached via the same tunnel interface the SPTB message arrived 1362 on, the ITE relays the message to the previous hop. In order to 1363 relay the message, the ITE first writes zero in the Identification and 1364 ICV fields of the SEAL header within the packet-in-error. The ITE 1365 next rewrites the outer SEAL header fields with values corresponding 1366 to the previous hop and recalculates the MAC using the MAC 1367 calculation parameters associated with the previous hop.
Next, the 1368 ITE replaces the SPTB's outer headers with headers of the appropriate 1369 protocol version and fills in the header fields as specified in 1370 Section 5.4.5, where the destination address/port correspond to the 1371 previous hop and the source address/port correspond to the ITE. The 1372 ITE then sends the message to the previous hop the same as if it were 1373 issuing a new SPTB message. (Note that, in this process, the values 1374 within the SEAL header of the packet-in-error are meaningless to the 1375 previous hop and therefore cannot be used by the previous hop for 1376 authentication purposes.) 1378 If the previous hop is not reached via the same tunnel interface, the 1379 ITE instead transcribes the message into a format appropriate for the 1380 inner packet (i.e., the same as described for transcribing ICMP 1381 messages in Section 5.4.7) and sends the resulting transcribed 1382 message to the original source. (NB: if the inner packet within the 1383 SPTB message is an IPv4 SEAL packet with DF==0, the ITE should set 1384 DF=1 and re-calculate the IPv4 header checksum while transcribing the 1385 message in order to avoid bogon filters.) The ITE then discards the 1386 SPTB message. 1388 Note that the ITE may receive an SPTB message from another ITE that 1389 is at the head end of a nested level of encapsulation. The ITE has 1390 no security associations with this nested ITE, hence it should 1391 consider this SPTB message the same as if it had received an ICMP PTB 1392 message from an ordinary router on the path to the ETE. That is, the 1393 ITE should examine the packet-in-error field of the SPTB message and 1394 only process the message if it is able to recognize the packet as one 1395 it had previously sent. 1397 5.6.2.2. Processing Other SCMP Error Messages 1399 An ITE may receive an SDU message with an appropriate code under the 1400 same circumstances that an IPv6 node would receive an ICMPv6 1401 Destination Unreachable message. 
The ITE either transcribes or 1402 relays the message toward the source address of the inner packet 1403 within the packet-in-error the same as specified for SPTB messages in 1404 Section 5.6.2.1. 1406 An ITE may receive an SPP message when the ETE receives a SEAL packet 1407 with an incorrect value in the SEAL header. The ITE should examine 1408 the SEAL header within the packet-in-error to determine whether a 1409 different setting should be used in subsequent packets, but does not 1410 relay the message further. 1412 TEs process other SCMP message types using methods and procedures 1413 specified in other documents. For example, SCMP message types used 1414 for tunnel neighbor coordinations are specified in VET 1415 [I-D.templin-intarea-vet]. 1417 6. Link Requirements 1419 Subnetwork designers are expected to follow the recommendations in 1420 Section 2 of [RFC3819] when configuring link MTUs. 1422 7. End System Requirements 1424 End systems are encouraged to implement end-to-end MTU assurance 1425 (e.g., using Packetization Layer Path MTU Discovery (PLPMTUD) per 1426 [RFC4821]) even if the subnetwork is using SEAL. 1428 When end systems use PLPMTUD, SEAL will ensure that the tunnel 1429 behaves as a link in the path that assures an MTU of at least 1500 1430 bytes while not precluding discovery of larger MTUs. The PLPMTUD 1431 mechanism will therefore be able to function as designed in order to 1432 discover and utilize larger MTUs. 1434 8. Router Requirements 1436 Routers within the subnetwork are expected to observe the standard IP 1437 router requirements, including the implementation of IP fragmentation 1438 and reassembly as well as the generation of ICMP messages 1439 [RFC0792][RFC1122][RFC1812][RFC2460][RFC4443][RFC6434].
1441 Note that, even when routers support existing requirements for the 1442 generation of ICMP messages, these messages are often filtered and 1443 discarded by middleboxes on the path to the original source of the 1444 message that triggered the ICMP. It is therefore not possible to 1445 assume delivery of ICMP messages even when routers are correctly 1446 implemented. 1448 9. Nested Encapsulation Considerations 1450 SEAL supports nested tunneling for up to 8 layers of encapsulation. 1451 In this model, the SEAL ITE has a tunnel neighbor relationship only 1452 with ETEs at its own nesting level, i.e., it does not have a tunnel 1453 neighbor relationship with other ITEs, nor with ETEs at other nesting 1454 levels. 1456 Therefore, when an ITE 'A' within an outer nesting level needs to 1457 return an error message to an ITE 'B' within an inner nesting level, 1458 it generates an ordinary ICMP error message the same as if it were an 1459 ordinary router within the subnetwork. 'B' can then perform message 1460 validation as specified in Section 5.4.7, but full message origin 1461 authentication is not possible. 1463 Since ordinary ICMP messages are used for coordinations between ITEs 1464 at different nesting levels, nested SEAL encapsulations should only 1465 be used when the ITEs are within a common administrative domain 1466 and/or when there is no ICMP filtering middlebox such as a firewall 1467 or NAT between them. An example would be a recursive nesting of 1468 mobile networks, where the first network receives service from an 1469 ISP, the second network receives service from the first network, the 1470 third network receives service from the second network, etc. 1472 NB: As an alternative, the SCMP protocol could be extended to allow 1473 ITE 'A' to return an SCMP message to ITE 'B' rather than return an 1474 ICMP message. 
This would conceptually allow the control messages to 1475 pass through firewalls and NATs, however it would give no more 1476 message origin authentication assurance than for ordinary ICMP 1477 messages. It was therefore determined that the complexity of 1478 extending the SCMP protocol was of little value within the context of 1479 the anticipated use cases for nested encapsulations. 1481 10. Reliability Considerations 1483 Although a SEAL tunnel may span an arbitrarily-large subnetwork 1484 expanse, the IP layer sees the tunnel as a simple link that supports 1485 the IP service model. Links with high bit error rates (BERs) (e.g., 1486 IEEE 802.11) use Automatic Repeat-ReQuest (ARQ) mechanisms [RFC3366] 1487 to increase packet delivery ratios, while links with much lower BERs 1488 typically omit such mechanisms. Since SEAL tunnels may traverse 1489 arbitrarily-long paths over links of various types that are already 1490 either performing or omitting ARQ as appropriate, it would therefore 1491 be inefficient to require the tunnel endpoints to also perform ARQ. 1493 11. Integrity Considerations 1495 The SEAL header includes an integrity check field that covers the 1496 SEAL header and at least the inner packet headers. This provides for 1497 header integrity verification on a segment-by-segment basis for a 1498 segmented re-encapsulating tunnel path. 1500 Fragmentation and reassembly schemes must also consider packet- 1501 splicing errors, e.g., when two fragments from the same packet are 1502 concatenated incorrectly, when a fragment from packet X is 1503 reassembled with fragments from packet Y, etc. The primary sources 1504 of such errors include implementation bugs and wrapping IPv4 ID 1505 fields. 1507 In particular, the IPv4 16-bit ID field can wrap with only 64K 1508 packets with the same (src, dst, protocol)-tuple alive in the system 1509 at a given time [RFC4963]. 
When the IPv4 ID field is re-written by a 1510 middlebox such as a NAT or firewall, ID field wrapping can occur with 1511 even fewer packets alive in the system. It is therefore essential 1512 that IPv4 fragmentation and reassembly be avoided. 1514 12. IANA Considerations 1516 The IANA is requested to allocate a User Port number for "SEAL" in 1517 the 'port-numbers' registry. The Service Name is "SEAL", and the 1518 Transport Protocols are TCP and UDP. The Assignee is the IESG 1519 (iesg@ietf.org) and the Contact is the IETF Chair (chair@ietf.org). 1520 The Description is "Subnetwork Encapsulation and Adaptation Layer 1521 (SEAL)", and the Reference is the RFC-to-be currently known as 1522 'draft-templin-intarea-seal'. 1524 13. Security Considerations 1526 SEAL provides a segment-by-segment message origin authentication, 1527 integrity and anti-replay service. The SEAL header is sent in-the- 1528 clear the same as for the outer IP and other outer headers. In this 1529 respect, the threat model is no different than for IPv6 extension 1530 headers. Unlike IPv6 extension headers, however, the SEAL header can 1531 be protected by an integrity check that also covers the inner packet 1532 headers. 1534 An amplification/reflection/buffer overflow attack is possible when 1535 an attacker sends IP fragments with spoofed source addresses to an 1536 ETE in an attempt to clog the ETE's reassembly buffer and/or cause 1537 the ETE to generate a stream of SCMP messages returned to a victim 1538 ITE. The SCMP message ICV and Identification, as well as the inner 1539 headers of the packet-in-error, provide mitigation for the ETE to 1540 detect and discard SEAL segments with spoofed source addresses. 1542 Security issues that apply to tunneling in general are discussed in 1543 [RFC6169]. 1545 14.
Related Work 1547 Section 3.1.7 of [RFC2764] provides a high-level sketch for 1548 supporting large tunnel MTUs via a tunnel-level segmentation and 1549 reassembly capability to avoid IP level fragmentation. 1551 Section 3 of [RFC4459] describes inner and outer fragmentation at the 1552 tunnel endpoints as alternatives for accommodating the tunnel MTU. 1554 Section 4 of [RFC2460] specifies a method for inserting and 1555 processing extension headers between the base IPv6 header and 1556 transport layer protocol data. The SEAL header is inserted and 1557 processed in exactly the same manner. 1559 IPsec/AH [RFC4301][RFC4302] is used for full message integrity 1560 verification between tunnel endpoints, whereas SEAL only ensures 1561 integrity for the inner packet headers. The AYIYA proposal 1562 [I-D.massar-v6ops-ayiya] uses similar means for providing message 1563 authentication and integrity. 1565 SEAL, along with the Virtual Enterprise Traversal (VET) 1566 [I-D.templin-intarea-vet] tunnel virtual interface abstraction, are 1567 the functional building blocks for the Interior Routing Overlay 1568 Network (IRON) [I-D.templin-ironbis] and Routing and Addressing in 1569 Networks with Global Enterprise Recursion (RANGER) [RFC5720][RFC6139] 1570 architectures. 1572 The concepts of path MTU determination through the report of 1573 fragmentation and extending the IPv4 Identification field were first 1574 proposed in deliberations of the TCP-IP mailing list and the Path MTU 1575 Discovery Working Group (MTUDWG) during the late 1980's and early 1576 1990's. An historical analysis of the evolution of these concepts, 1577 as well as the development of the eventual PMTUD mechanism, appears 1578 in [RFC5320]. 1580 15. Implementation Status 1582 An early implementation of the first revision of SEAL [RFC5320] is 1583 available at: http://isatap.com/seal. 1585 16.
Acknowledgments 1587 The following individuals are acknowledged for helpful comments and 1588 suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Oliver 1589 Bonaventure, Teco Boot, Bob Braden, Brian Carpenter, Steve Casner, 1590 Ian Chakeres, Noel Chiappa, Remi Denis-Courmont, Remi Despres, Ralph 1591 Droms, Aurnaud Ebalard, Gorry Fairhurst, Washam Fan, Dino Farinacci, 1592 Joel Halpern, Sam Hartman, John Heffner, Thomas Henderson, Bob 1593 Hinden, Christian Huitema, Eliot Lear, Darrel Lewis, Joe Macker, Matt 1594 Mathis, Erik Nordmark, Dan Romascanu, Dave Thaler, Joe Touch, Mark 1595 Townsley, Ole Troan, Margaret Wasserman, Magnus Westerlund, Robin 1596 Whittle, James Woodyatt, and members of the Boeing Research & 1597 Technology NST DC&NT group. 1599 Discussions with colleagues following the publication of [RFC5320] 1600 have provided useful insights that have resulted in significant 1601 improvements to this, the Second Edition of SEAL. 1603 This document received substantial review input from the IESG and 1604 IETF area directorates in the February 2013 timeframe. IESG members 1605 and IETF area directorate representatives who contributed helpful 1606 comments and suggestions are gratefully acknowledged. 1608 Path MTU determination through the report of fragmentation was first 1609 proposed by Charles Lynn on the TCP-IP mailing list in 1987. 1610 Extending the IP identification field was first proposed by Steve 1611 Deering on the MTUDWG mailing list in 1989. 1613 17. References 1614 17.1. Normative References 1616 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 1617 September 1981. 1619 [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, 1620 RFC 792, September 1981. 1622 [RFC1122] Braden, R., "Requirements for Internet Hosts - 1623 Communication Layers", STD 3, RFC 1122, October 1989. 1625 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1626 Requirement Levels", BCP 14, RFC 2119, March 1997. 
1628 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 1629 (IPv6) Specification", RFC 2460, December 1998. 1631 [RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure 1632 Neighbor Discovery (SEND)", RFC 3971, March 2005. 1634 [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control 1635 Message Protocol (ICMPv6) for the Internet Protocol 1636 Version 6 (IPv6) Specification", RFC 4443, March 2006. 1638 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 1639 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 1640 September 2007. 1642 17.2. Informative References 1644 [FOLK] Shannon, C., Moore, D., and k. claffy, "Beyond Folklore: 1645 Observations on Fragmented Traffic", December 2002. 1647 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 1648 October 1987. 1650 [I-D.massar-v6ops-ayiya] 1651 Massar, J., "AYIYA: Anything In Anything", 1652 draft-massar-v6ops-ayiya-02 (work in progress), July 2004. 1654 [I-D.taylor-v6ops-fragdrop] 1655 Jaeggli, J., Colitti, L., Kumari, W., Vyncke, E., Kaeo, 1656 M., and T. Taylor, "Why Operators Filter Fragments and 1657 What It Implies", draft-taylor-v6ops-fragdrop-01 (work in 1658 progress), June 2013. 1660 [I-D.templin-intarea-vet] 1661 Templin, F., "Virtual Enterprise Traversal (VET)", 1662 draft-templin-intarea-vet-40 (work in progress), May 2013. 1664 [I-D.templin-ironbis] 1665 Templin, F., "The Interior Routing Overlay Network 1666 (IRON)", draft-templin-ironbis-15 (work in progress), 1667 May 2013. 1669 [RFC0994] International Organization for Standardization (ISO) and 1670 American National Standards Institute (ANSI), "Final text 1671 of DIS 8473, Protocol for Providing the Connectionless- 1672 mode Network Service", RFC 994, March 1986. 1674 [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP 1675 MTU discovery options", RFC 1063, July 1988. 1677 [RFC1070] Hagens, R., Hall, N., and M. 
Rose, "Use of the Internet as 1678 a subnetwork for experimentation with the OSI network 1679 layer", RFC 1070, February 1989. 1681 [RFC1146] Zweig, J. and C. Partridge, "TCP alternate checksum 1682 options", RFC 1146, March 1990. 1684 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 1685 November 1990. 1687 [RFC1701] Hanks, S., Li, T., Farinacci, D., and P. Traina, "Generic 1688 Routing Encapsulation (GRE)", RFC 1701, October 1994. 1690 [RFC1812] Baker, F., "Requirements for IP Version 4 Routers", 1691 RFC 1812, June 1995. 1693 [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery 1694 for IP version 6", RFC 1981, August 1996. 1696 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, 1697 October 1996. 1699 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 1700 Hashing for Message Authentication", RFC 2104, 1701 February 1997. 1703 [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in 1704 IPv6 Specification", RFC 2473, December 1998. 1706 [RFC2675] Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms", 1707 RFC 2675, August 1999. 1709 [RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A. 1711 Malis, "A Framework for IP Based Virtual Private 1712 Networks", RFC 2764, February 2000. 1714 [RFC2780] Bradner, S. and V. Paxson, "IANA Allocation Guidelines For 1715 Values In the Internet Protocol and Related Headers", 1716 BCP 37, RFC 2780, March 2000. 1718 [RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: 1719 Defeating Denial of Service Attacks which employ IP Source 1720 Address Spoofing", BCP 38, RFC 2827, May 2000. 1722 [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", 1723 RFC 2923, September 2000. 1725 [RFC3232] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by 1726 an On-line Database", RFC 3232, January 2002. 1728 [RFC3366] Fairhurst, G. and L. 
Wood, "Advice to link designers on 1729 link Automatic Repeat reQuest (ARQ)", BCP 62, RFC 3366, 1730 August 2002. 1732 [RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D., 1733 Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. 1734 Wood, "Advice for Internet Subnetwork Designers", BCP 89, 1735 RFC 3819, July 2004. 1737 [RFC4191] Draves, R. and D. Thaler, "Default Router Preferences and 1738 More-Specific Routes", RFC 4191, November 2005. 1740 [RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms 1741 for IPv6 Hosts and Routers", RFC 4213, October 2005. 1743 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 1744 Internet Protocol", RFC 4301, December 2005. 1746 [RFC4302] Kent, S., "IP Authentication Header", RFC 4302, 1747 December 2005. 1749 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1750 Network Tunneling", RFC 4459, April 2006. 1752 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 1753 Discovery", RFC 4821, March 2007. 1755 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 1756 Errors at High Data Rates", RFC 4963, July 2007. 1758 [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common 1759 Mitigations", RFC 4987, August 2007. 1761 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an 1762 IANA Considerations Section in RFCs", BCP 26, RFC 5226, 1763 May 2008. 1765 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 1766 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 1768 [RFC5320] Templin, F., "The Subnetwork Encapsulation and Adaptation 1769 Layer (SEAL)", RFC 5320, February 2010. 1771 [RFC5445] Watson, M., "Basic Forward Error Correction (FEC) 1772 Schemes", RFC 5445, March 2009. 1774 [RFC5720] Templin, F., "Routing and Addressing in Networks with 1775 Global Enterprise Recursion (RANGER)", RFC 5720, 1776 February 2010. 1778 [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010. 
1780 [RFC6139] Russert, S., Fleischman, E., and F. Templin, "Routing and 1781 Addressing in Networks with Global Enterprise Recursion 1782 (RANGER) Scenarios", RFC 6139, February 2011. 1784 [RFC6169] Krishnan, S., Thaler, D., and J. Hoagland, "Security 1785 Concerns with IP Tunneling", RFC 6169, April 2011. 1787 [RFC6335] Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S. 1788 Cheshire, "Internet Assigned Numbers Authority (IANA) 1789 Procedures for the Management of the Service Name and 1790 Transport Protocol Port Number Registry", BCP 165, 1791 RFC 6335, August 2011. 1793 [RFC6434] Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node 1794 Requirements", RFC 6434, December 2011. 1796 [RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label 1797 for Equal Cost Multipath Routing and Link Aggregation in 1798 Tunnels", RFC 6438, November 2011. 1800 [RFC6864] Touch, J., "Updated Specification of the IPv4 ID Field", 1801 RFC 6864, February 2013. 1803 [RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and 1804 UDP Checksums for Tunneled Packets", RFC 6935, April 2013. 1806 [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement 1807 for the Use of IPv6 UDP Datagrams with Zero Checksums", 1808 RFC 6936, April 2013. 1810 [RIPE] De Boer, M. and J. Bosma, "Discovering Path MTU Black 1811 Holes on the Internet using RIPE Atlas", July 2012. 1813 [SIGCOMM] Luckie, M. and B. Stasiewicz, "Measuring Path MTU 1814 Discovery Behavior", November 2010. 1816 [TBIT] Medina, A., Allman, M., and S. Floyd, "Measuring 1817 Interactions Between Transport Protocols and Middleboxes", 1818 October 2004. 1820 [WAND] Luckie, M., Cho, K., and B. Owens, "Inferring and 1821 Debugging Path MTU Discovery Failures", October 2005. 1823 Author's Address 1825 Fred L. Templin (editor) 1826 Boeing Research & Technology 1827 P.O. Box 3707 1828 Seattle, WA 98124 1829 USA 1831 Email: fltemplin@acm.org