idnits 2.17.1 

draft-templin-intarea-seal-68.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == The 'Obsoletes: ' line in the draft header should list only the
     _numbers_ of the RFCs which will be obsoleted by this document (if
     approved); it should not include the word 'RFC' in the list.

  == The 'Updates: ' line in the draft header should list only the _numbers_
     of the RFCs which will be updated by this document (if approved); it
     should not include the word 'RFC' in the list.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (January 03, 2014) is 3758 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'I-D.templin-aerolink' is mentioned on line 503, but
     not defined

  == Unused Reference: 'RFC0768' is defined on line 1170, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  == Outdated reference: A later version (-02) exists of
     draft-taylor-v6ops-fragdrop-01

  -- Obsolete informational reference (is this intentional?): RFC 1981
     (Obsoleted by RFC 8201)

  -- Obsolete informational reference (is this intentional?): RFC 6434
     (Obsoleted by RFC 8504)


     Summary: 1 error (**), 0 flaws (~~), 6 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                    F. Templin, Ed.
3	Internet-Draft                              Boeing Research & Technology
4	Obsoletes: rfc5320 (if approved)                        January 03, 2014
5	Updates: rfc2460 (if approved)
6	Intended status: Standards Track
7	Expires: July 7, 2014

9	        The Subnetwork Encapsulation and Adaptation Layer (SEAL)
10	                   draft-templin-intarea-seal-68.txt

12	Abstract

14	   This document specifies a Subnetwork Encapsulation and Adaptation
15	   Layer (SEAL).  SEAL operates over virtual topologies configured over
16	   connected IP network routing regions bounded by encapsulating border
17	   nodes.  These virtual topologies are manifested by tunnels that may
18	   span multiple IP and/or sub-IP layer forwarding hops, where they may
19	   incur packet duplication, packet reordering, source address spoofing
20	   and traversal of links with diverse Maximum Transmission Units
21	   (MTUs).  SEAL addresses these issues through the encapsulation and
22	   messaging mechanisms specified in this document.

24	Status of this Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at http://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on July 7, 2014.

41	Copyright Notice

43	   Copyright (c) 2014 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents
48	   (http://trustee.ietf.org/license-info) in effect on the date of
49	   publication of this document.  Please review these documents
50	   carefully, as they describe your rights and restrictions with respect
51	   to this document.  Code Components extracted from this document must
52	   include Simplified BSD License text as described in Section 4.e of
53	   the Trust Legal Provisions and are provided without warranty as
54	   described in the Simplified BSD License.

56	Table of Contents

58	   1.  Introduction
59	     1.1.  Motivation
60	     1.2.  Approach
61	     1.3.  Differences with RFC5320
62	   2.  Terminology
63	   3.  Requirements
64	   4.  Applicability Statement
65	   5.  SEAL Specification
66	     5.1.  SEAL Tunnel Model
67	     5.2.  SEAL Model of Operation
68	     5.3.  SEAL Encapsulation Format
69	     5.4.  ITE Specification
70	       5.4.1.  Tunnel MTU
71	       5.4.2.  Tunnel Neighbor Soft State
72	       5.4.3.  SEAL Layer Pre-Processing
73	       5.4.4.  SEAL Encapsulation and Fragmentation
74	       5.4.5.  Outer Encapsulation
75	       5.4.6.  Path MTU Probing and ETE Reachability Verification
76	       5.4.7.  Processing ICMP Messages
77	       5.4.8.  Detecting Path MTU Changes
78	     5.5.  ETE Specification
79	       5.5.1.  Reassembly Buffer Requirements
80	       5.5.2.  Tunnel Neighbor Soft State
81	       5.5.3.  IPv4-Layer Reassembly
82	       5.5.4.  Decapsulation, SEAL-Layer Reassembly, and
83	               Re-Encapsulation
84	   6.  Link Requirements
85	   7.  End System Requirements
86	   8.  Router Requirements
87	   9.  Multicast/Anycast Considerations
88	   10. Compatibility Considerations
89	   11. Nested Encapsulation Considerations
90	   12. Reliability Considerations
91	   13. Integrity Considerations
92	   14. IANA Considerations
93	   15. Security Considerations
94	   16. Related Work
95	   17. Implementation Status
96	   18. Acknowledgments
97	   19. References
98	     19.1. Normative References
99	     19.2. Informative References
100	   Author's Address

102	1.  Introduction

104	   As Internet technology and communication have grown and matured, many
105	   techniques have developed that use virtual topologies (manifested by
106	   tunnels of one form or another) over an actual network that supports
107	   the Internet Protocol (IP) [RFC0791][RFC2460].  Those virtual
108	   topologies have elements that appear as one network layer hop, but
109	   are actually multiple IP or sub-IP layer hops which comprise the
110	   "subnetwork" over which the tunnel operates.

112	   The use of IP encapsulation (also known as "tunneling") has long been
113	   considered as the means for creating such virtual topologies (e.g.,
114	   see [RFC2003][RFC2473]).  Tunnels serve a wide variety of purposes,
115	   including mobility, security, routing control, traffic engineering,
116	   multihoming, etc., and will remain an integral part of the
117	   architecture moving forward.  However, the encapsulation headers
118	   often include insufficiently provisioned per-packet identification
119	   values.  IP encapsulation also allows an attacker to produce
120	   encapsulated packets with spoofed source addresses even if the source
121	   address in the encapsulating header cannot be spoofed.  A denial-of-
122	   service vector that is not possible in non-tunneled subnetworks is
123	   therefore presented.

125	   Additionally, the insertion of an outer IP header reduces the
126	   effective Maximum Transmission Unit (MTU) visible to the inner
127	   network layer.  When IPv6 is used as the encapsulation protocol,
128	   original sources expect to be informed of the MTU limitation through
129	   IPv6 Path MTU discovery (PMTUD) [RFC1981].  When IPv4 is used, this
130	   reduced MTU can be accommodated through the use of IPv4
131	   fragmentation, but unmitigated in-the-network fragmentation has been
132	   deemed harmful through operational experience and studies conducted
133	   over the course of many years [FRAG][FOLK][RFC4963].  Additionally,
134	   classical IPv4 PMTUD [RFC1191] has known operational issues that are
135	   exacerbated by in-the-network tunnels [RFC2923][RFC4459].
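
   The MTU reduction described above is simple arithmetic.  The sketch
   below is illustrative only: the function name effective_inner_mtu is
   hypothetical, and the header sizes assume an outer IPv4 header with
   no options (20 bytes) or an outer IPv6 header with no extension
   headers (40 bytes).

```python
# Header sizes in bytes, assuming no IPv4 options or IPv6 extension headers.
IPV4_HEADER_LEN = 20
IPV6_HEADER_LEN = 40


def effective_inner_mtu(link_mtu: int, outer_header_len: int) -> int:
    """Return the largest inner packet that fits in one unfragmented outer packet."""
    if outer_header_len >= link_mtu:
        raise ValueError("outer header does not fit within the link MTU")
    return link_mtu - outer_header_len


# A 1500-byte link leaves 1480 bytes under IPv4 and 1460 bytes under IPv6.
print(effective_inner_mtu(1500, IPV4_HEADER_LEN))  # 1480
print(effective_inner_mtu(1500, IPV6_HEADER_LEN))  # 1460
```

   Any additional encapsulation header placed between the outer IP
   header and the inner packet reduces the result further by its own
   length.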

137	   The following subsections present further details on the motivation
138	   and approach for addressing these issues.

140	1.1.  Motivation

142	   Before discussing the approach, it is necessary to first understand
143	   the problems.  In both the Internet and private-use networks today,
144	   IP is ubiquitously deployed as the Layer 3 protocol.  The primary
145	   functions of IP are to provide for routing, addressing, and a
146	   fragmentation and reassembly capability used to accommodate links
147	   with diverse MTUs.  While it is well known that the IP address space
148	   is rapidly becoming depleted, there is also a growing awareness that
149	   other IP protocol limitations have already become or may soon become
150	   problematic.

152	   First, the Internet historically provided no means for discerning
153	   whether the source addresses of IP packets are authentic.  This
154	   shortcoming is being addressed more and more through the deployment
155	   of site border router ingress filters [RFC2827]; however, the use of
156	   encapsulation provides a vector for an attacker to circumvent
157	   filtering for the encapsulated packet even if filtering is correctly
158	   applied to the encapsulation header.  Secondly, the IP header does
159	   not include a well-behaved identification value unless the source has
160	   included a fragment header for IPv6 or unless the source permits
161	   fragmentation for IPv4.  These limitations preclude an efficient
162	   means for routers to detect duplicate packets and packets that have
163	   been re-ordered within the subnetwork.  Additionally, recent studies
164	   have shown that the arrival of fragments at high data rates can cause
165	   denial-of-service (DoS) attacks on performance-sensitive networking
166	   gear, prompting some administrators to configure their equipment to
167	   drop fragments unconditionally [I-D.taylor-v6ops-fragdrop].

169	   For IPv4 encapsulation, when fragmentation is permitted the header
170	   includes a 16-bit Identification field, meaning that at most 2^16
171	   unique packets with the same (source, destination, protocol)-tuple
172	   can be active in the network at the same time [RFC6864].  (When
173	   middleboxes such as Network Address Translators (NATs) re-write the
174	   Identification field to random values, the number of unique packets
175	   is even further reduced.)  Due to the escalating deployment of high-
176	   speed links, however, these numbers have become too small by several
177	   orders of magnitude for high data rate packet sources such as tunnel
178	   endpoints [RFC4963].
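
   To give a sense of scale, the following sketch estimates how quickly
   a 16-bit Identification space is exhausted.  It is an illustration
   under stated assumptions, not part of the specification: the
   function name id_wrap_seconds is hypothetical, and the workload
   (1500-byte packets sent back to back on a 1 Gbps link between a
   single source and destination) is assumed.

```python
def id_wrap_seconds(id_bits: int, link_bps: float, packet_bytes: int) -> float:
    """Seconds until an id_bits-wide counter wraps at a sustained packet rate."""
    packets_per_second = link_bps / (packet_bytes * 8)
    return (2 ** id_bits) / packets_per_second


# Assumed workload: 1500-byte packets sent back to back on a 1 Gbps link.
wrap = id_wrap_seconds(16, 1e9, 1500)
print(f"{wrap:.3f} s")  # 0.786 s
```

   Under these assumptions the Identification value repeats in less
   than one second, and the interval shrinks in proportion to any
   increase in link speed.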

180	   Furthermore, there are many well-known limitations pertaining to IPv4
181	   fragmentation and reassembly - even to the point that it has been
182	   deemed "harmful" in both classic and modern-day studies (see above).
183	   In particular, IPv4 fragmentation raises issues ranging from minor
184	   annoyances (e.g., in-the-network router fragmentation [RFC1981]) to
185	   the potential for major integrity issues (e.g., mis-association of
186	   the fragments of multiple IP packets during reassembly [RFC4963]).

188	   As a result of these perceived limitations, a fragmentation-avoiding
189	   technique for discovering the MTU of the forward path from a source
190	   to a destination node was devised through the deliberations of the
191	   Path MTU Discovery Working Group (MTUDWG) during the late 1980's
192	   through early 1990's which resulted in the publication of [RFC1191].
193	   In this negative feedback-based method, the source node provides
194	   explicit instructions to routers in the path to discard the packet
195	   and return an ICMP error message if an MTU restriction is
196	   encountered.  However, this approach has several serious shortcomings
197	   that lead to an overall "brittleness" [RFC2923].

199	   In particular, site border routers in the Internet have been known to
200	   discard ICMP error messages coming from the outside world.  This is
201	   due in large part to the fact that malicious spoofing of error
202	   messages in the Internet is trivial since there is no way to
203	   authenticate the source of the messages [RFC5927].  Furthermore, when
204	   a source node that relies on ICMP error message feedback for packets
205	   dropped due to an MTU restriction does not receive the messages, a
206	   path MTU-related black hole occurs.  This means that the source will
207	   continue to send packets that are too large and never receive an
208	   indication from the network that they are being discarded.  This
209	   behavior has been confirmed through documented studies showing clear
210	   evidence of PMTUD failures for both IPv4 and IPv6 in the Internet
211	   today [TBIT][WAND][SIGCOMM][RIPE].

213	   The issues with both IP fragmentation and this "classical" PMTUD
214	   method are exacerbated further when IP tunneling is used [RFC4459].
215	   For example, a tunnel ingress may be required to forward encapsulated
216	   packets into the subnetwork on behalf of hundreds, thousands, or even
217	   more original sources.  If the ingress allows IP fragmentation on the
218	   encapsulated packets, persistent fragmentation could lead to
219	   undetected data corruption due to Identification field wrapping
220	   and/or reassembly congestion at the tunnel egress.  If the ingress
221	   instead uses classical IP PMTUD it must rely on ICMP error messages
222	   coming from the subnetwork that may be suspect, subject to loss due
223	   to filtering middleboxes, or insufficiently provisioned for
224	   translation into error messages to be returned to the original
225	   sources.

227	   Although recent works have led to the development of a positive
228	   feedback-based end-to-end MTU determination scheme [RFC4821], they do
229	   not excuse tunnels from accounting for the encapsulation overhead
230	   they add to packets.  Moreover, in current practice existing
231	   tunneling protocols mask the MTU issues by selecting a "lowest common
232	   denominator" MTU that may be much smaller than necessary for most
233	   paths and difficult to change at a later date.  Therefore, a new
234	   approach to accommodate tunnels over links with diverse MTUs is
235	   necessary.

237	1.2.  Approach

239	   This document concerns subnetworks manifested through a virtual
240	   topology configured over a connected network routing region and
241	   bounded by encapsulating border nodes.  Example connected network
242	   routing regions include Mobile Ad hoc Networks (MANETs), enterprise
243	   networks, aviation networks and the global public Internet itself.
244	   Subnetwork border nodes forward unicast and multicast packets over
245	   the virtual topology across multiple IP and/or sub-IP layer
246	   forwarding hops that may introduce packet duplication and/or traverse
247	   links with diverse Maximum Transmission Units (MTUs).

249	   This document introduces a Subnetwork Encapsulation and Adaptation
250	   Layer (SEAL) for tunneling inner network layer protocol packets over
251	   IP subnetworks that connect Ingress and Egress Tunnel Endpoints
252	   (ITEs/ETEs) of border nodes.  It provides a modular specification
253	   designed to be tailored to specific associated tunneling protocols.
254	   (A transport-mode of operation is also possible but out of scope for
255	   this document.)

257	   SEAL treats tunnels that traverse the subnetwork as ordinary links
258	   that must support network layer services.  Moreover, SEAL provides
259	   dynamic mechanisms (including limited fragmentation and reassembly)
260	   to ensure a maximal path MTU over the tunnel.  This is in contrast to
261	   static approaches which avoid MTU issues by selecting a lowest common
262	   denominator MTU value that may be overly conservative for the vast
263	   majority of tunnel paths and difficult to change even when larger
264	   MTUs become available.

266	1.3.  Differences with RFC5320

268	   This specification of SEAL is descended from an experimental
269	   independent RFC publication of the same name [RFC5320].  However,
270	   this specification introduces a number of fundamental differences
271	   from the earlier publication.  This specification therefore obsoletes
272	   (i.e., it does not merely update) [RFC5320].

274	   First, [RFC5320] forms a 32-bit Identification value by concatenating
275	   the 16-bit IPv4 Identification field with a 16-bit Identification
276	   "extension" field in the SEAL header.  This means that [RFC5320] can
277	   only operate over IPv4 networks (since IPv6 headers do not include a
278	   16-bit Identification field) and that the SEAL Identification value
279	   can be corrupted if the Identification in the outer IPv4 header is
280	   rewritten.  In contrast, this specification includes a 32-bit
281	   Identification value that is independent of any identification fields
282	   found in the inner or outer IP headers, and is therefore compatible
283	   with any inner and outer IP protocol version combinations.

285	   Additionally, the SEAL fragmentation and reassembly procedures
286	   defined in [RFC5320] differ significantly from those found in this
287	   specification.  In particular, this specification defines a 13-bit
288	   Offset field that allows for finer-grained fragment sizes when SEAL
289	   fragmentation and reassembly are necessary.  In contrast, [RFC5320]
290	   includes only a 3-bit Segment field and performs reassembly through
291	   concatenation of consecutive segments.

293	   Finally, SEAL no longer uses the IPv4 fragmentation sensing method
294	   specified in [RFC5320] as well as in earlier versions of this
295	   document.  This departure is based on the fact that there is no way
296	   for the ITE or ETE to control the way in which middleboxes perform
297	   IPv4 fragmentation (e.g., largest fragment first, smallest fragment
298	   first, all fragments the same size, etc.).  Moreover, there may be
299	   middleboxes in the path that reassemble IPv4 fragmented packets
300	   before delivering them to the ETE as the final destination.  Use of
301	   IPv4 fragmentation sensing in the ETE also greatly complicated the
302	   specification and proved difficult to implement.  Therefore, although
303	   the IPv4 fragmentation sensing method is conceptually elegant and
304	   natural, it is no longer included.

306	2.  Terminology

308	   The following terms are defined within the scope of this document:

310	   subnetwork
311	      a virtual topology configured over a connected network routing
312	      region and bounded by encapsulating border nodes.

314	   IP
315	      used to generically refer to either Internet Protocol (IP)
316	      version, i.e., IPv4 or IPv6.

318	   Ingress Tunnel Endpoint (ITE)
319	      a portal over which an encapsulating border node (host or router)
320	      sends encapsulated packets into the subnetwork.

322	   Egress Tunnel Endpoint (ETE)
323	      a portal over which an encapsulating border node (host or router)
324	      receives encapsulated packets from the subnetwork.

326	   inner packet
327	      an unencapsulated network layer protocol packet (e.g., IPv4
328	      [RFC0791], OSI/CLNP [RFC0994], IPv6 [RFC2460], etc.) before any
329	      outer encapsulations are added.  Internet protocol numbers that
330	      identify inner packets are found in the IANA Internet Protocol
331	      registry [RFC3232].  SEAL protocol packets that incur an
332	      additional layer of SEAL encapsulation are also considered inner
333	      packets.

335	   outer IP packet
336	      a packet resulting from adding an outer IP header (and possibly
337	      other outer headers) to a SEAL-encapsulated inner packet.

339	   packet-in-error
340	      the leading portion of an invoking data packet encapsulated in the
341	      body of an error control message (e.g., an ICMPv4 [RFC0792] error
342	      message, an ICMPv6 [RFC4443] error message, etc.).

344	   Packet Too Big (PTB) message
345	      a control plane message indicating an MTU restriction (e.g., an
346	      ICMPv6 "Packet Too Big" message [RFC4443], an ICMPv4
347	      "Fragmentation Needed" message [RFC0792], etc.).

349	   Don't Fragment (DF) bit
350	      a bit that indicates whether the packet may be fragmented by the
351	      network.  The DF bit is explicitly included in the IPv4 header
352	      [RFC0791] and may be set to '0' to allow fragmentation or '1' to
353	      disallow further in-network fragmentation.  The bit is absent from
354	      the IPv6 header [RFC2460], but implicitly set to '1' because
355	      fragmentation can occur only at IPv6 sources.
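
   The DF definition above can be expressed as a small check on a raw
   IP header.  This is a minimal sketch: the function name
   may_fragment_in_network is hypothetical, and the code relies only on
   the IPv4 header layout of [RFC0791], in which the DF flag is the
   0x4000 bit of the 16-bit word at byte offset 6.

```python
import struct

IPV4_DF_MASK = 0x4000  # DF flag within the flags/fragment-offset word


def may_fragment_in_network(header: bytes) -> bool:
    """Return True if routers in the path are allowed to fragment the packet."""
    if not header:
        raise ValueError("empty header")
    version = header[0] >> 4
    if version == 4:
        if len(header) < 20:
            raise ValueError("truncated IPv4 header")
        (flags_and_offset,) = struct.unpack_from("!H", header, 6)
        return not (flags_and_offset & IPV4_DF_MASK)
    if version == 6:
        # IPv6 has no DF bit; it is implicitly '1' (source-only fragmentation).
        return False
    raise ValueError("not an IPv4 or IPv6 header")
```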

357	3.  Requirements

359	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
360	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
361	   document are to be interpreted as described in [RFC2119].  When used
362	   in lower case (e.g., must, must not, etc.), these words MUST NOT be
363	   interpreted as described in [RFC2119], but are rather interpreted as
364	   they would be in common English.

366	4.  Applicability Statement

368	   SEAL was originally motivated by the specific case of subnetwork
369	   abstraction for Mobile Ad hoc Networks (MANETs); however, the domain
370	   of applicability also extends to subnetwork abstractions over
371	   enterprise networks, mobile networks, aviation networks, ISP
372	   networks, SO/HO networks, the global public Internet itself, and any
373	   other connected network routing region.

375	   SEAL provides a network sublayer used during encapsulation of an
376	   inner network layer packet within outer encapsulating headers.  SEAL
377	   can also be used as a sublayer within a transport layer protocol data
378	   payload, where transport layer encapsulation is typically used for
379	   Network Address Translator (NAT) traversal as well as operation over
380	   subnetworks that give preferential treatment to certain "core"
381	   Internet protocols, e.g., TCP, UDP, etc.  (However, note that TCP
382	   encapsulation may not be appropriate for all use cases, particularly
383	   those that require low delay and/or delay variance.)  The SEAL header
384	   is processed in the same manner as for IPv6 extension headers, i.e.,
385	   it is not part of the outer IP header but rather allows for the
386	   creation of an arbitrarily extensible chain of headers in the same
387	   way that IPv6 does.

389	   To accommodate MTU diversity, the Ingress Tunnel Endpoint (ITE) may
390	   need to perform limited fragmentation which the Egress Tunnel
391	   Endpoint (ETE) reassembles.  The ITE and ETE further engage in
392	   minimal path probing to determine when the path can be traversed
393	   without fragmentation.  This allows the ITE to send whole packets
394	   instead of fragmented packets whenever possible.

396	   In practice, SEAL is typically used as an encapsulation sublayer in
397	   conjunction with existing tunnel types such as IPsec [RFC4301],
398	   GRE [RFC1701], IP-in-IPv6 [RFC2473], IP-in-IPv4 [RFC4213][RFC2003],
399	   etc.  When used with existing tunnel types that insert mid-layer
400	   headers between the inner and outer IP headers (e.g., IPsec, GRE,
401	   etc.), the SEAL header is inserted between the mid-layer headers and
402	   outer IP header.

404	5.  SEAL Specification

406	   The following sections specify the operation of SEAL:

408	5.1.  SEAL Tunnel Model

410	   SEAL is an encapsulation sublayer used within point-to-point, point-
411	   to-multipoint, and non-broadcast, multiple access (NBMA) tunnels.
412	   SEAL can also be used with multicast-capable tunnels, but the path
413	   probing mechanisms specified in the following sections may not always
414	   be sufficient to determine an optimal MTU for a multicast group.

416	   Each tunnel is configured over one or more underlying interfaces
417	   attached to subnetwork links, where each link represents a different
418	   subnetwork path.  The tunnel connects an ITE to one or more ETE
419	   "neighbors" via encapsulation across an underlying subnetwork, where
420	   each tunnel neighbor relationship is maintained over one or more
421	   subnetwork paths.  The tunnel neighbor relationship may be
422	   bidirectional, partially unidirectional or fully unidirectional.

424	   A bidirectional tunnel neighbor relationship is one over which both
425	   tunnel endpoints can exchange both data and control messages.  A
426	   partially unidirectional tunnel neighbor relationship allows the near
427	   end ITE to send data packets forward to the far end ETE, while the
428	   far end only returns control messages when necessary.  Finally, a
429	   fully unidirectional mode of operation is one in which the near end
430	   ITE can receive neither data nor control messages from the far end
431	   ETE.
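
   The three relationship types can be modeled as a simple
   classification of what the far end ETE is able to send back to the
   near end ITE.  The sketch below is illustrative only; the names
   NeighborMode and classify_neighbor are hypothetical and do not
   appear in this specification.

```python
from enum import Enum


class NeighborMode(Enum):
    BIDIRECTIONAL = "bidirectional"
    PARTIALLY_UNIDIRECTIONAL = "partially unidirectional"
    FULLY_UNIDIRECTIONAL = "fully unidirectional"


def classify_neighbor(returns_data: bool, returns_control: bool) -> NeighborMode:
    """Classify by what the far end ETE can send back to the near end ITE."""
    if returns_data and returns_control:
        return NeighborMode.BIDIRECTIONAL
    if returns_control:
        # Far end returns control messages only, and only when necessary.
        return NeighborMode.PARTIALLY_UNIDIRECTIONAL
    if not returns_data:
        return NeighborMode.FULLY_UNIDIRECTIONAL
    # Data without control messages is not a case described in the text.
    raise ValueError("unsupported combination: data without control messages")
```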

433	5.2.  SEAL Model of Operation

435	   SEAL-enabled ITEs encapsulate each inner packet in any ancillary
436	   tunnel protocol headers and trailers, a SEAL header, and any outer
437	   header encapsulations as shown in Figure 1:

439	                                +--------------------+
440	                                ~   outer IP header  ~
441	                                +--------------------+
442	                                ~  other outer hdrs  ~
443	                                +--------------------+
444	                                ~    SEAL header     ~
445	                                +--------------------+
446	                                ~   tunnel headers   ~
447	   +--------------------+       +--------------------+
448	   |                    |  -->  |                    |
449	   ~        Inner       ~  -->  ~        Inner       ~
450	   ~       Packet       ~  -->  ~       Packet       ~
451	   |                    |  -->  |                    |
452	   +--------------------+       +--------------------+
453	                                ~   tunnel trailers  ~
454	                                +--------------------+

456	                       Figure 1: SEAL Encapsulation

458	   The ITE inserts the SEAL header according to the specific tunneling
459	   protocol.  For simple encapsulation of an inner network layer packet
460	   within an outer IP header, the ITE inserts the SEAL header following
461	   the outer IP header and before the inner packet as: IP/SEAL/{inner
462	   packet}.

464	   For encapsulations over transports such as UDP, the ITE inserts the
465	   SEAL header following the outer transport layer header and before the
466	   inner packet, e.g., as IP/UDP/SEAL/{inner packet}.  In that case, the
467	   UDP header is seen as an "other outer header" as depicted in Figure 1
468	   and the outer IP and transport layer headers are together seen as the
469	   outer encapsulation headers.  (Note that outer transport layer
470	   headers such as UDP must sometimes be included to ensure that SEAL
471	   packets will traverse the path to the ETE without loss due to filtering
472	   middleboxes.  The ETE MUST accept both IP/SEAL and IP/UDP/SEAL as
473	   equivalent packets so that the ITE can discontinue outer transport
474	   layer encapsulation if the path supports raw IP/SEAL encapsulation.)

476	   For SEAL encapsulations that involve tunnel types that include
477	   ancillary tunnel headers (e.g., GRE, IPsec, etc.) the ITE inserts the
478	   SEAL header as a leading extension to the tunnel headers, i.e., the
479	   SEAL encapsulation appears as part of the same tunnel and not a
480	   separate tunnel.  For example, for GRE the ITE inserts the SEAL
481	   header as IP/SEAL/GRE/{inner packet}, and for IPsec the ITE inserts
482	   the SEAL header as IP/SEAL/IPsec-header/{inner packet}/IPsec-trailer.
483	   In such cases, SEAL considers the length of the inner packet only
484	   (i.e., not the other tunnel headers and trailers) when performing
485	   its packet size calculations.
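
   The header orderings described above all follow the layout of
   Figure 1.  The following sketch reproduces them as lists of layer
   names; the function name seal_layer_order and its parameters are
   hypothetical and serve only to make the ordering rule explicit.

```python
from typing import List, Sequence


def seal_layer_order(
    other_outer: Sequence[str] = (),
    tunnel_headers: Sequence[str] = (),
    tunnel_trailers: Sequence[str] = (),
) -> List[str]:
    """Return the layers of a SEAL packet from outermost to innermost."""
    return [
        "IP",              # outer IP header
        *other_outer,      # e.g., UDP
        "SEAL",            # SEAL header leads any ancillary tunnel headers
        *tunnel_headers,   # e.g., GRE or an IPsec header
        "{inner packet}",
        *tunnel_trailers,  # e.g., an IPsec trailer
    ]


print("/".join(seal_layer_order()))                        # IP/SEAL/{inner packet}
print("/".join(seal_layer_order(other_outer=["UDP"])))     # IP/UDP/SEAL/{inner packet}
print("/".join(seal_layer_order(tunnel_headers=["GRE"])))  # IP/SEAL/GRE/{inner packet}
```

   Only the length of the "{inner packet}" element enters into SEAL's
   packet size calculations.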

487	   SEAL supports both "nested" tunneling and "re-encapsulating"
488	   tunneling.  Nested tunneling occurs when a first tunnel is
489	   encapsulated within a second tunnel, which may then further be
490	   encapsulated within additional tunnels.  Nested tunneling can be
491	   useful, and stands in contrast to "recursive" tunneling which is an
492	   anomalous condition incurred due to misconfiguration or a routing
493	   loop.  Considerations for nested tunneling and avoiding recursive
494	   tunneling are discussed in Section 4 of [RFC2473] as well as in
495	   Section 11 of this document.

497	   Re-encapsulating tunneling occurs when a packet arrives at a first
498	   ETE, which then acts as an ITE to re-encapsulate and forward the
499	   packet to a second ETE connected to the same subnetwork.  In that
500	   case each ITE/ETE transition represents a segment of a bridged path
501	   between the ITE nearest the source and the ETE nearest the
502	   destination.  Uses for re-encapsulating tunneling are discussed in
503	   [I-D.templin-aerolink].  Combinations of nested and re-encapsulating
504	   tunneling are also naturally supported by SEAL.

506	   The SEAL ITE considers each underlying interface as the ingress
507	   attachment point to a separate subnetwork path to the ETE.  The ITE
508	   therefore may experience different path MTUs on different subnetwork
509	   paths.

511	   Finally, the SEAL ITE ensures that the inner network layer protocol
512	   will see a minimum MTU of 1500 bytes over each subnetwork path
513	   regardless of the outer network layer protocol version, i.e., even if
514	   a small amount of fragmentation and reassembly are necessary.  This
515	   is to avoid path MTU "black holes" for the minimum MTU configured by
516	   the vast majority of links in the Internet.

518	5.3.  SEAL Encapsulation Format

520	   The SEAL header shares the same format and IP protocol number ('44')
521	   as the IPv6 Fragment Header specified in Section 4.5 of [RFC2460].
522	   The SEAL header is differentiated from the IPv6 Fragment Header by
523	   defining bit number 30 as the "SEAL (S)" bit which is set to 1 when
524	   SEAL encapsulation is used and set to 0 for ordinary IPv6
525	   fragmentation.  SEAL therefore updates the IPv6 Fragment Header
526	   specification as shown in Figure 2:

528	       0                   1                   2                   3
529	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
530	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
531	      |  Next Header  |    Reserved   |      Fragment Offset    |R|S|M|
532	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
533	      |                         Identification                        |
534	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

536	                    Figure 2: SEAL Encapsulation Format
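
	   As a non-normative illustration of Figure 2, the following minimal
	   Python sketch packs and unpacks the 8-byte header.  The function and
	   dictionary key names are hypothetical and not part of this
	   specification; the bit positions follow the figure (Fragment Offset
	   in bits 16-28, R in bit 29, S in bit 30, M in bit 31).

```python
import struct

SEAL_HDR_LEN = 8  # two 32-bit words, as drawn in Figure 2


def pack_seal_header(next_header, offset, ident, s=1, m=0, r=0):
    """Pack the header of Figure 2 (hypothetical helper).

    `offset` is the Fragment Offset in 8-byte units (13 bits).
    """
    if not 0 <= offset < (1 << 13):
        raise ValueError("Fragment Offset must fit in 13 bits")
    # Low 16 bits of the first word: Offset (13 bits), then R, S, M.
    off_flags = (offset << 3) | ((r & 1) << 2) | ((s & 1) << 1) | (m & 1)
    # The Reserved octet is written as zero in this sketch.
    return struct.pack("!BBHI", next_header, 0, off_flags, ident & 0xFFFFFFFF)


def unpack_seal_header(data):
    """Return the header fields as a dict (hypothetical helper)."""
    next_header, _reserved, off_flags, ident = struct.unpack(
        "!BBHI", data[:SEAL_HDR_LEN])
    return {
        "next_header": next_header,
        "offset": off_flags >> 3,
        "r": (off_flags >> 2) & 1,
        "s": (off_flags >> 1) & 1,  # 1 for SEAL, 0 for ordinary IPv6 fragmentation
        "m": off_flags & 1,
        "ident": ident,
    }
```

	   For example, an atomic SEAL packet carrying IPv4 (Next Header 4)
	   has only the S bit set in the second 16-bit field.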

538	5.4.  ITE Specification

540	5.4.1.  Tunnel MTU

542	   The tunnel must present a stable MTU value to the inner network layer
543	   as the size for admission of inner packets into the tunnel.  Since
544	   tunnels may support a large set of subnetwork paths that accept
545	   widely varying maximum packet sizes, however, a number of factors
546	   should be taken into consideration when selecting a tunnel MTU.

548	   Due to the ubiquitous deployment of standard Ethernet and similar
549	   networking gear, the nominal Internet cell size has become 1500
550	   bytes; this is the de facto size that end systems have come to expect
551	   will either be delivered by the network without loss due to an MTU
552	   restriction on the path or a suitable ICMP Packet Too Big (PTB)
553	   message returned.  When large packets sent by end systems incur
554	   additional encapsulation at an ITE, however, they may be dropped
555	   silently within the tunnel since the network may not always deliver
556	   the necessary PTBs [RFC2923].  The ITE SHOULD therefore set a tunnel
557	   MTU of at least 1500 bytes and provide accommodations to ensure that
558	   packets up to that size are successfully conveyed to the ETE.

560	   The inner network layer protocol consults the tunnel MTU when
561	   admitting a packet into the tunnel.  For non-SEAL inner IPv4 packets
562	   with the IPv4 Don't Fragment (DF) bit cleared (i.e., DF==0), if the
563	   packet is larger than the tunnel MTU the inner IPv4 layer uses IPv4
564	   fragmentation to break the packet into fragments no larger than the
565	   MTU.  The ITE then admits each fragment into the tunnel as an
566	   independent packet.

568	   For all other inner packets, the inner network layer admits the
569	   packet if it is no larger than the tunnel MTU; otherwise, it drops
570	   the packet and sends a PTB error message to the source with the MTU
571	   value set to the tunnel MTU.  The message contains as much of the invoking
572	   packet as possible without the entire message exceeding the network
573	   layer minimum MTU size.

575	   The ITE can alternatively set an indefinite tunnel MTU such that all
576	   inner packets are admitted into the tunnel regardless of their size
577	   (practical maximums are 64KB for IPv4 and 4GB for IPv6 [RFC2675]).
578	   For ITEs that host applications that use the tunnel directly, this
579	   option must be carefully coordinated with protocol stack upper layers
580	   since some upper layer protocols (e.g., TCP) derive their packet
581	   sizing parameters from the MTU of the outgoing interface and as such
582	   may select too large an initial size.  This is not a problem for
583	   upper layers that use conservative initial maximum segment size
584	   estimates and/or when the tunnel can reduce the upper layer's maximum
585	   segment size, e.g., by reducing the size advertised in the MSS option
586	   of outgoing TCP messages (sometimes known as "MSS clamping").

588	   In light of the above considerations, the ITE SHOULD configure an
589	   indefinite MTU on *router* tunnels so that SEAL performs all
590	   subnetwork adaptation from within the tunnel as specified in the
591	   following sections.  The ITE MAY instead set a smaller MTU on *host*
592	   tunnels; in that case, the RECOMMENDED MTU is the maximum of 1500
593	   bytes and the smallest MTU among all of the underlying links minus
594	   the size of the encapsulation headers.
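
	   The host tunnel recommendation can be sketched as a one-line
	   calculation.  This sketch assumes the reading "the maximum of 1500
	   and (the smallest underlying link MTU minus the encapsulation header
	   size)"; the function name and arguments are hypothetical.

```python
def recommended_host_tunnel_mtu(link_mtus, hlen):
    """Sketch of the RECOMMENDED MTU for *host* tunnels.

    link_mtus: MTUs of all underlying links (bytes)
    hlen:      total size of the encapsulation headers (bytes)
    """
    return max(1500, min(link_mtus) - hlen)
```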

596	5.4.2.  Tunnel Neighbor Soft State

598	   The ITE maintains a number of soft state variables and constants.

600	   The ITE maintains a per-ETE window of Identification values for the
601	   packets it sends to the ETE.  The ITE increments the current
602	   Identification value monotonically (modulo 2^32) for each packet it
603	   sends.

605	   For each subnetwork path, the ITE must also account for encapsulation
606	   header lengths.  The ITE therefore maintains the per subnetwork path
607	   constant values "SHLEN" set to the length of the SEAL header, "THLEN"
608	   set to the length of the outer encapsulating transport layer headers
609	   (or 0 if outer transport layer encapsulation is not used), "IHLEN"
610	   set to the length of the outer IP layer header, and "HLEN" set to
611	   (SHLEN+THLEN+IHLEN).  When calculating these lengths, the ITE must
612	   include the length of the uncompressed headers even if header
613	   compression is enabled.  When SEAL is used in conjunction with tunnel
614	   types that insert additional headers/trailers such as GRE or IPsec,
615	   the length of the additional headers and trailers is also included in
616	   the HLEN calculation.

618	   The ITE also sets a global constant value "MINMTU" to 1500 bytes and
619	   sets a per subnetwork path constant value "FRAGMTU" to (1280-HLEN)
620	   bytes (where 1280 is the minimum path MTU for IPv6 [RFC2460]).  The
621	   value 1280 is used regardless of the outer IP protocol version even
622	   though the practical minimum MTU for IPv4 is only 576 bytes [RFC1122]
623	   and the theoretical minimum MTU for IPv4 is only 68 bytes [RFC0791].

625	   The value 1280 is applied also to IPv4 since IPv4 links with MTUs
626	   smaller than 1280 are presumably performance-constrained such that
627	   IPv4 fragmentation can be used to accommodate MTU underruns without
628	   risk of high data rate reassembly misassociations.

630	   The ITE also sets a per subnetwork path variable "MAXMTU" to the
631	   maximum of MINMTU and the MTU of the underlying interface minus HLEN.
632	   The ITE thereafter adjusts MAXMTU based on any PTB messages it
633	   receives from the subnetwork, but does not reduce MAXMTU below
634	   MINMTU.

636	   The ITE finally maintains a per subnetwork path boolean variable
637	   "DOFRAG", which is initially set to TRUE and may be reset to FALSE if
638	   the ITE discovers that the MTU on the path to the ETE is sufficient
639	   to accommodate packet sizes of MINMTU bytes or larger.
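
	   The per subnetwork path soft state described in this section could
	   be held in a small record such as the one sketched below.  This is
	   an illustrative sketch only: the class and field names are
	   hypothetical, and the "extra" field is an assumed way of accounting
	   for additional GRE/IPsec headers and trailers in HLEN.

```python
from dataclasses import dataclass, field

MINMTU = 1500        # global constant (bytes)
IPV6_MIN_MTU = 1280  # minimum path MTU for IPv6


@dataclass
class PathState:
    link_mtu: int        # MTU of the underlying interface
    ihlen: int           # outer IP header length (uncompressed)
    thlen: int = 0       # outer transport header length, 0 if unused
    shlen: int = 8       # SEAL header length
    extra: int = 0       # assumed: extra GRE/IPsec headers and trailers
    dofrag: bool = True  # initially TRUE
    maxmtu: int = field(init=False, default=MINMTU)

    def __post_init__(self):
        # MAXMTU starts as the larger of MINMTU and (interface MTU - HLEN).
        self.maxmtu = max(MINMTU, self.link_mtu - self.hlen)

    @property
    def hlen(self):
        return self.shlen + self.thlen + self.ihlen + self.extra

    @property
    def fragmtu(self):
        return IPV6_MIN_MTU - self.hlen
```

	   For instance, IPv4/UDP/SEAL over a 1500-byte link gives HLEN=36,
	   FRAGMTU=1244 and MAXMTU=1500.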

641	5.4.3.  SEAL Layer Pre-Processing

643	   The SEAL layer is logically positioned between the inner and outer
644	   network protocol layers, where the inner layer is seen as the (true)
645	   network layer and the outer layer is seen as the (virtual) data link
646	   layer.  Each packet to be processed by the SEAL layer is either
647	   admitted into the tunnel by the inner network layer protocol as
648	   described in Section 5.4.1 or is undergoing re-encapsulation from
649	   within the tunnel.  The SEAL layer sees the former class of packets
650	   as inner packets that include inner network and transport layer
651	   headers, and sees the latter class of packets as transitional SEAL
652	   packets that include the outer and SEAL layer headers that were
653	   inserted by the previous hop SEAL ITE.  For these transitional
654	   packets, the SEAL layer re-encapsulates the packet with new outer and
655	   SEAL layer headers when it forwards the packet to the next hop SEAL
656	   ETE.

658	   We now discuss the SEAL layer pre-processing actions for these two
659	   classes of packets.

661	5.4.3.1.  Inner Packet Pre-Processing

663	   For each IPv4 inner packet with DF==0 in the IP header, if the packet
664	   is larger than MINMTU bytes the ITE first uses standard IPv4
665	   fragmentation to fragment the packet into N pieces of at most MINMTU
666	   bytes each.  In this process, the ITE MUST additionally ensure that N
667	   is minimized, the first fragment is the largest fragment and no
668	   fragments are overlapping.  The ITE then submits each fragment for
669	   SEAL encapsulation as specified in Section 5.4.4.

671	   For all other inner packets, if the packet is no larger than MAXMTU
672	   the ITE submits it for SEAL encapsulation as specified in Section
673	   5.4.4.  Otherwise, the ITE discards the packet and sends a PTB
674	   message appropriate to the inner protocol version (subject to rate
675	   limiting) with the MTU field set to MAXMTU.
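
	   The two cases above can be sketched as a single decision function.
	   This is a simplified illustration, not a definitive implementation:
	   it assumes an IPv4 header of fixed length "ihl" on every fragment
	   (i.e., no options), omits rate limiting, and uses hypothetical names
	   throughout.

```python
MINMTU = 1500


def preprocess_inner(length, maxmtu, is_ipv4=False, df=True, ihl=20):
    """Return ("encap", [packet sizes]) or ("ptb", mtu)."""
    if is_ipv4 and not df:
        if length <= MINMTU:
            return ("encap", [length])
        # Non-final IPv4 fragments carry a multiple of 8 payload bytes.
        max_payload = (MINMTU - ihl) // 8 * 8
        payload = length - ihl
        sizes = []
        while payload > 0:
            chunk = min(max_payload, payload)
            sizes.append(ihl + chunk)  # full-size fragments first, remainder last
            payload -= chunk
        return ("encap", sizes)
    if length <= maxmtu:
        return ("encap", [length])
    # Too large: drop and report MAXMTU in a PTB for the inner protocol.
    return ("ptb", maxmtu)
```

	   Filling each fragment to the limit before starting the next keeps N
	   minimal and makes the first fragment at least as large as any other.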

677	5.4.3.2.  Transitional SEAL Packet Pre-Processing

679	   For each transitional packet that is to be processed by the SEAL
680	   layer from within the tunnel, if the packet is larger than MAXMTU for
681	   the next hop subnetwork path the ITE discards the packet and sends a
682	   PTB message appropriate to the inner protocol version (subject to
683	   rate limiting) with the MTU field set to MAXMTU.  Otherwise, the ITE
684	   sets aside the encapsulating SEAL and outer headers for later
685	   reference (see Section 5.4.5) and submits the inner packet for SEAL
686	   re-encapsulation as discussed in the following sections.

688	5.4.4.  SEAL Encapsulation and Fragmentation

690	   For each inner packet/fragment submitted for SEAL encapsulation, the
691	   ITE next encapsulates the packet in a SEAL header formatted as
692	   specified in Section 5.3.  The ITE next sets S=1 and sets the Next
693	   Header field to the protocol number corresponding to the address
694	   family of the encapsulated inner packet.  For example, the ITE sets
695	   the Next Header field to the value '4' for encapsulated IPv4 packets
696	   [RFC2003], '41' for encapsulated IPv6 packets [RFC2473][RFC4213],
697	   '47' for GRE [RFC1701], '80' for encapsulated OSI/CLNP packets
698	   [RFC1070], etc.

700	   Next, if the inner packet is no larger than FRAGMTU, or if the inner
701	   packet is larger than MINMTU, or if the DOFRAG flag is FALSE, the ITE
702	   sets (M=0; Offset=0) and considers the packet an "atomic fragment"
703	   (see: [RFC6946]).  Otherwise, the ITE fragments the inner packet
704	   using the fragmentation procedures specified in Section 4.5 of
705	   [RFC2460].  In this process, the ITE breaks the inner packet into two
706	   non-overlapping fragments, where the encapsulated SEAL packet
707	   containing the first fragment MUST be as large as possible without
708	   exceeding 1280 bytes (i.e., the IPv6 minimum MTU) and the
709	   encapsulated SEAL packet containing the second fragment MUST include
710	   the remainder of the inner packet.  This ensures that the entire IP
711	   header (plus extensions) is likely to fit within the first fragment
712	   and that the number of fragments is minimized.  The ITE then adds the
713	   outer encapsulating headers as specified in Section 5.4.5.
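
	   The atomic-versus-fragmented decision and the resulting fragment
	   sizes can be sketched as follows.  The function name and the tuple
	   layout are hypothetical; the sketch rounds the first fragment down
	   to a multiple of 8 bytes, as IPv6 fragmentation requires for
	   non-final fragments.

```python
def seal_fragment_plan(inner_len, hlen, dofrag=True, minmtu=1500):
    """Return a list of (offset_in_8_byte_units, payload_len, M) tuples."""
    fragmtu = 1280 - hlen
    if inner_len <= fragmtu or inner_len > minmtu or not dofrag:
        return [(0, inner_len, 0)]  # atomic fragment: Offset=0, M=0
    # Largest first fragment whose encapsulated size does not exceed 1280.
    first = fragmtu // 8 * 8
    return [(0, first, 1), (first // 8, inner_len - first, 0)]
```

	   With HLEN=36, a 1500-byte inner packet is split into payloads of
	   1240 and 260 bytes, while packets of 1244 bytes or less, or of more
	   than 1500 bytes, are sent as atomic fragments.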

715	5.4.5.  Outer Encapsulation

717	   Following SEAL encapsulation and fragmentation, the ITE next
718	   encapsulates each fragment in the requisite outer transport (when
719	   necessary) and IP layer headers.  When a transport layer header such
720	   as UDP or TCP is included, the ITE writes the port number for SEAL in
721	   the transport destination service port field.

723	   When UDP encapsulation is used, the ITE sets the UDP checksum field
724	   to zero for both IPv4 and IPv6 packets (see: [RFC6935][RFC6936]).

726	   The ITE then sets the outer IP layer headers the same as specified
727	   for ordinary IP encapsulation (e.g., [RFC1070][RFC2003], [RFC2473],
728	   [RFC4213], etc.) except that for ordinary SEAL packets the ITE copies
729	   the "TTL/Hop Limit", "Type of Service/Traffic Class" and "Congestion
730	   Experienced" values in the inner network layer header into the
731	   corresponding fields in the outer IP header.  For transitional SEAL
732	   packets undergoing re-encapsulation, the ITE instead copies the "TTL/
733	   Hop Limit", "Type of Service/Traffic Class" and "Congestion
734	   Experienced" values in the original outer IP header of the
735	   transitional packet into the corresponding fields in the new outer IP
736	   header of the packet to be forwarded (i.e., the values are
737	   transferred between outer headers and *not* copied from the inner
738	   network layer header).

740	   The ITE also sets the IP protocol number to the appropriate value for
741	   the first protocol layer within the encapsulation (e.g., UDP, TCP,
742	   IPv6 Fragment Header, etc.).  When IPv6 is used as the outer IP
743	   protocol, the ITE then sets the flow label value in the outer IPv6
744	   header the same as described in [RFC6438].  When IPv4 is used as the
745	   outer IP protocol, if the encapsulated SEAL packet is no larger than
746	   1280 bytes the ITE sets DF=0 in the IPv4 header to allow the packet
747	   to be fragmented if it encounters a restricting link; otherwise, the
748	   ITE sets DF=1 (for IPv6 subnetwork paths, the DF bit is absent but
749	   implicitly set to 1).  The ITE finally sends each outer packet via
750	   the corresponding underlying subnetwork path.

752	5.4.6.  Path MTU Probing and ETE Reachability Verification

754	   When the ITE is actively sending packets over a subnetwork path to an
755	   ETE, it also sends explicit probes subject to rate limiting to test
756	   the path MTU.  To generate a probe, the ITE creates an ICMPv6 Echo
757	   Request message [RFC4443] of length MINMTU bytes and encapsulates the
758	   message in a SEAL header and any other outer headers, i.e., with the
759	   length of the resulting SEAL packet being (MINMTU+HLEN) bytes.  It
760	   then sets (Offset=0; S=1; M=0) in the SEAL header, and also sets DF=1
761	   in the outer IP header when IPv4 is used.  It finally writes the
762	   value '58' in the Next Header field of the SEAL header to indicate
763	   that the message is a SEAL-encapsulated ICMPv6 message.

765	   The ITE sends such MINMTU probes to determine whether SEAL
766	   fragmentation is still necessary (see Section 5.4.4).  In particular,
767	   if the ITE sends a probe and receives a SEAL-encapsulated ICMPv6 Echo
768	   Reply message as a probe reply (see: Section 5.5.4), it SHOULD set DOFRAG
769	   for this subnetwork path to FALSE.  Note that the nominal probe size
770	   of MINMTU bytes is RECOMMENDED since probes slightly smaller than
771	   this size may be fragmented by the ITE of a nested tunnel further
772	   down the path.  For example, a successful probe size of 1400 bytes
773	   does not guarantee that fragmentation is not occurring at the ITE of
774	   another tunnel nesting level.  While this would not necessarily
775	   result in communication failure, it could yield poor performance not
776	   only for the other tunnel nesting levels but also for the ITE itself.

778	   The ITE can also send smaller probes to determine whether the ETE is
779	   still reachable over this subnetwork path.  The ITE prepares the
780	   probe as described above then sends the message to the ETE.  If the
781	   ITE receives a probe reply, its upper layers can consider the message
782	   as a reachability indication.  The ITE can also send larger probes to
783	   test for larger MTU sizes; however, SEAL considers probing for MTU
784	   sizes larger than MINMTU as an end-to-end consideration to be
785	   addressed by end systems (see: Section 7).

787	   Finally, the ITE can also send probes to detect whether an outer
788	   transport layer header is no longer necessary to reach this ETE.  For
789	   example, if the ITE sends its initial packets as IP/UDP/SEAL/*, it
790	   can send probes constructed as IP/SEAL/[probe] to determine whether
791	   the ETE is reachable without the use of UDP encapsulation.  If so,
792	   the ITE should also send a new MINMTU probe since switching to a new
793	   encapsulation format may result in a path change.

795	   While probing, the ITE processes ICMP messages as specified in
796	   Section 5.4.7.

798	5.4.7.  Processing ICMP Messages

800	   When the ITE sends SEAL packets, it may receive ICMP error messages
801	   [RFC0792][RFC4443] from a router on the path to the ETE.  Each ICMP
802	   message includes an outer IP header, followed by an ICMP header,
803	   followed by a portion of the SEAL packet that generated the error
804	   (also known as the "packet-in-error").  Note that the ITE may receive
805	   an ICMP message from either an ordinary router on the path or from
806	   another ITE that is at the head end of a nested level of
807	   encapsulation.  The ITE has no security associations with this nested
808	   ITE, hence it should consider the message the same as if it
809	   originated from an ordinary router.

811	   The ITE should process ICMP Protocol/Port Unreachable messages as a
812	   hint that the ETE does not implement SEAL.  The ITE can optionally
813	   ignore other ICMP messages that do not include sufficient information
814	   in the packet-in-error, or process them as a hint that the subnetwork
815	   path to the ETE may be failing.  The ITE then discards these types of
816	   messages.

818	   For other ICMP messages, the ITE SHOULD examine the SEAL data packet
819	   within the packet-in-error field.  If the IP source and/or
820	   destination addresses are invalid, or if the value in the SEAL header
821	   Identification field (if present) is not within the window of packets
822	   the ITE has recently sent to this ETE, the ITE discards the message.

824	   Next, if the received ICMP message is a PTB the ITE sets MAXMTU to
825	   the maximum of MINMTU and the MTU value in the message minus HLEN.
826	   If the MTU value in the message is smaller than (MINMTU+HLEN), the
827	   ITE also resets DOFRAG to TRUE and discards the message.
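
	   The PTB handling rule above can be sketched as a small pure
	   function.  The name and return convention are hypothetical, and the
	   sketch assumes the message has already passed the address and
	   Identification checks.

```python
MINMTU = 1500


def handle_ptb(ptb_mtu, hlen, dofrag):
    """Return (new_maxmtu, new_dofrag, keep_message)."""
    maxmtu = max(MINMTU, ptb_mtu - hlen)
    if ptb_mtu < MINMTU + hlen:
        # Path cannot carry MINMTU-sized inner packets unfragmented:
        # re-enable SEAL fragmentation and discard the message.
        return maxmtu, True, False
    return maxmtu, dofrag, True
```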

829	   If the ICMP message was not discarded, the ITE transcribes it into a
830	   message appropriate for the inner protocol version (e.g., ICMPv4 for
831	   IPv4, ICMPv6 for IPv6, etc.) and forwards the transcribed message to
832	   the previous hop toward the inner source address.

834	5.4.8.  Detecting Path MTU Changes

836	   The ITE SHOULD periodically reset MAXMTU to the MTU of the underlying
837	   subnetwork interface to determine whether the subnetwork path MTU has
838	   increased.  If the path still has a too-small MTU, the ITE will
839	   receive a PTB message that reports a smaller size.

841	5.5.  ETE Specification

843	5.5.1.  Reassembly Buffer Requirements

845	   The ETE MUST configure a minimum SEAL reassembly buffer size of
846	   (MINMTU+HLEN) bytes for the reassembly of fragmented SEAL packets
847	   (see: Section 5.5.4).  Note that the value "HLEN" may be variable and
848	   initially unknown to the ETE.  It is therefore RECOMMENDED that the
849	   ETE configure a slightly larger SEAL reassembly buffer size of 2048
850	   bytes (2KB).

852	   When IPv4 is used as the outer layer of encapsulation, the ETE MUST
853	   also configure a minimum IPv4 reassembly buffer size of 1280 bytes.

855	5.5.2.  Tunnel Neighbor Soft State

857	   The ETE maintains a window of Identification values for the packets
858	   it has recently received from this ITE as well as a window of
859	   Identification values for the packets it has recently sent to this
860	   ITE.

862	5.5.3.  IPv4-Layer Reassembly

864	   The ETE reassembles fragmented IPv4 packets that are explicitly
865	   addressed to itself.  For IPv4 fragments of SEAL packets, the ETE
866	   SHOULD maintain conservative reassembly cache high- and low-water
867	   marks.  When the size of the reassembly cache exceeds this high-water
868	   mark, the ETE SHOULD actively discard stale incomplete reassemblies
869	   (e.g., using an Active Queue Management (AQM) strategy) until the
870	   size falls below the low-water mark.  The ETE SHOULD also actively
871	   discard any pending reassemblies that clearly have no opportunity for
872	   completion, e.g., when a considerable number of new fragments have
873	   arrived before a fragment that completes a pending reassembly
874	   arrives.
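
	   One possible shape for such a cache is sketched below.  This
	   document does not mandate a particular discard policy; the sketch
	   assumes that "stale" means "oldest first", and the class and method
	   names are hypothetical.

```python
from collections import OrderedDict


class ReassemblyCache:
    def __init__(self, high, low):
        self.high, self.low = high, low
        self.size = 0
        self.pending = OrderedDict()  # key -> bytes buffered, oldest first

    def add(self, key, nbytes):
        """Account for nbytes of a fragment belonging to reassembly `key`."""
        self.pending[key] = self.pending.get(key, 0) + nbytes
        self.size += nbytes
        if self.size > self.high:
            self.evict()

    def evict(self):
        # Discard the oldest incomplete reassemblies until the cache
        # size falls below the low-water mark.
        while self.size >= self.low and self.pending:
            _key, nbytes = self.pending.popitem(last=False)
            self.size -= nbytes
```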

876	   The ETE processes IPv4 fragments as specified in the normative
877	   references, i.e., it performs any necessary IPv4 reassembly then
878	   submits the packet to the appropriate upper layer protocol module.
879	   For SEAL packets, the ETE then performs SEAL decapsulation as
880	   specified in Section 5.5.4.

882	5.5.4.  Decapsulation, SEAL-Layer Reassembly, and Re-Encapsulation

884	   For each SEAL packet accepted for decapsulation, the ETE first
885	   examines the Identification field.  If the Identification is not
886	   within the window of acceptable values for this ITE, the ETE silently
887	   discards the packet.
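
	   Because Identification values wrap modulo 2^32, the window test
	   needs modular arithmetic.  A minimal sketch follows; this document
	   does not specify how the window is represented, so the base-and-size
	   form and the function name are assumptions.

```python
MOD = 1 << 32


def in_window(ident, base, size):
    """True if ident lies in [base, base + size) modulo 2^32."""
    return (ident - base) % MOD < size
```

	   For example, with base 0xFFFFFFF0 and size 64, the value 5 is
	   inside the window even though the counter has wrapped.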

889	   Next, if the SEAL header has (Offset!=0 || M=1) the ETE submits the
890	   packet for reassembly as specified for IPv6 reassembly in Section 4.5
891	   of [RFC2460].  During the reassembly process, the ETE discards any
892	   fragments that are overlapping with respect to fragments that have
893	   already been received (see: [RFC5722]), and also discards any
894	   fragments that have M=1 in the SEAL header but do not contain an
895	   integer multiple of 8 bytes.  The ETE further SHOULD manage the SEAL
896	   reassembly cache the same as described for the IPv4-Layer Reassembly
897	   cache in Section 5.5.3, i.e., it SHOULD perform an early discard for
898	   any pending reassemblies that have low probability of completion.
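
	   The two per-fragment discard checks described above can be sketched
	   as follows.  The function name and the list-of-ranges bookkeeping
	   are hypothetical, and the sketch drops only the offending fragment,
	   as this section describes.

```python
def accept_fragment(received, offset_units, length, m):
    """Validate one SEAL fragment against those already held.

    received:     list of (start, end) byte ranges, updated in place
    offset_units: Fragment Offset field (8-byte units)
    length:       fragment payload length in bytes
    m:            More Fragments bit
    """
    if m == 1 and length % 8 != 0:
        return False  # non-final fragment must be a multiple of 8 bytes
    start = offset_units * 8
    end = start + length
    for s, e in received:
        if start < e and s < end:
            return False  # overlaps a fragment already received
    received.append((start, end))
    return True
```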

900	   Next, if the (reassembled) packet is an ICMPv6 Echo Request probe
901	   message, the ETE prepares an ICMPv6 Echo Reply probe reply message to
902	   send back to the ITE.  The ETE then encapsulates the probe reply as
903	   specified in Section 5.4.4 and fragments the message if necessary
904	   according to the DOFRAG flag (i.e., to ensure that the probe reply is
905	   delivered to the ITE).  The ETE then sends the probe reply to the ITE
906	   and discards the probe.  When the ITE receives the probe reply, it
907	   reassembles the message if necessary and processes it as specified in
908	   Section 5.4.6.

910	   Finally, the ETE discards the outer headers of the (reassembled)
911	   packet and processes the inner packet according to the header type
912	   indicated in the SEAL Next Header field.  If the next hop toward the
913	   inner destination address is via a different interface than the SEAL
914	   packet arrived on, the ETE discards the SEAL and outer headers and
915	   delivers the inner packet either to the local host or to the next hop
916	   if the packet is not destined to the local host.

918	   If the next hop is on the same tunnel the SEAL packet arrived on,
919	   however, the ETE submits the packet for SEAL re-encapsulation
920	   beginning with the specification in Section 5.4.3 above and without
921	   decrementing the value in the inner (TTL / Hop Limit) field.

923	6.  Link Requirements

925	   Subnetwork designers are expected to follow the recommendations in
926	   Section 2 of [RFC3819] when configuring link MTUs.

928	7.  End System Requirements

930	   End systems are encouraged to implement end-to-end MTU assurance
931	   (e.g., using Packetization Layer Path MTU Discovery (PLPMTUD) per
932	   [RFC4821]) even if the subnetwork is using SEAL.

934	   When end systems use PLPMTUD, SEAL will ensure that the tunnel
935	   behaves as a link in the path that assures an MTU of at least 1500
936	   bytes while still allowing end systems to discover larger MTUs.  The
937	   PLPMTUD mechanism will therefore be able to function as designed in
938	   order to discover and utilize larger MTUs.

940	8.  Router Requirements

942	   Routers within the subnetwork are expected to observe the standard IP
943	   router requirements, including the implementation of IP fragmentation
944	   and reassembly as well as the generation of ICMP messages
945	   [RFC0792][RFC1122][RFC1812][RFC2460][RFC4443][RFC6434].

947	   Note that, even when routers support existing requirements for the
948	   generation of ICMP messages, these messages are often filtered and
949	   discarded by middleboxes on the path to the original source of the
950	   message that triggered the ICMP.  It is therefore not possible to
951	   assume delivery of ICMP messages even when routers are correctly
952	   implemented.

954	9.  Multicast/Anycast Considerations

956	   On multicast-capable tunnels, encapsulated packets sent by an ITE may
957	   be received by potentially many ETEs.  In that case, the ITE can
958	   still send unicast probe messages to receive probe replies from a
959	   specific ETE, or it can send multicast probe messages to receive
960	   replies from all ETEs in the multicast group that receive the probe.
961	   If the ITE were to send a multicast MINMTU probe message as described
962	   in Section 5.4.6, however, it would be unable to discern whether all
963	   ETEs received the probe unless it had some way of tracking the full
964	   constituency of the multicast group.  For multicast ETE addresses,
965	   the ITE would therefore ordinarily set MAXMTU=MINMTU and DOFRAG=TRUE.
966	   However, the setting of these values may be situation-dependent and based
967	   on whether the ITE can tolerate packet loss to ETEs that may be
968	   reached by subnetwork paths having small MTUs.

970	   For ETEs that configure an anycast address, if the ITE sends a MINMTU
971	   probe message it may receive a probe reply from a first ETE but then
972	   be re-routed to a second ETE.  It is therefore necessary for the ITE
973	   to continue to send periodic probes (subject to rate limiting) as
974	   described in Section 5.4.6 so that any path oscillations between ETEs
975	   that configure the same anycast address will not result in a
976	   sustained path MTU black hole.

978	10.  Compatibility Considerations

980	   Since SEAL is based on the standard IPv6 fragment header, the ITE can
981	   implement the scheme independently of any ETE implementations.
982	   Therefore, if the ITE uses SEAL but the ETE does not the ITE can
983	   still send a MINMTU probe as specified in Section 5.4.6 but may
984	   receive an ordinary (i.e., non SEAL-encapsulated) probe reply.  If
985	   so, it SHOULD reset DOFRAG to FALSE the same as if the ETE returned a
986	   SEAL-encapsulated probe reply.

988	   In some cases, a non-SEAL ETE may not be able to reassemble
989	   fragmented SEAL packets up to (MINMTU+HLEN) bytes, since [RFC2460]
990	   only requires IPv6 nodes to reassemble packets up to 1500 bytes in
991	   length.  To test for this condition, the ITE can create a MINMTU
992	   probe message, fragment the message into two pieces, then send both
993	   fragments to the ETE.  If the ETE returns a probe reply, the ITE has
994	   assurance that the ETE is capable of reassembly.  Otherwise, the ITE
995	   SHOULD reset MAXMTU for this subnetwork path to (MINMTU-HLEN) or even
996	   smaller if the ETE still cannot accept packets of this size.

998	11.  Nested Encapsulation Considerations

1000	   SEAL supports nested tunneling - an example would be a recursive
1001	   nesting of mobile networks, where the first network receives service
1002	   from an ISP, the second network receives service from the first
1003	   network, the third network receives service from the second network,
1004	   etc.  Since it is imperative that such nesting not extend
1005	   indefinitely, tunnels that use SEAL SHOULD honor the Encapsulation
1006	   Limit option defined in [RFC2473].

1008	12.  Reliability Considerations

1010	   Although a tunnel may span an arbitrarily-large subnetwork expanse,
1011	   the IP layer sees the tunnel as a simple link that supports the IP
1012	   service model.  Links with high bit error rates (BERs) (e.g., IEEE
1013	   802.11) use Automatic Repeat-ReQuest (ARQ) mechanisms [RFC3366] to
1014	   increase packet delivery ratios, while links with much lower BERs
1015	   typically omit such mechanisms.  Since tunnels may traverse
1016	   arbitrarily-long paths over links of various types that are already
1017	   either performing or omitting ARQ as appropriate, it would therefore
1018	   be inefficient to require the tunnel endpoints to also perform ARQ.

1020	13.  Integrity Considerations

1022	   Fragmentation and reassembly schemes must consider packet-splicing
1023	   errors, e.g., when two fragments from the same packet are
1024	   concatenated incorrectly, when a fragment from packet X is
1025	   reassembled with fragments from packet Y, etc.  The primary sources
1026	   of such errors include implementation bugs and wrapping ID fields.

1028	   In particular, the IPv4 16-bit ID field can wrap with only 64K
1029	   packets with the same (src, dst, protocol)-tuple alive in the system
1030	   at a given time [RFC4963].  When the IPv4 ID field is re-written by a
1031	   middlebox such as a NAT or Firewall, ID field wrapping can occur with
1032	   even fewer packets alive in the system.

1034	   Fortunately, SEAL includes a 32-bit ID field the same as for IPv6
1035	   fragmentation and also only employs SEAL fragmentation for packets up
1036	   to 1500 bytes in length.  SEAL also only allows IPv4 network
1037	   fragmentation for packets up to 1280 bytes in length, but this size
1038	   is small enough to fit within the MTU of modern high-speed IPv4 links
1039	   without fragmentation.  IPv4 links with smaller MTUs certainly exist,
1040	   but typically support data rates that are slow enough to preclude
1041	   high data rate reassembly misassociation errors; hence, a small
1042	   amount of IPv4 fragmentation is deemed acceptable.
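
	   The difference between 16-bit and 32-bit ID fields can be made
	   concrete with a rough calculation.  The sketch below assumes a
	   single (src, dst, protocol) flow sending fixed-size packets at full
	   line rate; the function name and the example figures are
	   illustrative only.

```python
def wrap_seconds(id_bits, link_bps, packet_bytes):
    """Seconds until an ID field of id_bits wraps at line rate."""
    packets_per_second = link_bps / (packet_bytes * 8)
    return (1 << id_bits) / packets_per_second
```

	   Under these assumptions, at 1 Gbps with 1500-byte packets a 16-bit
	   ID wraps in under one second, whereas a 32-bit ID takes roughly 14
	   hours to wrap.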

1044	14.  IANA Considerations

1046	   The IANA is requested to allocate a User Port number for "SEAL" in
1047	   the 'port-numbers' registry.  The Service Name is "SEAL", and the
1048	   Transport Protocols are TCP and UDP.  The Assignee is the IESG
1049	   (iesg@ietf.org) and the Contact is the IETF Chair (chair@ietf.org).
1050	   The Description is "Subnetwork Encapsulation and Adaptation Layer
1051	   (SEAL)", and the Reference is the RFC-to-be currently known as
1052	   'draft-templin-intarea-seal'.

1054	15.  Security Considerations

1056	   Neighbor relationships between the ITE and ETE should be secured in
1057	   environments where authentication and/or confidentiality are a matter
1058	   of concern.  Securing mechanisms such as Secure Neighbor Discovery
1059	   (SeND) [RFC3971] and IPsec [RFC4301] can be used for this purpose;
1060	   however, the tunnel neighbor relationship is managed by the tunnel
1061	   protocols that ride over SEAL (as an encapsulation sublayer) rather
1062	   than by SEAL itself.

1064	   Security issues that apply to tunneling in general are discussed in
1065	   [RFC6169].

1067	16.  Related Work

1069	   Section 3.1.7 of [RFC2764] provides a high-level sketch for
1070	   supporting large tunnel MTUs via a tunnel-layer fragmentation and
1071	   reassembly capability to avoid IP layer fragmentation.

1073	   Section 3 of [RFC4459] describes inner and outer fragmentation at the
1074	   tunnel endpoints as alternatives for accommodating the tunnel MTU.

1076	   Section 4 of [RFC2460] specifies a method for inserting and
1077	   processing extension headers between the base IPv6 header and
1078	   transport layer protocol data.  The SEAL header is inserted and
1079	   processed in exactly the same manner.

1081	   The concepts of path MTU determination through the report of
1082	   fragmentation and extending the IPv4 Identification field were first
1083	   proposed in deliberations of the TCP-IP mailing list and the Path MTU
1084	   Discovery Working Group (MTUDWG) during the late 1980s and early
1085	   1990s.  A historical analysis of the evolution of these concepts,
1086	   as well as the development of the eventual PMTUD mechanism, appears
1087	   in [RFC5320].

1089	17.  Implementation Status

1091	   An early implementation of the first revision of SEAL [RFC5320] is
1092	   available at: http://isatap.com/seal.

1094	   An implementation of the current version of SEAL is available at:
1095	   http://linkupnetworks.com/seal/sealv2-1.0.tgz.

1097	18.  Acknowledgments

1099	   The following individuals are acknowledged for helpful comments and
1100	   suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Olivier
1101	   Bonaventure, Teco Boot, Bob Braden, Brian Carpenter, Steve Casner,
1102	   Ian Chakeres, Noel Chiappa, Remi Denis-Courmont, Remi Despres, Ralph
1103	   Droms, Arnaud Ebalard, Gorry Fairhurst, Washam Fan, Dino Farinacci,
1104	   Joel Halpern, Brian Haberman, Sam Hartman, John Heffner, Thomas
1105	   Henderson, Bob Hinden, Christian Huitema, Eliot Lear, Darrel Lewis,
1106	   Joe Macker, Matt Mathis, Erik Nordmark, Dan Romascanu, Dave Thaler,
1107	   Joe Touch, Mark Townsley, Ole Troan, Margaret Wasserman, Magnus
1108	   Westerlund, Robin Whittle, James Woodyatt, and members of the Boeing
1109	   Research & Technology NST DC&NT group.

1111	   Discussions with colleagues following the publication of [RFC5320]
1112	   have provided useful insights that have resulted in significant
1113	   improvements to this, the Second Edition of SEAL.  In particular,
1114	   this work has been encouraged and supported by Boeing colleagues
1115	   including Balaguruna Chidambaram, Jeff Holland, Cam Brodie, Yueli
1116	   Yang, Wen Fang, Ed King, Mike Slane, Kent Shuey, Gen MacLean, and
1117	   other members of the BR&T and BIT mobile networking teams.

1119	   This document received substantial review input from the IESG and
1120	   IETF area directorates in the February 2013 timeframe.  IESG members
1121	   and IETF area directorate representatives who contributed helpful
1122	   comments and suggestions are gratefully acknowledged.  Discussions on
1123	   the IETF IPv6 and Intarea mailing lists in the summer 2013 timeframe
1124	   also stimulated several useful ideas.

1126	   Path MTU determination through the report of fragmentation was first
1127	   proposed by Charles Lynn on the TCP-IP mailing list in 1987.
1128	   Extending the IP identification field was first proposed by Steve
1129	   Deering on the MTUDWG mailing list in 1989.  Steve Deering also
1130	   proposed the IPv6 minimum MTU of 1280 bytes on the IPng mailing list
1131	   in 1997.

1133	19.  References

1135	19.1.  Normative References

1137	   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
1138	              September 1981.

1140	   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
1141	              RFC 792, September 1981.

1143	   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
1144	              Communication Layers", STD 3, RFC 1122, October 1989.

1146	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1147	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1149	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1150	              (IPv6) Specification", RFC 2460, December 1998.

1152	   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
1153	              Message Protocol (ICMPv6) for the Internet Protocol
1154	              Version 6 (IPv6) Specification", RFC 4443, March 2006.

1156	19.2.  Informative References

1158	   [FOLK]     Shannon, C., Moore, D., and k. claffy, "Beyond Folklore:
1159	              Observations on Fragmented Traffic", December 2002.

1161	   [FRAG]     Kent, C. and J. Mogul, "Fragmentation Considered Harmful",
1162	              October 1987.

1164	   [I-D.taylor-v6ops-fragdrop]
1165	              Jaeggli, J., Colitti, L., Kumari, W., Vyncke, E., Kaeo,
1166	              M., and T. Taylor, "Why Operators Filter Fragments and
1167	              What It Implies", draft-taylor-v6ops-fragdrop-01 (work in
1168	              progress), June 2013.

1170	   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
1171	              August 1980.

1173	   [RFC0994]  International Organization for Standardization (ISO) and
1174	              American National Standards Institute (ANSI), "Final text
1175	              of DIS 8473, Protocol for Providing the Connectionless-
1176	              mode Network Service", RFC 994, March 1986.

1178	   [RFC1070]  Hagens, R., Hall, N., and M. Rose, "Use of the Internet as
1179	              a subnetwork for experimentation with the OSI network
1180	              layer", RFC 1070, February 1989.

1182	   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
1183	              November 1990.

1185	   [RFC1701]  Hanks, S., Li, T., Farinacci, D., and P. Traina, "Generic
1186	              Routing Encapsulation (GRE)", RFC 1701, October 1994.

1188	   [RFC1812]  Baker, F., "Requirements for IP Version 4 Routers",
1189	              RFC 1812, June 1995.

1191	   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
1192	              for IP version 6", RFC 1981, August 1996.

1194	   [RFC2003]  Perkins, C., "IP Encapsulation within IP", RFC 2003,
1195	              October 1996.

1197	   [RFC2473]  Conta, A. and S. Deering, "Generic Packet Tunneling in
1198	              IPv6 Specification", RFC 2473, December 1998.

1200	   [RFC2675]  Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms",
1201	              RFC 2675, August 1999.

1203	   [RFC2764]  Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A.
1204	              Malis, "A Framework for IP Based Virtual Private
1205	              Networks", RFC 2764, February 2000.

1207	   [RFC2827]  Ferguson, P. and D. Senie, "Network Ingress Filtering:
1208	              Defeating Denial of Service Attacks which employ IP Source
1209	              Address Spoofing", BCP 38, RFC 2827, May 2000.

1211	   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
1212	              RFC 2923, September 2000.

1214	   [RFC3232]  Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by
1215	              an On-line Database", RFC 3232, January 2002.

1217	   [RFC3366]  Fairhurst, G. and L. Wood, "Advice to link designers on
1218	              link Automatic Repeat reQuest (ARQ)", BCP 62, RFC 3366,
1219	              August 2002.

1221	   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
1222	              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
1223	              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
1224	              RFC 3819, July 2004.

1226	   [RFC3971]  Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure
1227	              Neighbor Discovery (SEND)", RFC 3971, March 2005.

1229	   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
1230	              for IPv6 Hosts and Routers", RFC 4213, October 2005.

1232	   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
1233	              Internet Protocol", RFC 4301, December 2005.

1235	   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
1236	              Network Tunneling", RFC 4459, April 2006.

1238	   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
1239	              Discovery", RFC 4821, March 2007.

1241	   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
1242	              Errors at High Data Rates", RFC 4963, July 2007.

1244	   [RFC5320]  Templin, F., "The Subnetwork Encapsulation and Adaptation
1245	              Layer (SEAL)", RFC 5320, February 2010.

1247	   [RFC5722]  Krishnan, S., "Handling of Overlapping IPv6 Fragments",
1248	              RFC 5722, December 2009.

1250	   [RFC5927]  Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010.

1252	   [RFC6169]  Krishnan, S., Thaler, D., and J. Hoagland, "Security
1253	              Concerns with IP Tunneling", RFC 6169, April 2011.

1255	   [RFC6434]  Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node
1256	              Requirements", RFC 6434, December 2011.

1258	   [RFC6438]  Carpenter, B. and S. Amante, "Using the IPv6 Flow Label
1259	              for Equal Cost Multipath Routing and Link Aggregation in
1260	              Tunnels", RFC 6438, November 2011.

1262	   [RFC6864]  Touch, J., "Updated Specification of the IPv4 ID Field",
1263	              RFC 6864, February 2013.

1265	   [RFC6935]  Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and
1266	              UDP Checksums for Tunneled Packets", RFC 6935, April 2013.

1268	   [RFC6936]  Fairhurst, G. and M. Westerlund, "Applicability Statement
1269	              for the Use of IPv6 UDP Datagrams with Zero Checksums",
1270	              RFC 6936, April 2013.

1272	   [RFC6946]  Gont, F., "Processing of IPv6 "Atomic" Fragments",
1273	              RFC 6946, May 2013.

1275	   [RIPE]     De Boer, M. and J. Bosma, "Discovering Path MTU Black
1276	              Holes on the Internet using RIPE Atlas", July 2012.

1278	   [SIGCOMM]  Luckie, M. and B. Stasiewicz, "Measuring Path MTU
1279	              Discovery Behavior", November 2010.

1281	   [TBIT]     Medina, A., Allman, M., and S. Floyd, "Measuring
1282	              Interactions Between Transport Protocols and Middleboxes",
1283	              October 2004.

1285	   [WAND]     Luckie, M., Cho, K., and B. Owens, "Inferring and
1286	              Debugging Path MTU Discovery Failures", October 2005.

1288	Author's Address

1290	   Fred L. Templin (editor)
1291	   Boeing Research & Technology
1292	   P.O. Box 3707
1293	   Seattle, WA  98124
1294	   USA

1296	   Email: fltemplin@acm.org