idnits 2.17.1

draft-templin-intarea-seal-24.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 29, 2010) is 4890 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC3971' is defined on line 1680, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4987' is defined on line 1792, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-intarea-ipv4-id-update-01

  == Outdated reference: A later version (-40) exists of
     draft-templin-intarea-vet-16

  == Outdated reference: A later version (-17) exists of draft-templin-iron-13

  -- Obsolete informational reference (is this intentional?): RFC 1063
     (Obsoleted by RFC 1191)

  -- Obsolete informational reference (is this intentional?): RFC 1981
     (Obsoleted by RFC 8201)


     Summary: 1 error (**), 0 flaws (~~), 6 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                    F. Templin, Ed.
3	Internet-Draft                              Boeing Research & Technology
4	Intended status: Standards Track                       November 29, 2010
5	Expires: June 2, 2011

7	        The Subnetwork Encapsulation and Adaptation Layer (SEAL)
8	                   draft-templin-intarea-seal-24.txt

10	Abstract

12	   For the purpose of this document, a subnetwork is defined as a
13	   virtual topology configured over a connected IP network routing
14	   region and bounded by encapsulating border nodes.  These virtual
15	   topologies are manifested by tunnels that may span multiple IP and/or
16	   sub-IP layer forwarding hops, and can introduce failure modes due to
17	   packet duplication and/or links with diverse Maximum Transmission
18	   Units (MTUs).  This document specifies a Subnetwork Encapsulation and
19	   Adaptation Layer (SEAL) that accommodates such virtual topologies
20	   over diverse underlying link technologies.

22	Status of this Memo

24	   This Internet-Draft is submitted in full conformance with the
25	   provisions of BCP 78 and BCP 79.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF).  Note that other groups may also distribute
29	   working documents as Internet-Drafts.  The list of current Internet-
30	   Drafts is at http://datatracker.ietf.org/drafts/current/.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   This Internet-Draft will expire on June 2, 2011.

39	Copyright Notice

41	   Copyright (c) 2010 IETF Trust and the persons identified as the
42	   document authors.  All rights reserved.

44	   This document is subject to BCP 78 and the IETF Trust's Legal
45	   Provisions Relating to IETF Documents
46	   (http://trustee.ietf.org/license-info) in effect on the date of
47	   publication of this document.  Please review these documents
48	   carefully, as they describe your rights and restrictions with respect
49	   to this document.  Code Components extracted from this document must
50	   include Simplified BSD License text as described in Section 4.e of
51	   the Trust Legal Provisions and are provided without warranty as
52	   described in the Simplified BSD License.

54	Table of Contents

56	   1.  Introduction
57	     1.1.  Motivation
58	     1.2.  Approach
59	   2.  Terminology and Requirements
60	   3.  Applicability Statement
61	   4.  SEAL Protocol Specification
62	     4.1.  VET Interface Model
63	     4.2.  SEAL Model of Operation
64	     4.3.  SEAL Header Format
65	     4.4.  ITE Specification
66	       4.4.1.  Tunnel Interface MTU
67	       4.4.2.  Tunnel Interface Soft State
68	       4.4.3.  Admitting Packets into the Tunnel
69	       4.4.4.  Mid-Layer Encapsulation
70	       4.4.5.  SEAL Segmentation
71	       4.4.6.  SEAL Encapsulation
72	       4.4.7.  Outer Encapsulation
73	       4.4.8.  Sending SEAL Protocol Packets
74	       4.4.9.  Probing Strategy
75	       4.4.10. Processing ICMP Messages
76	       4.4.11. Black Hole Detection
77	     4.5.  ETE Specification
78	       4.5.1.  Reassembly Buffer Requirements
79	       4.5.2.  Tunnel Interface Soft State
80	       4.5.3.  IP-Layer Reassembly
81	       4.5.4.  SEAL-Layer Reassembly
82	       4.5.5.  Decapsulation and Delivery to Upper Layers
83	     4.6.  The SEAL Control Message Protocol (SCMP)
84	       4.6.1.  Generating SCMP Messages
85	       4.6.2.  Processing SCMP Messages
86	     4.7.  Tunnel Endpoint Synchronization
87	   5.  Link Requirements
88	   6.  End System Requirements
89	   7.  Router Requirements
90	   8.  IANA Considerations
91	   9.  Security Considerations
92	   10. Related Work
93	   11. SEAL Advantages over Classical Methods
94	   12. Acknowledgments
95	   13. References
96	     13.1. Normative References
97	     13.2. Informative References
98	   Appendix A.  Reliability
99	   Appendix B.  Integrity
100	   Appendix C.  Transport Mode
101	   Appendix D.  Historic Evolution of PMTUD
102	   Author's Address

104	1.  Introduction

106	   As Internet technology and communication have grown and matured, many
107	   techniques have developed that use virtual topologies (including
108	   tunnels of one form or another) over an actual network that supports
109	   the Internet Protocol (IP) [RFC0791][RFC2460].  Those virtual
110	   topologies have elements that appear as one hop in the virtual
111	   topology, but are actually multiple IP or sub-IP layer hops.  These
112	   multiple hops often have quite diverse properties that may not even
113	   be visible to the endpoints of the virtual hop.  This introduces
114	   failure modes that are not dealt with well in current approaches.

116	   The use of IP encapsulation (also known as "tunneling") has long been
117	   considered as the means for creating such virtual topologies.
118	   However, the insertion of an outer IP header reduces the effective
119	   path MTU visible to the inner network layer.  When IPv4 is used, this
120	   reduced MTU can be accommodated through the use of IPv4
121	   fragmentation, but unmitigated in-the-network fragmentation has been
122	   found to be harmful through operational experience and studies
123	   conducted over the course of many years [FRAG][FOLK][RFC4963].
124	   Additionally, classical path MTU discovery [RFC1191] has known
125	   operational issues that are exacerbated by in-the-network tunnels
126	   [RFC2923][RFC4459].  The following subsections present further
127	   details on the motivation and approach for addressing these issues.

129	1.1.  Motivation

131	   Before discussing the approach, it is necessary to first understand
132	   the problems.  In both the Internet and private-use networks today,
133	   IPv4 is ubiquitously deployed as the Layer 3 protocol.  The two
134	   primary functions of IPv4 are to provide for 1) addressing, and 2) a
135	   fragmentation and reassembly capability used to accommodate links
136	   with diverse MTUs.  While it is well known that the IPv4 address
137	   space is rapidly becoming depleted, there is a lesser-known but
138	   growing consensus that other IPv4 protocol limitations have already
139	   become, or may soon become, problematic.

141	   First, the IPv4 header Identification field is only 16 bits in
142	   length, meaning that at most 2^16 unique packets with the same
143	   (source, destination, protocol)-tuple may be active in the Internet
144	   at a given time [I-D.ietf-intarea-ipv4-id-update].  Due to the
145	   escalating deployment of high-speed links (e.g., 1Gbps Ethernet),
146	   however, this number may soon become too small by several orders of
147	   magnitude for high data rate packet sources such as tunnel endpoints
148	   [RFC4963].  Furthermore, there are many well-known limitations
149	   pertaining to IPv4 fragmentation and reassembly - even to the point
150	   that it has been deemed "harmful" in both classic and modern-day
151	   studies (see above).  In particular, IPv4 fragmentation raises issues
152	   ranging from minor annoyances (e.g., in-the-network router
153	   fragmentation [RFC1981]) to the potential for major integrity issues
154	   (e.g., mis-association of the fragments of multiple IP packets during
155	   reassembly [RFC4963]).
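
   As a rough illustration of the scale involved, the following sketch
   estimates how quickly an Identification space is exhausted by a
   single (source, destination, protocol)-tuple sending back-to-back
   packets.  The function name and the 1500-byte packet size are
   assumptions chosen for illustration; they are not taken from this
   document.

```python
def id_wrap_seconds(link_bps: float, packet_bytes: int, id_bits: int = 16) -> float:
    """Seconds until an id_bits-wide Identification counter wraps when
    packets of packet_bytes are sent back-to-back at link_bps."""
    if link_bps <= 0 or packet_bytes <= 0:
        raise ValueError("link_bps and packet_bytes must be positive")
    packets_per_second = link_bps / (packet_bytes * 8)
    return (2 ** id_bits) / packets_per_second


if __name__ == "__main__":
    # Assumed workload: a 1 Gbps link saturated with 1500-byte packets.
    print(f"16-bit ID wraps after {id_wrap_seconds(1e9, 1500):.3f} s")
    print(f"32-bit ID wraps after {id_wrap_seconds(1e9, 1500, 32) / 3600:.1f} h")
```

   Under these assumptions a 16-bit space wraps in well under one
   second, whereas a 32-bit space (the width of the SEAL_ID field shown
   in Section 4.3) lasts on the order of hours.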

157	   As a result of these perceived limitations, a fragmentation-avoiding
158	   technique for discovering the MTU of the forward path from a source
159	   to a destination node was devised through the deliberations of the
160	   Path MTU Discovery Working Group (PMTUDWG) during the late 1980s
161	   through early 1990s (see Appendix D).  In this method, the source
162	   node provides explicit instructions to routers in the path to discard
163	   the packet and return an ICMP error message if an MTU restriction is
164	   encountered.  However, this approach has several serious shortcomings
165	   that lead to an overall "brittleness" [RFC2923].

167	   In particular, site border routers in the Internet are being
168	   configured more and more to discard ICMP error messages coming from
169	   the outside world.  This is due in large part to the fact that
170	   malicious spoofing of error messages in the Internet is trivial since
171	   there is no way to authenticate the source of the messages [RFC5927].
172	   Furthermore, a path MTU-related black hole occurs when a source node
173	   that relies on ICMP error messages to learn of MTU-related packet
174	   drops never receives them.  This
175	   means that the source will continue to send packets that are too
176	   large and never receive an indication from the network that they are
177	   being discarded.  This behavior has been confirmed through documented
178	   studies showing clear evidence of path MTU discovery failures in the
179	   Internet today [TBIT][WAND][SIGCOMM].

181	   The issues with both IPv4 fragmentation and this "classical" method
182	   of path MTU discovery are exacerbated further when IP tunneling is
183	   used [RFC4459].  For example, ingress tunnel endpoints (ITEs) may be
184	   required to forward encapsulated packets into the subnetwork on
185	   behalf of hundreds, thousands, or even more original sources in the
186	   end site.  If the ITE allows IPv4 fragmentation on the encapsulated
187	   packets, persistent fragmentation could lead to undetected data
188	   corruption due to Identification field wrapping.  If the ITE instead
189	   uses classical IPv4 path MTU discovery, it may be inconvenienced by
190	   excessive ICMP error messages coming from the subnetwork that may be
191	   either suspect or contain insufficient information for translation
192	   into error messages to be returned to the original sources.

194	   Although recent works have led to the development of a robust end-to-
195	   end MTU determination scheme [RFC4821], this approach requires
196	   tunnels to present a consistent MTU the same as for ordinary links on
197	   the end-to-end path.  Moreover, in current practice existing
198	   tunneling protocols mask the MTU issues by selecting a "lowest common
199	   denominator" MTU that may be much smaller than necessary for most
200	   paths and difficult to change at a later date.  Due to these many
201	   considerations, a new approach to accommodate tunnels over links with
202	   diverse MTUs is necessary.

204	1.2.  Approach

206	   For the purpose of this document, a subnetwork is defined as a
207	   virtual topology configured over a connected network routing region
208	   and bounded by encapsulating border nodes.  Example connected network
209	   routing regions include Mobile Ad hoc Networks (MANETs), enterprise
210	   networks and the global public Internet itself.  Subnetwork border
211	   nodes forward unicast and multicast packets over the virtual topology
212	   across multiple IP and/or sub-IP layer forwarding hops that may
213	   introduce packet duplication and/or traverse links with diverse
214	   Maximum Transmission Units (MTUs).

216	   This document introduces a Subnetwork Encapsulation and Adaptation
217	   Layer (SEAL) for tunneling network layer protocols (e.g., IP, OSI,
218	   etc.) over IP subnetworks that connect Ingress and Egress Tunnel
219	   Endpoints (ITEs/ETEs) of border nodes.  It provides a modular
220	   specification designed to be tailored to specific associated
221	   tunneling protocols.  A transport-mode of operation is also possible,
222	   and described in Appendix C.  SEAL accommodates links with diverse
223	   MTUs, protects against off-path denial-of-service attacks, and can be
224	   configured to enable efficient duplicate packet detection through the
225	   use of a minimal mid-layer encapsulation.

227	   SEAL specifically treats tunnels that traverse the subnetwork as
228	   ordinary links that must support network layer services.  As for any
229	   link, tunnels that use SEAL must provide suitable networking services
230	   including best-effort datagram delivery, integrity and consistent
231	   handling of packets of various sizes.  As for any link whose media
232	   cannot provide suitable services natively, tunnels that use SEAL
233	   employ link-level adaptation functions to meet the legitimate
234	   expectations of the network layer service.  As this is essentially a
235	   link level adaptation, SEAL is therefore permitted to alter packets
236	   within the subnetwork as long as it restores them to their original
237	   form when they exit the subnetwork.  The mechanisms described within
238	   this document are designed precisely for this purpose.

240	   SEAL encapsulation introduces an extended Identification field for
241	   per-packet and/or per-ETE identification as well as a mid-layer
242	   segmentation and reassembly capability that allows simplified cutting
243	   and pasting of packets.  Moreover, SEAL engages both tunnel endpoints
244	   in ensuring a functional path MTU on the path from the ITE to the
245	   ETE.  This is in contrast to "stateless" approaches which seek to
246	   avoid MTU issues by selecting a lowest common denominator MTU value
247	   that may be overly conservative for the vast majority of tunnel paths
248	   and difficult to change even when larger MTUs become available.

250	   The following sections provide the SEAL normative specifications,
251	   while the appendices present non-normative additional considerations.

253	2.  Terminology and Requirements

255	   The following terms are defined within the scope of this document:

257	   subnetwork
258	      a virtual topology configured over a connected network routing
259	      region and bounded by encapsulating border nodes.

261	   Ingress Tunnel Endpoint
262	      a virtual interface over which an encapsulating border node (host
263	      or router) sends encapsulated packets into the subnetwork.

265	   Egress Tunnel Endpoint
266	      a virtual interface over which an encapsulating border node (host
267	      or router) receives encapsulated packets from the subnetwork.

269	   inner packet
270	      an unencapsulated network layer protocol packet (e.g., IPv6
271	      [RFC2460], IPv4 [RFC0791], OSI/CLNP [RFC1070], etc.) before any
272	      mid-layer or outer encapsulations are added.  Internet protocol
273	      numbers that identify inner packets are found in the IANA Internet
274	      Protocol registry [RFC3232].

276	   mid-layer packet
277	      a packet resulting from adding mid-layer encapsulating headers to
278	      an inner packet.

280	   outer IP packet
281	      a packet resulting from adding an outer IP header (and possibly
282	      other outer headers) to a mid-layer packet.

284	   packet-in-error
285	      the leading portion of an invoking data packet encapsulated in the
286	      body of an error control message (e.g., an ICMPv4 [RFC0792] error
287	      message, an ICMPv6 [RFC4443] error message, etc.).

289	   Packet Too Big (PTB)
290	      a control plane message indicating an MTU restriction, e.g., an
291	      ICMPv6 "Packet Too Big" message [RFC4443], an ICMPv4
292	      "Fragmentation Needed" message [RFC0792], an SCMP "Packet Too Big"
293	      message (see: Section 4.6), etc.

295	   IP, IPvX, IPvY
296	      used to generically refer to either IP protocol version, i.e.,
297	      IPv4 or IPv6.

299	   The following abbreviations correspond to terms used within this
300	   document and elsewhere in common Internetworking nomenclature:

302	      DF - the IPv4 header "Don't Fragment" flag [RFC0791]

304	      ETE - Egress Tunnel Endpoint

306	      HLEN - the sum of MHLEN and OHLEN

308	      ITE - Ingress Tunnel Endpoint

310	      LINK_ID - a short integer that identifies an ITE's underlying link

312	      MHLEN - the length of any mid-layer headers and trailers

314	      MRU - Maximum Reassembly Unit

316	      MTU - Maximum Transmission Unit

318	      NONCE - a short integer nonce value that identifies an ETE

320	      OHLEN - the length of any outer encapsulating headers and trailers

322	      S_IFT - SEAL Inner Fragmentation Threshold

324	      S_MRU - SEAL Maximum Reassembly Unit

326	      S_MSS - SEAL Maximum Segment Size

328	      SCMP - the SEAL Control Message Protocol

330	      SEAL - Subnetwork Encapsulation and Adaptation Layer

332	      SEAL_ID - a SEAL packet and/or ETE identification value

334	      SEAL_PORT - a TCP/UDP service port number used for SEAL

336	      SEAL_PROTO - an IPv4 protocol number used for SEAL

338	      TE - Tunnel Endpoint (i.e., either ingress or egress)

340	      VET - Virtual Enterprise Traversal

342	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
343	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
344	   document are to be interpreted as described in [RFC2119].  When used
345	   in lower case (e.g., must, must not, etc.), these words MUST NOT be
346	   interpreted as described in [RFC2119], but are rather interpreted as
347	   they would be in common English.

349	3.  Applicability Statement

351	   SEAL was originally motivated by the specific case of subnetwork
352	   abstraction for Mobile Ad hoc Networks (MANETs); however, it soon
353	   became apparent that the domain of applicability also extends to
354	   subnetwork abstractions over enterprise networks, ISP networks, SOHO
355	   networks, the global public Internet itself, and any other connected
356	   network routing region.  SEAL along with the Virtual Enterprise
357	   Traversal (VET) [I-D.templin-intarea-vet] tunnel virtual interface
358	   abstraction are the functional building blocks for a new
359	   Internetworking architecture based on Routing and Addressing in
360	   Networks with Global Enterprise Recursion (RANGER)
361	   [RFC5720][I-D.russert-rangers] and the Internet Routing Overlay
362	   Network (IRON) [I-D.templin-iron].

364	   SEAL provides a network sublayer for encapsulation of an inner
365	   network layer packet within outer encapsulating headers.  For
366	   example, for IPvX in IPvY encapsulation (e.g., as IPv4/SEAL/IPv6),
367	   the SEAL header appears as a subnetwork encapsulation as seen by the
368	   inner IP layer.  SEAL can also be used as a sublayer within a UDP
369	   data payload (e.g., as IPv4/UDP/SEAL/IPv6 similar to Teredo
370	   [RFC4380]), where UDP encapsulation is typically used for Network
371	   Address Translator (NAT) traversal as well as operation over
372	   subnetworks that give preferential treatment to the "core" Internet
373	   protocols (i.e., TCP and UDP).  The SEAL header is processed the same
374	   as for IPv6 extension headers, i.e., it is not part of the outer IP
375	   header but rather allows for the creation of an arbitrarily
376	   extensible chain of headers in the same way that IPv6 does.

378	   SEAL supports a segmentation and reassembly capability for adapting
379	   the network layer to the underlying subnetwork characteristics, where
380	   the Egress Tunnel Endpoint (ETE) determines how much or how little
381	   reassembly it is willing to support.  In the limiting case, the ETE
382	   can avoid reassembly altogether and act as a passive observer that
383	   simply informs the Ingress Tunnel Endpoint (ITE) of any MTU
384	   limitations and otherwise discards all packets that arrive as
385	   multiple fragments.  This mode is useful for determining an
386	   appropriate MTU for tunnels between performance-critical routers
387	   connected to high data rate subnetworks such as the Internet DFZ, for
388	   unidirectional tunnels in which the ETE is stateless, and for other
389	   uses in which reassembly would present too great a burden for the
390	   routers or end systems.

392	   When the ETE supports reassembly, the tunnel can be used to transport
393	   packets that are too large to traverse the path without
394	   fragmentation.  In this mode, the ITE determines the tunnel MTU based
395	   on the largest packet the ETE is capable of reassembling rather than
396	   on the MTU of the smallest link in the path.  Therefore, tunnel
397	   endpoints that use SEAL can transport packets that are much larger
398	   than the underlying subnetwork links themselves can carry in a single
399	   piece.
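
   The difference between the two modes described above can be sketched
   as follows.  This is a simplified illustration only: the function
   and parameter names are hypothetical, encapsulation overhead is
   deliberately ignored, and the sketch assumes that a reassembling ETE
   can reassemble at least as much as the smallest link can carry.  The
   normative rules are given in Section 4.4.1.

```python
from typing import Iterable, Optional


def effective_tunnel_mtu(link_mtus: Iterable[int],
                         ete_reassembly_size: Optional[int] = None) -> int:
    """Return the largest packet the tunnel can carry (headers ignored).

    ete_reassembly_size is None when the ETE performs no reassembly.
    """
    smallest_link = min(link_mtus)
    if ete_reassembly_size is None:
        # Passive ETE: every packet must cross the smallest link in one piece.
        return smallest_link
    # Reassembling ETE: the limit is what the ETE is able to reassemble.
    return ete_reassembly_size


if __name__ == "__main__":
    path = [9000, 1500, 1280]  # assumed link MTUs along the path
    print(effective_tunnel_mtu(path))        # ETE does not reassemble
    print(effective_tunnel_mtu(path, 2048))  # ETE reassembles up to 2048
```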

401	   SEAL tunnels may be configured over paths that include not only
402	   ordinary physical links, but also virtual links that may include
403	   other tunnels.  An example application would be linking two
404	   geographically remote supercomputer centers with large MTU links by
405	   configuring a SEAL tunnel across the Internet.  A second example
406	   would be support for sub-IP segmentation over low-end links,
407	   especially over wireless transmission media such as IEEE 802.15.4,
408	   broadcast radio links in Mobile Ad-hoc Networks (MANETs), Very High
409	   Frequency (VHF) civil aviation data links, etc.

411	   Many other use case examples are anticipated, and will be identified
412	   as further experience is gained.

414	4.  SEAL Protocol Specification

416	   The following sections specify the operation of the SEAL protocol.

418	4.1.  VET Interface Model

420	   SEAL is an encapsulation sublayer used within VET non-broadcast,
421	   multiple access (NBMA) virtual interfaces.  Each VET interface
422	   connects an ITE to one or more ETE "neighbors" via tunneling across
423	   an underlying enterprise network, or "subnetwork".  The tunnel
424	   neighbor relationship between the ITE and each ETE may be either
425	   unidirectional or bidirectional.

427	   A unidirectional tunnel neighbor relationship requires no prior
428	   coordination between the ITE and ETE; it allows the ITE to send both
429	   data and control messages forward to the ETE, but only allows the ETE
430	   to send back control messages.  A bidirectional tunnel neighbor
431	   relationship requires prior coordination between the TEs (see:
432	   Section 4.7), and is one over which both TEs can exchange both data
433	   and control messages.

435	   Implications of the VET unidirectional and bidirectional models for
436	   SEAL will be discussed in the following sections.

438	4.2.  SEAL Model of Operation

440	   SEAL supports a multi-level segmentation and reassembly capability
441	   for the transmission of unicast and multicast packets across an
442	   underlying IP subnetwork with heterogeneous links.  First, the ITE
443	   can use IPv4 fragmentation to fragment inner IPv4 packets before SEAL
444	   encapsulation if necessary.  Second, the SEAL layer itself provides
445	   a simple cutting-and-pasting capability for mid-layer packets that
446	   can be used to avoid IP fragmentation on the outer packet.  Finally,
447	   ordinary IP fragmentation is permitted on the outer packet after SEAL
448	   encapsulation and is used to detect and tune out any in-the-network
449	   fragmentation.

451	   SEAL-enabled ITEs encapsulate each inner packet in any mid-layer
452	   headers and trailers, segment the resulting mid-layer packet into
453	   multiple segments if necessary, then append a SEAL header and any
454	   outer encapsulations to each segment.  As an example, for IPv6 within
455	   IPv4 encapsulation a single-segment inner IPv6 packet encapsulated in
456	   any mid-layer headers and trailers, followed by the SEAL header,
457	   followed by any outer headers and trailers, followed by an outer IPv4
458	   header would appear as shown in Figure 1:

460	                                       +--------------------+
461	                                       ~  outer IPv4 header ~
462	                                       +--------------------+
463	   I                                   ~  other outer hdrs  ~
464	   n                                   +--------------------+
465	   n                                   ~    SEAL Header     ~
466	   e      +--------------------+       +--------------------+
467	   r      ~  mid-layer headers ~       ~  mid-layer headers ~
468	          +--------------------+       +--------------------+
469	   I -->  |                    |  -->  |                    |
470	   P -->  ~     inner IPv6     ~  -->  ~     inner IPv6     ~
471	   v -->  ~       Packet       ~  -->  ~       Packet       ~
472	   6 -->  |                    |  -->  |                    |
473	          +--------------------+       +--------------------+
474	   P      ~ mid-layer trailers ~       ~ mid-layer trailers ~
475	   a      +--------------------+       +--------------------+
476	   c                                   ~   outer trailers   ~
477	   k         Mid-layer packet          +--------------------+
478	   e      after mid-layer encaps.
479	   t                                      Outer IPv4 packet
480	                                     after SEAL and outer encaps.

482	               Figure 1: SEAL Encapsulation - Single Segment

484	   As a second example, for IPv4 within IPv6 encapsulation an inner IPv4
485	   packet requiring three SEAL segments would appear as three separate
486	   outer IPv6 packets, where the mid-layer headers are carried only in
487	   segment 0 and the mid-layer trailers are carried in segment 2 as
488	   shown in Figure 2:
489	   +------------------+                          +------------------+
490	   ~  outer IPv6 hdr  ~                          ~  outer IPv6 hdr  ~
491	   +------------------+   +------------------+   +------------------+
492	   ~ other outer hdrs ~   ~  outer IPv6 hdr  ~   ~ other outer hdrs ~
493	   +------------------+   +------------------+   +------------------+
494	   ~ SEAL hdr (SEG=0) ~   ~ other outer hdrs ~   ~ SEAL hdr (SEG=2) ~
495	   +------------------+   +------------------+   +------------------+
496	   ~  mid-layer hdrs  ~   ~ SEAL hdr (SEG=1) ~   |    inner IPv4    |
497	   +------------------+   +------------------+   ~      Packet      ~
498	   |    inner IPv4    |   |    inner IPv4    |   |    (Segment 2)   |
499	   ~      Packet      ~   ~      Packet      ~   +------------------+
500	   |    (Segment 0)   |   |    (Segment 1)   |   ~ mid-layer trails ~
501	   +------------------+   +------------------+   +------------------+
502	   ~  outer trailers  ~   ~  outer trailers  ~   ~  outer trailers  ~
503	   +------------------+   +------------------+   +------------------+

505	   Segment 0 (includes    Segment 1 (no mid-     Segment 2 (includes
506	     mid-layer hdrs)        layer encaps)         mid-layer trails)

508	             Figure 2: SEAL Encapsulation - Multiple Segments
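
   The cutting and pasting of a mid-layer packet depicted in Figure 2
   can be sketched as follows.  This is a minimal illustration and not
   the normative procedure (see Section 4.4.5): the function names and
   the max_segment_size parameter are hypothetical, and each segment is
   modeled as a (segment number, data) pair rather than as a packet
   with real SEAL and outer headers.

```python
from typing import Iterable, List, Tuple

Segment = Tuple[int, bytes]


def seal_segment(mid_layer_packet: bytes, max_segment_size: int) -> List[Segment]:
    """Cut a mid-layer packet into consecutive (segment_number, data) pieces."""
    if max_segment_size <= 0:
        raise ValueError("max_segment_size must be positive")
    if not mid_layer_packet:
        return [(0, b"")]
    offsets = range(0, len(mid_layer_packet), max_segment_size)
    return [(seg, mid_layer_packet[off:off + max_segment_size])
            for seg, off in enumerate(offsets)]


def seal_reassemble(segments: Iterable[Segment]) -> bytes:
    """Paste segments back together in segment-number order."""
    return b"".join(data for _, data in sorted(segments))


if __name__ == "__main__":
    # Placeholder bytes stand in for mid-layer headers (MH) and trailers (MT).
    packet = b"MH" + b"inner-ipv4-packet-payload" + b"MT"
    segments = seal_segment(packet, 12)
    for seg, data in segments:
        print(seg, data)
    # Reassembly succeeds even if segments arrive out of order.
    assert seal_reassemble(reversed(segments)) == packet
```

   Because the mid-layer packet is cut as one contiguous unit, the
   mid-layer headers naturally fall in segment 0 and the mid-layer
   trailers in the final segment, matching Figure 2.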

510	   The ITE inserts the SEAL header according to the specific tunneling
511	   protocol.  Examples include the following:

513	   o  For simple encapsulation of an inner network layer packet within
514	      an outer IPvX header (e.g., [RFC1070][RFC2003][RFC2473][RFC4213],
515	      etc.), the ITE inserts the SEAL header between the inner packet
516	      and outer IPvX headers as: IPvX/SEAL/{inner packet}.

518	   o  For encapsulations over transports such as UDP (e.g., [RFC4380]),
519	      the ITE inserts the SEAL header between the outer transport layer
520	      header and the mid-layer packet, e.g., as IPvX/UDP/SEAL/{mid-layer
521	      packet}.  Here, the UDP header is seen as an "other outer header".

523	   The SEAL header includes a SEAL_ID that the ITE maintains as either a
524	   monotonically-incrementing per-packet identifier or as a constant
525	   per-ETE identifier.  When the ITE maintains the SEAL_ID as a packet
526	   identifier, routers within the subnetwork can use it for duplicate
527	   packet detection and both TEs can use it for SEAL segmentation/
528	   reassembly.  The SEAL header also includes a LINK_ID field that
529	   identifies the ITE's underlying link, and a NONCE field that provides
530	   a per-ETE identifier extension.

532	   The following sections specify the SEAL header format and SEAL-
533	   related operations of the ITE and ETE.

535	4.3.  SEAL Header Format

537	   The SEAL header is formatted as follows:

539	       0                   1                   2                   3
540	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
541	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
542	      |VER|C|A|I|R|F|M|  NEXTHDR/SEG  |    LINK_ID    |     NONCE     |
543	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
544	      |                            SEAL_ID                            |
545	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

547	                       Figure 3: SEAL Header Format

549	   where the header fields are defined as:

551	   VER (2)
552	      a 2-bit version field.  This document specifies Version 0 of the
553	      SEAL protocol, i.e., the VER field encodes the value 0.

555	   C (1)
556	      the "Control/Data" bit.  Set to 1 by the ITE in SEAL Control
557	      Message Protocol (SCMP) control messages, and set to 0 in ordinary
558	      data packets.

560	   A (1)
561	      the "Acknowledgement Requested" bit.  Set to 1 by the ITE in data
562	      packets for which it wishes to receive an explicit acknowledgement
563	      from the ETE.

565	   I (1)
566	      the "Identifier" bit.  Set to 1 if the SEAL_ID contains a
567	      monotonically-incrementing packet identifier; set to 0 if the
568	      SEAL_ID contains a constant ETE identifier.

570	   R (1)
571	      the "Redirects Permitted" bit.  Set to 1 if the ITE is willing to
572	      accept SCMP redirects (see: Section 4.6); set to 0 otherwise.

574	   F (1)
575	      the "First Segment" bit.  Set to 1 if this SEAL protocol packet
576	      contains the first segment (i.e., Segment #0) of a mid-layer
577	      packet.

579	   M (1)
580	      the "More Segments" bit.  Set to 1 if this SEAL protocol packet
581	      contains a non-final segment of a multi-segment mid-layer packet.

583	   NEXTHDR/SEG (8)
584	      an 8-bit field.  When 'F'=1, encodes the next header Internet
585	      Protocol number the same as for the IPv4 protocol and IPv6 next
586	      header fields.  When 'F'=0, encodes a segment number of a multi-
587	      segment mid-layer packet.  (The segment number 0 is reserved.)

589	   LINK_ID (8)
590	      an 8-bit link identifier.  An integer value between 1 and 255 used
591	      by the ITE to identify the underlying link selected for tunneling
592	      the current packet.  The ITE may also use the value 0 to indicate
593	      "underlying link unspecified", e.g., when the ETE does not keep
594	      track of tunnel state.

596	   NONCE (8)
597	      an 8-bit nonce field.  Set to a random value by the ITE when the
598	      tunnel to the ETE is established, and used as a per-ETE
599	      identification adjunct to the SEAL_ID.

601	   SEAL_ID (32)
602	      a 32-bit Identification field.  Used as either a per-packet or
603	      per-ETE identifier.

605	   Setting of the various bits and fields of the SEAL header is
606	   specified in the following sections.
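
   As an illustration only, the 8-octet layout of Figure 3 could be
   packed and parsed as in the following Python sketch.  The class name
   SealHeader and its method names are hypothetical and are not defined
   by this document.  The sketch assumes network byte order and that
   VER occupies the two most significant bits of the first octet, as
   drawn in the figure.

```python
import struct
from dataclasses import dataclass


@dataclass
class SealHeader:
    """Hypothetical in-memory view of the SEAL header in Figure 3."""
    ver: int = 0          # 2-bit version (0 for this document)
    c: int = 0            # Control/Data
    a: int = 0            # Acknowledgement Requested
    i: int = 0            # Identifier
    r: int = 0            # Redirects Permitted
    f: int = 0            # First Segment
    m: int = 0            # More Segments
    nexthdr_seg: int = 0  # NEXTHDR when F=1, SEG when F=0 (0..255)
    link_id: int = 0      # 0..255
    nonce: int = 0        # 0..255
    seal_id: int = 0      # 32-bit identifier

    def pack(self) -> bytes:
        # First octet: VER in the two high bits, then C, A, I, R, F, M.
        flags = ((self.ver & 0x3) << 6 | (self.c & 1) << 5 |
                 (self.a & 1) << 4 | (self.i & 1) << 3 |
                 (self.r & 1) << 2 | (self.f & 1) << 1 | (self.m & 1))
        return struct.pack("!BBBBI", flags, self.nexthdr_seg,
                           self.link_id, self.nonce,
                           self.seal_id & 0xFFFFFFFF)

    @classmethod
    def unpack(cls, data: bytes) -> "SealHeader":
        flags, nexthdr_seg, link_id, nonce, seal_id = struct.unpack(
            "!BBBBI", data[:8])
        return cls(ver=flags >> 6, c=(flags >> 5) & 1, a=(flags >> 4) & 1,
                   i=(flags >> 3) & 1, r=(flags >> 2) & 1,
                   f=(flags >> 1) & 1, m=flags & 1,
                   nexthdr_seg=nexthdr_seg, link_id=link_id,
                   nonce=nonce, seal_id=seal_id)
```

   For example, SealHeader(a=1, i=1, f=1, nexthdr_seg=4).pack() yields
   an 8-octet string whose first octet has the A, I and F bits set.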

608	4.4.  ITE Specification

610	4.4.1.  Tunnel Interface MTU

612	   The tunnel interface must present a fixed MTU to the inner network
613	   layer as the size for admission of inner packets into the interface.
614	   Since VET NBMA tunnel virtual interfaces may support a large set of
615	   ETEs that accept widely varying maximum packet sizes, however, a
616	   number of factors should be taken into consideration when selecting a
617	   tunnel interface MTU.

619	   Due to the ubiquitous deployment of standard Ethernet and similar
620	   networking gear, the nominal Internet cell size has become 1500
621	   bytes; end systems have come to expect that packets of this size
622	   will either be delivered by the network without loss due to an MTU
623	   restriction on the path or trigger a suitable ICMP Packet Too Big
624	   (PTB) message in return.  When 1500-byte packets sent by end systems
625	   incur additional encapsulation at an ITE, however, they may be
626	   dropped silently since the network may not always deliver the
627	   necessary PTBs [RFC2923].

629	   The ITE should therefore set a tunnel interface MTU of at least 1500
630	   bytes plus extra room to accommodate any additional encapsulations
631	   that may occur on the path from the original source.  The ITE can
632	   also set smaller MTU values; however, care must be taken not to set
633	   so small a value that original sources would experience an MTU
634	   underflow.  In particular, IPv6 sources must see a minimum path MTU
635	   of 1280 bytes, and IPv4 sources should see a minimum path MTU of 576
636	   bytes.

638	   The ITE can alternatively set an indefinite MTU on the tunnel
639	   interface such that all inner packets are admitted into the interface
640	   without regard to size.  For ITEs that host applications that use the
641	   tunnel interface directly, this option must be carefully coordinated
642	   with protocol stack upper layers since some upper layer protocols
643	   (e.g., TCP) derive their packet sizing parameters from the MTU of the
644	   outgoing interface and as such may select too large an initial size.
645	   This is not a problem for upper layers that use conservative initial
646	   maximum segment size estimates and/or when the tunnel interface can
647	   reduce the upper layer's maximum segment size, e.g., by reducing the
648	   size advertised in the MSS option of outgoing TCP messages.

650	   The inner network layer protocol consults the tunnel interface MTU
651	   when admitting a packet into the interface.  For non-SEAL inner IPv4
652	   packets with the IPv4 Don't Fragment (DF) bit set to 0, if the packet
653	   is larger than the tunnel interface MTU the inner IPv4 layer uses
654	   IPv4 fragmentation to break the packet into fragments no larger than
655	   the tunnel interface MTU.  The ITE then admits each fragment into the
656	   interface as an independent packet.

658	   For all other inner packets, the inner network layer admits the
659	   packet if it is no larger than the tunnel interface MTU; otherwise,
660	   it drops the packet and sends a PTB error message to the source with
661	   the MTU value set to the tunnel interface MTU.  The message must
662	   contain as much of the invoking packet as possible without the entire
663	   message exceeding the network layer minimum MTU (e.g., 576 bytes for
664	   IPv4, 1280 bytes for IPv6, etc.).  For SEAL packets, however, the
665	   inner layer must send a SEAL PTB message instead of a PTB of the
666	   inner network layer (see: Section 4.4.3).

668	   For this reason, when the tunnel interface sets a finite MTU the
669	   inner network layer must be made aware of the SEAL protocol; this may
670	   not be practical for some implementations.  When the tunnel interface
671	   sets an indefinite MTU, however, the inner network layer
672	   unconditionally admits all packets into the interface without
673	   fragmentation.  Once the packet has been admitted into the interface,
674	   it transitions from the inner network layer and becomes subject to
675	   SEAL layer processing.

677	   In light of the above considerations, it is RECOMMENDED that the ITE
678	   configure an indefinite MTU on the tunnel interface such that the
679	   inner network layer unconditionally admits all inner packets into the
680	   interface and any necessary tunnel adaptations are performed by the
681	   SEAL layer within the tunnel interface as described in the following
682	   sections.

684	4.4.2.  Tunnel Interface Soft State

686	   The ITE maintains per-ETE soft state within the tunnel interface,
687	   e.g., in a neighbor cache.  (The ITE can instead maintain packet
688	   identification and sizing variables per tunnel interface rather than
689	   per ETE if it is willing to use lowest-common-denominator values
690	   that are acceptable for all ETEs.)  The soft state includes the
691	   following:

693	   o  a Mid-layer Header Length (MHLEN); set to the length of any mid-
694	      layer encapsulation headers and trailers that must be added before
695	      SEAL segmentation.

697	   o  an Outer Header Length (OHLEN); set to the length of the outer IP,
698	      SEAL and other outer encapsulation headers and trailers.

700	   o  a total Header Length (HLEN); set to MHLEN plus OHLEN.

702	   o  a SEAL Maximum Segment Size (S_MSS).  The ITE initializes S_MSS to
703	      the minimum MTU of the underlying interfaces if the underlying
704	      interface MTUs can be determined (otherwise, the ITE initializes
705	      S_MSS to "infinity").  The ITE decreases or increases S_MSS based
706	      on any SCMP "Packet Too Big (PTB)" messages received (see Section
707	      4.6).

709	   o  a SEAL Maximum Reassembly Unit (S_MRU).  If the ITE is not
710	      configured to use SEAL segmentation, it initializes S_MRU to the
711	      constant value 0 and ignores any S_MRU values reported by the ETE.
712	      Otherwise, the ITE initializes S_MRU to "infinity" and decreases
713	      or increases S_MRU based on any SCMP PTB messages received from
714	      the ETE (see Section 4.6).  When (S_MRU>(S_MSS*256)), the ITE uses
715	      (S_MSS*256) as the effective S_MRU value.

717	   o  a SEAL Inner Fragmentation Threshold (S_IFT); used to determine a
718	      maximum fragment size for fragmentable IPv4 packets.  Required
719	      only for tunnels that support encapsulation with IPv4 as the inner
720	      network layer protocol.  The ITE should use a "safe" estimate for
721	      S_IFT that would be highly unlikely to trigger additional
722	      fragmentation on the path to the ETE.  In particular, it is
723	      RECOMMENDED that the ITE set S_IFT to 512 unless it can determine
724	      a more accurate safe value, e.g., via probing.

726	   o  a set of 8-bit LINK_IDs that identify the ITE's underlying links
727	      and are used to fill the SEAL header field of the same name for
728	      packets sent to this ETE.  The ITE selects a separate randomly-
729	      initialized LINK_ID for each underlying link, and the ETE uses the
730	      LINK_ID (in combination with the SEAL_ID and NONCE) to identify
731	      the ITE's underlying link of origin.

733	   o  an 8-bit NONCE that encodes a randomly-initialized constant value
734	      and is used to fill the SEAL header field of the same name for
735	      packets sent to this ETE.

737	   o  a 32-bit SEAL_ID that is either a randomly-initialized constant
738	      ETE identifier or a monotonically-increasing packet identifier and
739	      is used to fill the SEAL header field of the same name for packets
740	      sent to this ETE.

742	   Note that S_MSS and S_MRU include the length of the outer and mid-
743	   layer encapsulating headers and trailers (i.e., HLEN), since the ETE
744	   must retain the headers and trailers during reassembly.  Note also
745	   that the ITE maintains S_MSS and S_MRU as 32-bit values such that
746	   inner packets larger than 64KB (e.g., IPv6 jumbograms [RFC2675]) can
747	   be accommodated when appropriate for a given subnetwork.
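
   The sizing variables above might be held in a structure along the
   lines of the following Python sketch.  The class name
   PerEteSoftState and its attribute names are hypothetical, and
   "infinity" is represented here by math.inf as an assumption of the
   sketch.  Only the HLEN sum and the effective S_MRU rule are taken
   from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class PerEteSoftState:
    """Hypothetical per-ETE sizing state kept by the ITE."""
    mhlen: int                 # mid-layer headers and trailers
    ohlen: int                 # outer IP, SEAL and other outer headers
    s_mss: float = math.inf    # SEAL Maximum Segment Size
    s_mru: float = math.inf    # SEAL Maximum Reassembly Unit (0 = none)
    s_ift: int = 512           # RECOMMENDED default fragmentation threshold

    @property
    def hlen(self) -> int:
        # HLEN is MHLEN plus OHLEN.
        return self.mhlen + self.ohlen

    @property
    def effective_s_mru(self) -> float:
        # When S_MRU > (S_MSS * 256), (S_MSS * 256) is used instead.
        return min(self.s_mru, self.s_mss * 256)
```

   An ITE that is not configured for SEAL segmentation would construct
   this state with s_mru=0, in which case the effective value is also 0.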

749	4.4.3.  Admitting Packets into the Tunnel

751	   Once an inner packet/fragment has been admitted into the tunnel
752	   interface, it transitions from the inner network layer and becomes
753	   subject to SEAL layer processing.  The ITE then examines each packet
754	   to determine whether it is too large for SEAL encapsulation, then
755	   prepares the packet for admission into the tunnel according to
756	   whether it is "fragmentable" (discussed in the next paragraph) or
757	   "unfragmentable" (discussed in the following paragraph).

759	   If the packet is a non-SEAL IPv4 packet with DF=0 in the IPv4 header
760	   (*), and the packet is larger than S_IFT, the ITE uses fragmentation
761	   to break the packet into IPv4 fragments no larger than S_IFT bytes,
762	   and then submits each fragment for encapsulation separately.

764	   For all other packets, if the packet is larger than (MAX(S_MRU,
765	   S_MSS) - HLEN), the ITE drops it and sends a PTB message to the
766	   source (**) with an MTU value of (MAX(S_MRU, S_MSS) - HLEN);
767	   otherwise, it submits the packet for encapsulation.  The ITE must
768	   include the length of the uncompressed headers and trailers when
769	   calculating HLEN even if the tunnel is using header compression.  The
770	   ITE is also permitted to admit inner packets into the tunnel that can
771	   be accommodated in a single SEAL segment (i.e., no larger than S_MSS)
772	   even if they are larger than the ETE would be willing to reassemble
773	   if fragmented (i.e., larger than S_MRU) - see: Section 4.5.1.

775	   (*) In order to support nested encapsulations, inner SEAL-protocol
776	   IPv4 packets with DF=0 must be treated as unfragmentable and subject
777	   to drop due to an MTU restriction as for all other packets.

779	   (**) When the ITE needs to drop a packet and send a PTB message, it
780	   sends an SCMP PTB message if the packet itself is a SEAL encapsulated
781	   packet (see: Section 4.6.1.1).  Otherwise, it sends a PTB
782	   corresponding to the inner network layer protocol packet.
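
   The admission rules of this section can be summarized in a short
   Python sketch.  The function name admit_inner_packet, its parameter
   names and the returned action strings are hypothetical; the sketch
   only classifies a packet and leaves fragmentation and PTB generation
   to the caller.

```python
from typing import Optional, Tuple


def admit_inner_packet(length: int, *, ipv4_df0: bool, is_seal: bool,
                       s_ift: int, s_mss: float, s_mru: float,
                       hlen: int) -> Tuple[str, Optional[float]]:
    """Classify an inner packet of `length` bytes.

    Returns (action, value) where action is one of:
      "fragment"    - break into IPv4 fragments of at most `value` bytes
      "scmp_ptb"    - drop; send an SCMP PTB carrying MTU `value`
      "inner_ptb"   - drop; send an inner-protocol PTB carrying MTU `value`
      "encapsulate" - submit for encapsulation (value is None)
    """
    # Fragmentable case: non-SEAL IPv4 with DF=0 and larger than S_IFT.
    if ipv4_df0 and not is_seal and length > s_ift:
        return ("fragment", s_ift)

    # All other packets are checked against MAX(S_MRU, S_MSS) - HLEN.
    limit = max(s_mru, s_mss) - hlen
    if length > limit:
        # A SEAL-encapsulated inner packet gets an SCMP PTB instead.
        return ("scmp_ptb" if is_seal else "inner_ptb", limit)
    return ("encapsulate", None)
```

   Note that an inner SEAL packet with DF=0 falls through to the second
   check, which matches the treatment described in footnote (*).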

784	4.4.4.  Mid-Layer Encapsulation

786	   After inner IP fragmentation (if necessary), the ITE next
787	   encapsulates each inner packet/fragment in the MHLEN bytes of mid-
788	   layer headers and trailers.  The ITE then submits the mid-layer
789	   packet for SEAL segmentation and encapsulation.

791	4.4.5.  SEAL Segmentation

793	   If the ITE is configured to use SEAL segmentation, it checks the
794	   length of the resulting packet after mid-layer encapsulation to
795	   determine whether segmentation is needed.  If the length of the
796	   resulting mid-layer packet plus OHLEN is larger than S_MSS but no
797	   larger than S_MRU, the ITE performs SEAL segmentation by breaking the
798	   mid-layer packet into N segments (N <= 256) that are no larger than
799	   (S_MSS - OHLEN) bytes each.  Each segment, except the final one, MUST
800	   be of equal length.  The first byte of each segment MUST begin
801	   immediately after the final byte of the previous segment, i.e., the
802	   segments MUST NOT overlap.  The ITE SHOULD generate the smallest
803	   number of segments possible, e.g., it SHOULD NOT generate 6 smaller
804	   segments when the packet could be accommodated with 4 larger
805	   segments.
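
   One way to satisfy these rules is sketched below in Python.  The
   function name seal_segment is hypothetical.  The sketch makes every
   segment except the last exactly (S_MSS - OHLEN) bytes long, which is
   one of several choices that yield equal-length, non-overlapping
   segments and the smallest possible segment count.

```python
from typing import List


def seal_segment(mid_layer_packet: bytes, s_mss: int, s_mru: int,
                 ohlen: int) -> List[bytes]:
    """Split a mid-layer packet into SEAL segments (illustrative only)."""
    total = len(mid_layer_packet) + ohlen
    if total <= s_mss:
        # Fits in a single SEAL segment; no segmentation needed.
        return [mid_layer_packet]
    if total > s_mru:
        raise ValueError("packet exceeds S_MRU; should not have been admitted")

    max_seg = s_mss - ohlen
    if max_seg <= 0:
        raise ValueError("S_MSS leaves no room for payload")

    # Consecutive, non-overlapping slices; all but the last are max_seg long.
    segments = [mid_layer_packet[k:k + max_seg]
                for k in range(0, len(mid_layer_packet), max_seg)]
    if len(segments) > 256:
        raise ValueError("more than 256 segments would be required")
    return segments
```

   Concatenating the returned segments in order restores the original
   mid-layer packet.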

807	   This SEAL segmentation process ignores the fact that the mid-layer
808	   packet may be unfragmentable outside of the subnetwork.  The process
809	   is a mid-layer (not an IP layer) operation employed by the ITE to
810	   adapt the mid-layer packet to the subnetwork path characteristics,
811	   and the ETE will restore the packet to its original form during
812	   reassembly.  Therefore, the fact that the packet may have been
813	   segmented within the subnetwork is not observable outside of the
814	   subnetwork.

816	4.4.6.  SEAL Encapsulation

818	   Following SEAL segmentation, the ITE next encapsulates each segment
819	   in a SEAL header formatted as specified in Section 4.3.  For the
820	   first segment, the ITE sets F=1, then sets NEXTHDR to the Internet
821	   Protocol number of the encapsulated inner packet, and finally sets
822	   M=1 if there are more segments or sets M=0 otherwise.  For each non-
823	   initial segment of an N-segment mid-layer packet (N <= 256), the ITE
824	   sets (F=0; M=1; SEG=1) in the SEAL header of the first non-initial
825	   segment, sets (F=0; M=1; SEG=2) in the next non-initial segment,
826	   etc., and sets (F=0; M=0; SEG=N-1) in the final segment.  (Note that
827	   the value SEG=0 is not used, since the initial segment encodes a
828	   NEXTHDR value and not a SEG value.)

830	   For each segment, the ITE then sets C=0, sets R=1 if it is willing to
831	   accept SCMP redirects (see Section 4.6) and sets A=1 if explicit
832	   probing is desired (see Section 4.4.9).  The ITE then sets the
833	   LINK_ID field to an integer between 1 and 255 that identifies the
834	   underlying link over which this packet will be tunneled.  (The ITE
835	   may instead set LINK_ID to 0 if the ETE is not tracking state, e.g.,
836	   if the tunnel neighbor relationship is unidirectional.)  The ITE next
837	   sets the NONCE field to a randomly-initialized constant nonce value
838	   for this ETE.

840	   The ITE finally sets the I flag and SEAL_ID values as follows.  The
841	   ITE maintains a randomly-initialized SEAL_ID value as per-ETE soft
842	   state (e.g., in the neighbor cache).  If the SEAL_ID is to be used as
843	   a packet identifier, the ITE monotonically increments the value for
844	   each successive SEAL protocol packet it sends to the ETE.  If the
845	   SEAL_ID is to be used as an ETE identifier, the ITE instead maintains
846	   SEAL_ID as a constant value.

848	   For each successive SEAL segment, the ITE writes the current SEAL_ID
849	   value into the SEAL header field of the same name.  It then sets I=1
850	   if the SEAL_ID represents a packet identifier and I=0 if the SEAL_ID
851	   represents an ETE identifier.  The ITE must be consistent in its
852	   setting of the I flag.  For example, it must not set I=1 in some
853	   packets and I=0 in others since this may result in unpredictable
854	   behavior.
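
   The per-segment flag and identifier settings described above can be
   sketched as follows.  The function name segment_header_fields and
   the dictionary keys are hypothetical.  The sketch assumes I=1 and
   that the first segment carries the caller-supplied SEAL_ID, with one
   increment (modulo 2^32) per subsequent segment.

```python
from typing import Dict, List


def segment_header_fields(n: int, nexthdr: int,
                          first_seal_id: int) -> List[Dict[str, int]]:
    """Return F, M, NEXTHDR/SEG, I and SEAL_ID for each of n segments."""
    if not 1 <= n <= 256:
        raise ValueError("N must be between 1 and 256")
    headers = []
    for k in range(n):
        headers.append({
            "F": 1 if k == 0 else 0,          # only Segment #0 has F=1
            "M": 1 if k < n - 1 else 0,       # M=0 only in the final segment
            # NEXTHDR in the first segment, SEG=k in non-initial segments.
            "NEXTHDR/SEG": nexthdr if k == 0 else k,
            "I": 1,                           # SEAL_ID is a packet identifier
            "SEAL_ID": (first_seal_id + k) % 2**32,
        })
    return headers
```

   For a single-segment packet (n=1) the only header has F=1 and M=0.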

856	4.4.7.  Outer Encapsulation

858	   Following SEAL encapsulation, the ITE next encapsulates each SEAL
859	   segment in the requisite outer headers and trailers according to the
860	   specific encapsulation format (e.g., [RFC1070], [RFC2003], [RFC2473],
861	   [RFC4213], etc.), except that it writes 'SEAL_PROTO' in the protocol
862	   field of the outer IP header (when simple IP encapsulation is used)
863	   or writes 'SEAL_PORT' in the outer destination service port field
864	   (e.g., when IP/UDP encapsulation is used).

866	   When IPv4 is used as the outer encapsulation layer, the ITE finally
867	   sets the DF flag in the IPv4 header of each segment.  If the path to
868	   the ETE correctly implements IP fragmentation (see: Section 4.4.9),
869	   the ITE sets DF=0; otherwise, it sets DF=1.

871	   When IPv6 is used as the outer encapsulation layer, the "DF" flag is
872	   absent but the packet will not be fragmented within the subnetwork
873	   since IPv6 deprecates in-the-network fragmentation.

875	4.4.8.  Sending SEAL Protocol Packets

877	   Following outer encapsulation, the ITE sends each outer packet that
878	   encapsulates a segment of the same mid-layer packet over the same
879	   underlying link in canonical order, i.e., segment 0 first, followed
880	   by segment 1, etc., and finally segment N-1.

882	4.4.9.  Probing Strategy

884	   When IPv4 is used as the outer encapsulation layer, the ITE should
885	   perform a qualification exchange over each underlying link to
886	   determine whether each subnetwork path to the ETE correctly
887	   implements IPv4 fragmentation.  The qualification exchange can be
888	   performed either as an initial probe or in-band with real data
889	   packets, and should be repeated periodically since the subnetwork
890	   paths may change due to dynamic routing.

892	   To perform this qualification, the ITE prepares a probe packet that
893	   is no larger than 576 bytes (e.g., a NULL packet with A=1 and
894	   NEXTHDR="No Next Header" [RFC2460] in the SEAL header), then splits
895	   the packet into two outer IPv4 fragments and sends both fragments to
896	   the ETE over the same underlying link.  If the ETE returns an SCMP
897	   PTB message with Code=1 (see Section 4.6.1.1), then the subnetwork
898	   path correctly implements IPv4 fragmentation and subsequent data
899	   packets can be sent with DF=0 in the outer header to enable the
900	   preferred method of probing.  If the ETE returns an SCMP PTB message
901	   with Code=2, however, the ITE is obliged to set DF=1 for future
902	   packets sent over that underlying link since a middlebox in the
903	   network is reassembling the IPv4 fragments before they are delivered
904	   to the ETE.
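
   The outcome of the qualification exchange reduces to a small
   decision, sketched here in Python.  The function name
   outer_df_after_probe is hypothetical, and returning None when no
   usable response arrives is an assumption of the sketch; this
   section does not say what the ITE does in that case.

```python
from typing import Optional


def outer_df_after_probe(ptb_code: Optional[int]) -> Optional[int]:
    """Map the SCMP PTB Code returned for a probe to the outer DF setting."""
    if ptb_code == 1:
        # Fragments reached the ETE intact: the path fragments correctly.
        return 0
    if ptb_code == 2:
        # A middlebox reassembled the fragments before the ETE saw them.
        return 1
    # No response or an unrelated code: undetermined (assumption).
    return None
```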

906	   In addition to any control plane probing, all SEAL encapsulated data
907	   packets sent by the ITE are considered implicit probes.  SEAL
908	   encapsulated packets that use IPv4 as the outer layer of
909	   encapsulation with DF=0 will elicit SCMP PTB messages from the ETE if
910	   any IPv4 fragmentation occurs in the path.  SEAL encapsulated packets
911	   that use either IPv6 or IPv4 with DF=1 as the outer layer of
912	   encapsulation may be dropped by a router on the path to the ETE which
913	   will also return an ICMP PTB message to the ITE.  If the message
914	   includes enough information (see Section 4.4.10), the ITE can then
915	   use the (LINK_ID, NONCE, SEAL_ID)-tuple within the packet-in-error to
916	   determine whether the PTB message corresponds to one of its recent
917	   packet transmissions.

919	   The ITE should also send explicit probes, periodically, to verify
920	   that the ETE is still reachable.  The ITE sets A=1 in the SEAL header
921	   of a segment to be used as an explicit probe, where the probe can be
922	   either an ordinary data packet segment or a NULL packet (see above).
923	   The probe will elicit an SCMP PTB message from the ETE as an
924	   acknowledgement (see Section 4.6.1).

926	4.4.10.  Processing ICMP Messages

928	   When the ITE sends outer IP packets, it may receive ICMP error
929	   messages [RFC0792][RFC4443] from either the ETE or routers within the
930	   subnetwork.  The ICMP messages include an outer IP header, followed
931	   by an ICMP header, followed by a portion of the outer IP packet that
932	   generated the error (also known as the "packet-in-error").  The ITE
933	   can use the (LINK_ID, NONCE, SEAL_ID)-tuple encoded in the SEAL
934	   header within the packet-in-error to confirm that the ICMP message
935	   came from either the ETE or an on-path router, and can use any
936	   additional information to determine whether to accept or discard the
937	   message.

939	   The ITE should specifically process raw ICMPv4 Protocol Unreachable
940	   messages and ICMPv6 Parameter Problem messages with Code
941	   "Unrecognized Next Header type encountered" as a hint that the ETE
942	   does not implement the SEAL protocol; specific actions that the ITE
943	   may take in this case are out of scope.

945	4.4.11.  Black Hole Detection

947	   In some subnetwork paths, ICMP error messages may be lost due to
948	   filtering or may not contain enough information due to a router in
949	   the path not observing the recommendations of [RFC1812].  The ITE can
950	   use explicit probing as described in Section 4.4.9 to determine
951	   whether the path to the ETE is silently dropping packets (also known
952	   as a "black hole").  For example, when the ITE is obliged to set DF=1
953	   in the outer headers of data packets it should send explicit probe
954	   packets, periodically, in order to detect path MTU increases or
955	   decreases.

957	4.5.  ETE Specification

959	4.5.1.  Reassembly Buffer Requirements

961	   The ETE SHOULD support the minimum IP-layer reassembly requirements
962	   specified for IPv4 (i.e., 576 bytes [RFC1812]) and IPv6 (i.e., 1500
963	   bytes [RFC2460]).  The ETE SHOULD also support SEAL-layer reassembly
964	   for inner packets of at least 1280 bytes in length and MAY support
965	   reassembly for larger inner packets.  The ETE records the SEAL-layer
966	   reassembly buffer size in a soft-state variable "S_MRU" (see: Section
967	   4.5.2).

969	   The ETE may instead omit the reassembly function altogether and set
970	   S_MRU=0, but this may cause tunnel MTU underruns in some environments
971	   resulting in an unusable link.  When reassembly is supported, the ETE
972	   must retain the outer IP, SEAL and other outer headers and trailers
973	   during both IP-layer and SEAL-layer reassembly for the purpose of
974	   associating the fragments/segments of the same packet, and must also
975	   configure a SEAL-layer reassembly buffer that is no smaller than the
976	   IP-layer reassembly buffer.  Hence, the ETE:

978	   o  SHOULD configure an outer IP-layer reassembly buffer of at least
979	      the minimum specified for the outer IP protocol version.

981	   o  SHOULD configure a SEAL-layer reassembly buffer S_MRU size of at
982	      least (1280 + HLEN) bytes, and

984	   o  MUST be capable of discarding inner packets that require IP-layer
985	      and/or SEAL-layer reassembly and that are larger than (S_MRU -
986	      HLEN).

988	   The ETE is permitted to accept inner packets that did not undergo IP-
989	   layer and/or SEAL-layer reassembly even if they are larger than
990	   (S_MRU - HLEN) bytes.  Hence, S_MRU is a maximum *reassembly* size,
991	   and may be less than the largest packet size the ETE is able to
992	   receive when no reassembly is required.

994	4.5.2.  Tunnel Interface Soft State

996	   The ETE maintains a single per-interface S_MRU value to be applied
997	   for all unidirectional tunnel neighbors, and can also maintain per-
998	   ITE S_MRU values for any bidirectional tunnel neighbors (see: Section
999	   4.7).  For each bidirectional ITE neighbor, the ETE also maintains
1000	   per-ITE soft state to track the NONCE, SEAL_ID and LINK_ID values
1001	   used by the ITE.

1003	   For each bidirectional tunnel neighbor, the ETE also tracks the outer
1004	   IP source addresses (and also port numbers when outer UDP
1005	   encapsulation is used) of packets received from the ITE and
1006	   associates the most recent values received with the corresponding
1007	   LINK_ID.  In this way, the LINK_ID provides a stable handle for the
1008	   tunnel near end to use for return traffic to the tunnel far end even
1009	   if the outer IP source address and port numbers in packets received
1010	   from the tunnel far end change.

1012	4.5.3.  IP-Layer Reassembly

1014	   The ETE submits unfragmented SEAL protocol IP packets for SEAL-layer
1015	   reassembly as specified in Section 4.5.4.  The ETE instead performs
1016	   standard IP-layer reassembly for multi-fragment SEAL protocol IP
1017	   packets as follows.

1019	   The ETE should maintain conservative IP-layer reassembly cache high-
1020	   and low-water marks.  When the size of the reassembly cache exceeds
1021	   this high-water mark, the ETE should actively discard incomplete
1022	   reassemblies (e.g., using an Active Queue Management (AQM) strategy)
1023	   until the size falls below the low-water mark.  The ETE should also
1024	   actively discard any pending reassemblies that clearly have no
1025	   opportunity for completion, e.g., when a considerable number of new
1026	   fragments have been received before a fragment that completes a
1027	   pending reassembly has arrived.  Following successful IP-layer
1028	   reassembly, the ETE submits the reassembled packet for SEAL-layer
1029	   reassembly as specified in Section 4.5.4.

1031	   When the ETE processes the IP first fragment (i.e., one with MF=1 and
1032	   Offset=0 in the IP header) of a fragmented SEAL packet, it sends an
1033	   SCMP PTB message back to the ITE (see Section 4.6.1.1).  When the ETE
1034	   processes an IP fragment that would cause the reassembled outer
1035	   packet to be larger than the IP-layer reassembly buffer following
1036	   reassembly, it discontinues the reassembly and discards any further
1037	   fragments of the same packet.

1039	4.5.4.  SEAL-Layer Reassembly

1041	   Following IP reassembly (if necessary), the ETE examines each mid-
1042	   layer data packet (i.e., each packet with C=0 in the SEAL header)
1043	   to determine whether an SCMP error message is required.  If the mid-
1044	   layer data packet has an incorrect value in the SEAL header, the ETE
1045	   discards the packet and returns an SCMP "Parameter Problem" message
1046	   (see Section 4.6.1).  Next, if the SEAL header has A=1 and the packet
1047	   did not arrive as multiple outer IP fragments, the ETE sends an SCMP
1048	   PTB message with Code=2 back to the ITE (see Section 4.6.1.1).  The
1049	   ETE next submits single-segment mid-layer packets for decapsulation
1050	   and delivery to upper layers (see Section 4.5.5).  The ETE instead
1051	   performs SEAL-layer reassembly for multi-segment mid-layer packets
1052	   with I=1 in the SEAL header as follows.

1054	   The ETE adds each segment of a multi-segment mid-layer packet with
1055	   I=1 in the SEAL header to a SEAL-layer pending-reassembly queue
1056	   according to the (LINK_ID, NONCE, SEAL_ID)-tuple found in the SEAL
1057	   header.  The ETE performs SEAL-layer reassembly through simple in-
1058	   order concatenation of the encapsulated segments of the same mid-
1059	   layer packet from N consecutive SEAL segments.  SEAL-layer reassembly
1060	   requires the ETE to maintain a cache of recently received segments
1061	   for a hold time that would allow for nominal inter-segment delays.
1062	   When a SEAL reassembly times out, the ETE discards the incomplete
1063	   reassembly and returns an SCMP "Time Exceeded" message to the ITE
1064	   (see Section 4.6.1).  As for IP-layer reassembly, the ETE should also
1065	   maintain a conservative reassembly cache high- and low-water mark and
1066	   should actively discard any pending reassemblies that clearly have no
1067	   opportunity for completion, e.g., when a considerable number of new
1068	   SEAL packets have been received before a packet that completes a
1069	   pending reassembly has arrived.

1071	   If the ETE receives a SEAL packet for which a segment with the same
1072	   (LINK_ID, NONCE, SEAL_ID)-tuple is already in the queue, it must
1073	   determine whether to accept the new segment and release the old, or
1074	   drop the new segment.  If accepting the new segment would cause an
1075	   inconsistency with other segments already in the queue (e.g.,
1076	   differing segment lengths), the ETE drops the segment that is least
1077	   likely to complete the reassembly.  When the ETE has already received
1078	   the SEAL first segment (i.e., one with F=1 and M=1 in the SEAL
1079	   header) of a SEAL protocol packet that arrived as multiple SEAL
1080	   segments, and accepting the current segment would cause the size of
1081	   the reassembled packet to exceed S_MRU, the ETE schedules the
1082	   reassembly resources for garbage collection and sends an SCMP PTB
1083	   message with Code=3 back to the ITE (see Section 4.6.1.1).

1085	   After all segments are gathered, the ETE reassembles the packet by
1086	   concatenating the segments encapsulated in the N consecutive SEAL
1087	   packets beginning with the initial segment (Segment #0) and followed
1088	   by any non-initial segments 1 through N-1.  That is, for an N-segment
1089	   mid-layer packet, reassembly entails the concatenation of the SEAL-
1090	   encapsulated packet segments with (F=1, M=1, SEAL_ID=j) in the first
1091	   SEAL header, followed by (F=0, M=1, SEG=1, SEAL_ID=(j+1)) in the next
1092	   SEAL header, followed by (F=0, M=1, SEG=2, SEAL_ID=(j+2)), etc., up
1093	   to (F=0, M=0, SEG=(N-1), SEAL_ID=(j + N-1)) in the final SEAL header,
1094	   where modulo arithmetic based on the length of the SEAL_ID field is
1095	   used.  Following successful SEAL-layer reassembly, the ETE submits
1096	   the reassembled mid-layer packet for decapsulation and delivery to
1097	   upper layers as specified in Section 4.5.5.
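
   The concatenation rule can be sketched in Python as follows.  The
   function name seal_reassemble and the tuple layout are hypothetical.
   The sketch assumes that all segments passed in already share the
   same LINK_ID and NONCE, that I=1, and that segment k carries
   SEAL_ID=(j + k) modulo 2^32 as described above.  It returns None
   when the set of segments is incomplete or inconsistent.

```python
from typing import Dict, Iterable, Optional, Tuple

# (F, M, NEXTHDR/SEG, SEAL_ID, payload)
Segment = Tuple[int, int, int, int, bytes]


def seal_reassemble(segments: Iterable[Segment]) -> Optional[bytes]:
    """Concatenate SEAL segments in order, or return None if not possible."""
    segs = list(segments)
    firsts = [s for s in segs if s[0] == 1]
    if len(firsts) != 1:
        return None                      # need exactly one F=1 segment
    j = firsts[0][3]                     # SEAL_ID of Segment #0

    by_index: Dict[int, Tuple[int, bytes]] = {}
    for f, m, seg, seal_id, payload in segs:
        k = (seal_id - j) % 2**32        # position implied by SEAL_ID
        if f == 0 and seg != k:
            return None                  # SEG disagrees with SEAL_ID offset
        if k in by_index:
            return None                  # duplicate segment
        by_index[k] = (m, payload)

    n = len(by_index)
    for k in range(n):
        if k not in by_index:
            return None                  # a segment is missing
        expected_m = 1 if k < n - 1 else 0
        if by_index[k][0] != expected_m:
            return None                  # M flag inconsistent with position
    return b"".join(by_index[k][1] for k in range(n))
```

   The segments may be supplied in any arrival order; ordering is
   recovered from the SEAL_ID offsets.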

1099	   The ETE must not perform SEAL-layer reassembly for multi-segment mid-
1100	   layer packets with I=0 in the SEAL header.  The ETE instead drops all
1101	   segments with I=0 and either F=0 or (F=1; M=1) in the SEAL header and
1102	   sends an SCMP Parameter Problem message back to the ITE.

1104	4.5.5.  Decapsulation and Delivery to Upper Layers

1106	   Following any necessary IP- and SEAL-layer reassembly, the ETE
1107	   discards the outer headers and trailers and performs any mid-layer
1108	   transformations on the mid-layer packet.  The ETE next discards the
1109	   mid-layer headers and trailers, and delivers the inner packet to the
1110	   upper-layer protocol indicated either in the SEAL NEXTHDR field or
1111	   the next header field of the mid-layer packet (i.e., if the packet
1112	   included mid-layer encapsulations).  The ETE instead silently
1113	   discards the inner packet if it was a NULL packet (see Section
1114	   4.4.9).

1116	4.6.  The SEAL Control Message Protocol (SCMP)

1118	   SEAL uses a companion SEAL Control Message Protocol (SCMP) based on
1119	   the same message format as the Internet Control Message Protocol for
1120	   IPv6 (ICMPv6) [RFC4443].  Each SCMP message is embedded within an
1121	   SCMP packet which begins with the same outer header format as would
1122	   be used for outer encapsulation of a SEAL data packet (see: Section
1123	   4.4.7).  The following sections specify the generation and processing
1124	   of SCMP messages:

1126	4.6.1.  Generating SCMP Messages

1128	   SCMP messages may be generated by either ITEs or ETEs (i.e., by any
1129	   TE) using the same message Type and Code values specified for
1130	   ordinary ICMPv6 messages in [RFC4443].  SCMP is also used to carry
1131	   other ICMPv6 message types and their associated options as specified
1132	   in other documents (e.g., [RFC4191][RFC4861], etc.).  The general
1133	   format for SCMP messages is shown in Figure 4:

1135	       0                   1                   2                   3
1136	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1137	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1138	      |     Type      |     Code      |          Checksum             |
1139	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1140	      |                                                               |
1141	      ~                         Message Body                          ~
1142	      |                                                               |
1143	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1144	      |                  As much of invoking SEAL data                |
1145	      ~                packet as possible without the SCMP            ~
1146	      |                  packet exceeding 576 bytes (*)               |
1147	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1149	      (*) also known as the "packet-in-error"

1151	                       Figure 4: SCMP Message Format

1153	   TEs generate solicitation messages (e.g., an SCMP echo request, an
1154	   SCMP router/neighbor solicitation, a SEAL data packet with A=1, etc.)
1155	   for the purpose of triggering an SCMP response.  TEs generate
1156	   solicited SCMP messages (e.g., an SCMP echo reply, an SCMP router/
1157	   neighbor advertisement, an SCMP PTB message, etc.) in response to
1158	   explicit solicitations, and also generate SCMP error messages in
1159	   response to errored SEAL data packets.  As for ICMP, TEs must not
1160	   generate SCMP error messages in response to other SCMP messages.

1162	   As for ordinary ICMPv6 messages, the SCMP message begins with a 4
1163	   byte header that includes 8-bit Type and Code fields followed by a
1164	   16-bit Checksum field followed by a variable-length Message Body.
1165	   The TE sets the Type and Code fields to the same values that would
1166	   appear in the corresponding ICMPv6 message and also formats the
1167	   Message Body the same as for the corresponding ICMPv6 message.

1169	   The Message Body is followed by the leading portion of the invoking
1170	   SEAL data packet (i.e., the "packet-in-error") IFF the packet-in-
1171	   error would also be included in the corresponding ICMPv6 message.  If
1172	   the SCMP message will include a packet-in-error, the TE includes as
1173	   much of the leading portion of the invoking SEAL data packet as
1174	   possible beginning with the outer IP header and extending to a length
1175	   that would not cause the entire SCMP packet following outer
1176	   encapsulation to exceed 576 bytes (see: Figure 5).
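As a non-normative illustration of the 576-byte limit, the following Python sketch computes how much of the invoking packet fits. The function name and the `outer_overhead` parameter (the combined length of the outer IP header, other outer headers, SEAL header and outer trailers) are assumptions of this sketch and are not defined by this document.

```python
SCMP_HEADER_LEN = 4      # Type (1) + Code (1) + Checksum (2)
MAX_SCMP_PACKET = 576    # limit on the SCMP packet after outer encapsulation


def truncate_packet_in_error(invoking_packet: bytes,
                             message_body_len: int,
                             outer_overhead: int) -> bytes:
    """Return the leading portion of the invoking SEAL data packet that fits.

    outer_overhead is a hypothetical caller-supplied value covering every
    byte added by outer encapsulation of the SCMP message.
    """
    budget = (MAX_SCMP_PACKET - outer_overhead
              - SCMP_HEADER_LEN - message_body_len)
    return invoking_packet[:max(budget, 0)]
```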

1178	   The TE then calculates the SCMP message Checksum the same as
1179	   specified for ICMPv6 messages except that it does not prepend a
1180	   pseudo-header of the outer IP header since the (LINK_ID, NONCE,
1181	   SEAL_ID)-tuple already gives sufficient assurance against mis-
1182	   delivery.  (The Checksum calculation procedure is therefore identical
1183	   to that used for ICMPv4 [RFC0792].)  The TE then encapsulates the
1184	   SCMP message in the outer headers as shown in Figure 5:

1186	                                       +--------------------+
1187	                                       ~  outer IPv4 header ~
1188	                                       +--------------------+
1189	                                       ~  other outer hdrs  ~
1190	                                       +--------------------+
1191	                                       ~    SEAL Header     ~
1192	          +--------------------+       +--------------------+
1193	          ~ SCMP message header~  -->  ~ SCMP message header~
1194	          +--------------------+  -->  +--------------------+
1195	          ~  SCMP message body ~  -->  ~  SCMP message body ~
1196	          +--------------------+  -->  +--------------------+
1197	          ~   packet-in-error  ~  -->  ~  packet-in-error   ~
1198	          +--------------------+       +--------------------+
1199	                                       ~   outer trailers   ~
1200	               SCMP Message            +--------------------+
1201	           before encapsulation
1202	                                             SCMP Packet
1203	                                         after encapsulation

1205	                   Figure 5: SCMP Message Encapsulation
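The Checksum calculation described before Figure 5 (the ICMPv4-style 16-bit one's complement sum with no pseudo-header) can be sketched in Python as follows. This is a non-normative illustration; the function names `internet_checksum` and `build_scmp_message` are assumptions of the sketch.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                    # pad to a whole number of words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                     # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_scmp_message(msg_type: int, code: int, body: bytes,
                       packet_in_error: bytes = b"") -> bytes:
    """Build Type, Code, Checksum, Message Body and packet-in-error.

    The checksum covers only the SCMP message itself; no pseudo-header
    of the outer IP header is prepended.
    """
    rest = body + packet_in_error
    zeroed = struct.pack("!BBH", msg_type, code, 0) + rest
    checksum = internet_checksum(zeroed)
    return struct.pack("!BBH", msg_type, code, checksum) + rest
```

A receiver can verify a message by running `internet_checksum` over the whole message as received; a result of 0 indicates a correct Checksum field.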

1207	   When a TE generates an SCMP message in response to an SCMP
1208	   solicitation or an ordinary SEAL data packet (i.e., a "solicitation
1209	   packet"), it sets the outer IP destination and source addresses of
1210	   the SCMP packet to the solicitation's source and destination
1211	   addresses (respectively).  (If the destination address in the
1212	   solicitation was multicast, the TE instead sets the outer IP source
1213	   address of the SCMP packet to an address assigned to the underlying
1214	   IP interface.)  The TE then sets the (LINK_ID, NONCE, SEAL_ID)-tuple
1215	   and I flag in the SEAL header of the SCMP packet to the same values
1216	   that appeared in the solicitation.

1218	   When a TE generates an unsolicited SCMP message, it sets the outer IP
1219	   destination and source addresses of the SCMP packet the same as it
1220	   would for ordinary SEAL data packets.  The TE then sets the (LINK_ID,
1221	   NONCE, SEAL_ID)-tuple and I flag in the SEAL header of the SCMP
1222	   packet to the same values that it would use to send an ordinary SEAL
1223	   data packet.

1225	   For all SCMP messages, the TE then sets the other flag bits in the
1226	   SEAL header to C=1, A=0, R=0, F=1, and M=0.  It next sets the
1227	   NEXTHDR/SEG field to 0 and sends the SCMP packet to the tunnel
1228	   neighbor.

1230	4.6.1.1.  Generating SCMP Packet Too Big (PTB) Messages

1232	   An ETE generates an SCMP PTB message under one of the following
1233	   cases:

1235	   o  Case 1: when it receives the IP first fragment (i.e., one with
1236	      MF=1 and Offset=0 in the outer IP header) of a SEAL protocol
1237	      packet that arrived as multiple IP fragments, or:

1239	   o  Case 2: when it receives a SEAL protocol data packet with A=1 in
1240	      the SEAL header that did not arrive as multiple IP fragments
1241	      (i.e., one that does not also match Case 1), or:

1243	   o  Case 3: when it has already received the SEAL first segment (i.e.,
1244	      one with F=1 and M=1 in the SEAL header) of a SEAL protocol packet
1245	      that arrived as multiple SEAL segments, and accepting the current
1246	      segment would cause the size of the reassembled packet to exceed
1247	      S_MRU.

1249	   The ETE prepares an SCMP PTB message the same as for the
1250	   corresponding ICMPv6 PTB message, except that it writes the S_MRU
1251	   value for this ITE in the MTU field (i.e., even if the S_MRU value is
1252	   0).  For cases 1 and 2 above, the packet-in-error field includes the
1253	   leading portion of the IP packet or fragment that triggered the
1254	   condition.  For case 3 above, the packet-in-error field includes the
1255	   leading portion of the SEAL first segment, beginning with the
1256	   encapsulating outer IP header.

1258	   Finally, the ETE writes the value 1, 2 or 3 in the Code field of the
1259	   PTB message.  The value is the number of the case, from the list of
1260	   cases above, that caused the ETE to generate the message (e.g.,
1261	   Code=3 for Case 3).

1263	   NOTE CAREFULLY that, unlike cases 1 and 3 above, case 2 is not an
1264	   error condition and does not necessarily signify packet loss.
1265	   Instead, it is a control plane acknowledgement of a data plane probe.
1266	   NOTE ALSO that the ETE MUST NOT generate both a Case 1 and a Case 2
1267	   SCMP PTB message on behalf of the same SEAL segment.
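The case selection above can be summarized by the following non-normative Python sketch. The function names and parameters are assumptions of this sketch. In particular, the order in which Case 3 is tested relative to Case 2 is an assumption; the text only requires that Case 1 and Case 2 never both be reported for the same SEAL segment.

```python
import struct
from typing import Optional


def ptb_code(mf: int, offset: int, a_flag: int,
             first_segment_seen: bool,
             would_exceed_s_mru: bool) -> Optional[int]:
    """Return the SCMP PTB Code (1, 2 or 3), or None if no PTB is due.

    mf and offset come from the outer IP header; a_flag is the A bit in
    the SEAL header.
    """
    if mf == 1 and offset == 0:
        return 1        # Case 1: IP first fragment
    if first_segment_seen and would_exceed_s_mru:
        return 3        # Case 3: reassembly would exceed S_MRU
    if a_flag == 1 and mf == 0 and offset == 0:
        return 2        # Case 2: explicit probe, not IP fragmented
    return None


def ptb_message_body(s_mru: int) -> bytes:
    """32-bit MTU field as in the ICMPv6 PTB format, carrying S_MRU."""
    return struct.pack("!I", s_mru)     # written even when S_MRU is 0
```

Testing Case 1 first guarantees that a first fragment carrying A=1 yields only a Code=1 message.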

1269	4.6.1.2.  Generating SCMP Neighbor Discovery Messages

1271	   An ITE generates an SCMP "Neighbor Solicitation" (SNS) or "Router
1272	   Solicitation" (SRS) message when it needs to solicit a response from
1273	   an ETE.  An ETE generates a solicited SCMP "Neighbor Advertisement"
1274	   (SNA) or "Router Advertisement" (SRA) message when it receives an
1275	   SNS/SRS message.  Any TE may also generate unsolicited SNA/SRA
1276	   messages that are not triggered by a specific solicitation event.

1278	   The TE generates SNS, SNA, SRS and SRA messages the same as described
1279	   for the corresponding IPv6 Neighbor Discovery (ND) messages (see:
1280	   [RFC4861]).

1282	4.6.1.3.  Generating SCMP Redirect Messages

1284	   An ETE generates an SCMP "Redirect" message when it receives a SEAL
1285	   data packet with R=1 in the SEAL header and needs to inform the ITE
1286	   of a better next hop.  The ETE generates SCMP Redirect messages the
1287	   same as described for IPv6 ND Redirects in [RFC4861], except that it
1288	   includes Route Information Options (RIOs) [RFC4191] to inform the ITE
1289	   of a better next hop for an entire IP prefix instead of only a single
1290	   destination.  The SCMP Redirect message therefore supports both
1291	   network and host redirection instead of only host redirection.

1293	4.6.1.4.  Generating Other SCMP Messages

1295	   An ETE generates an SCMP "Destination Unreachable - Communication
1296	   with Destination Administratively Prohibited" message when its
1297	   association with the ITE is bidirectional and it receives a SEAL
1298	   packet with a (LINK_ID, NONCE, SEAL_ID)-tuple that does not
1299	   correspond to this ITE (see: Section 4.7).

1301	   An ETE generates an SCMP "Destination Unreachable" message with an
1302	   appropriate code under the same circumstances that an IPv6 system
1303	   would generate an ICMPv6 Destination Unreachable message using the
1304	   same code.  The SCMP Destination Unreachable message is formatted the
1305	   same as for ICMPv6 Destination Unreachable messages.

1307	   An ETE generates an SCMP "Parameter Problem" message when it receives
1308	   a SEAL packet with an incorrect value in the SEAL header, and
1309	   generates an SCMP "Time Exceeded" message when it garbage collects an
1310	   incomplete SEAL data packet reassembly.  The message formats used are
1311	   the same as for the corresponding ICMPv6 messages.

1313	   Generation of all other SCMP message types is outside the scope of
1314	   this document.

1316	4.6.2.  Processing SCMP Messages

1318	   An ITE processes any solicited and error SCMP message it receives as
1319	   long as it can verify that the corresponding SCMP packet was sent
1320	   from an on-path ETE.  The ITE can verify that the SCMP packet came
1321	   from an on-path ETE by checking that the (LINK_ID, NONCE, SEAL_ID)-
1322	   tuple in the SEAL header of the packet corresponds to one of its
1323	   recently-sent SEAL data packets or SCMP solicitation packets.

1325	   For each solicited and error SCMP message it receives, the ITE first
1326	   verifies that the (LINK_ID, NONCE, SEAL_ID)-tuple is acceptable, then
1327	   verifies that the Checksum in the SCMP message header is correct.  If
1328	   the (LINK_ID, NONCE, SEAL_ID)-tuple and/or checksum are incorrect, the
1329	   ITE discards the message; otherwise, it processes the message the
1330	   same as for ordinary ICMPv6 messages.
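The two checks above (tuple first, then Checksum) can be sketched in Python as follows. This is a non-normative illustration: the `RecentTuples` class, its `capacity` bound, and the `accept_scmp` function are assumptions of the sketch, since this document does not say how an ITE remembers its recently-sent packets.

```python
import struct
from collections import deque
from typing import Deque, Set, Tuple

SealTuple = Tuple[int, int, int]    # (LINK_ID, NONCE, SEAL_ID)


def internet_checksum(data: bytes) -> int:
    """ICMPv4-style checksum: no pseudo-header is included."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


class RecentTuples:
    """Bounded memory of tuples from recently-sent packets (hypothetical)."""

    def __init__(self, capacity: int = 1024) -> None:
        self._capacity = capacity
        self._order: Deque[SealTuple] = deque()
        self._seen: Set[SealTuple] = set()

    def record(self, tup: SealTuple) -> None:
        if tup in self._seen:
            return
        self._order.append(tup)
        self._seen.add(tup)
        if len(self._order) > self._capacity:
            self._seen.discard(self._order.popleft())   # forget the oldest

    def matches(self, tup: SealTuple) -> bool:
        return tup in self._seen


def accept_scmp(recent: RecentTuples, tup: SealTuple,
                scmp_message: bytes) -> bool:
    """True if the message should be processed, False if discarded."""
    if not recent.matches(tup):
        return False                    # not from an on-path ETE
    # A correct Checksum makes the sum over the whole message verify to 0.
    return len(scmp_message) >= 4 and internet_checksum(scmp_message) == 0
```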

1332	   Any TE may also receive unsolicited SCMP messages (e.g., SNS, SRS,
1333	   SNA, SRA, etc.) from the tunnel neighbor.  The TE sends SCMP response
1334	   messages in response to solicitations, but does not otherwise process
1335	   the unsolicited SCMP messages as an indication of tunnel neighbor
1336	   liveness.

1338	   Finally, TEs process solicited and error SCMP messages as an
1339	   indication that the tunnel neighbor is responsive, i.e., in the same
1340	   manner implied for IPv6 Neighbor Unreachability Detection "hints of
1341	   forward progress" (see: [RFC4861]).

1343	4.6.2.1.  Processing SCMP PTB Messages

1345	   An ITE may receive an SCMP PTB message after it sends a SEAL data
1346	   packet to an ETE (see: Section 4.6.1).  The packet-in-error within
1347	   the PTB message consists of the encapsulating IP/*/SEAL headers
1348	   followed by the inner packet in the form in which the ITE received it
1349	   prior to SEAL encapsulation.

1351	   If the PTB message has Code=2 in the SCMP header, the ITE processes
1352	   the message as a response to an explicit probe request and discards
1353	   the message.  If the PTB has Code=1 or Code=3 in the SCMP header,
1354	   however, the ITE processes the message as an indication of an MTU
1355	   limitation.

1357	   If the PTB has Code=1, the ITE first verifies that the outer IP
1358	   header in the packet-in-error encodes an IP first fragment, then
1359	   examines the outer IP header length field to determine a new S_MSS
1360	   value as follows:

1362	   o  If the length is no less than 1280, the ITE records the length as
1363	      the new S_MSS value.

1365	   o  If the length is less than the current S_MSS value and also less
1366	      than 1280, the ITE can discern that IP fragmentation is occurring
1367	      but it cannot determine the true MTU of the restricting link due
1368	      to the possibility that a router on the path is generating runt
1369	      first fragments.

1371	   In this latter case, the ITE may need to search for a reduced S_MSS
1372	   value through an iterative searching strategy that parallels the IPv4
1373	   Path MTU Discovery "plateau table" procedure described in Section 5 of
1374	   [RFC1191].  This searching strategy may
1375	   entail multiple iterations in which the ITE sends additional SEAL
1376	   data packets using a reduced S_MSS and receives additional SCMP PTB
1377	   messages, but the process should quickly converge.  During this
1378	   process, it is essential that the ITE reduce S_MSS based on the first
1379	   SCMP PTB message received under the current S_MSS size, and refrain
1380	   from further reducing S_MSS until SCMP PTB messages pertaining to
1381	   packets sent under the new S_MSS are received.
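The S_MSS determination for Code=1 PTB messages can be sketched in Python as follows. This is a non-normative illustration. The function names are assumptions, and the plateau values are the example values from Section 7 of [RFC1191]; implementations may choose other values. The sketch does not model the requirement to wait for PTB messages pertaining to the new S_MSS before reducing it again; a caller would need to enforce that separately.

```python
# Example plateau values from Section 7 of RFC 1191 (non-normative).
PLATEAUS = (65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68)


def next_lower_plateau(current: int) -> int:
    """Largest plateau strictly below current, or the smallest plateau."""
    for plateau in PLATEAUS:
        if plateau < current:
            return plateau
    return PLATEAUS[-1]


def update_s_mss(current_s_mss: int, outer_ip_length: int) -> int:
    """Return the new S_MSS after a Code=1 PTB message.

    outer_ip_length is the length field of the outer IP header found in
    the packet-in-error (an IP first fragment).
    """
    if outer_ip_length >= 1280:
        return outer_ip_length              # record the length directly
    if outer_ip_length < current_s_mss:
        # Possible runt first fragment: the true MTU is unknown, so step
        # down one plateau and let later PTB messages refine the value.
        return next_lower_plateau(current_s_mss)
    return current_s_mss                    # no new information
```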

1383	   For both Code=1 and Code=3 PTB messages, the ITE next records the
1384	   value in the MTU field of the SCMP PTB message as the new S_MRU value
1385	   for this ETE and examines the inner packet within the packet-in-
1386	   error.  If the inner packet was unfragmentable (see: Section 4.4.3)
1387	   and larger than (MAX(S_MRU, S_MSS) - HLEN), the ITE then sends a
1388	   transcribed PTB message appropriate for the inner packet to the
1389	   original source with MTU set to (MAX(S_MRU, S_MSS) - HLEN).  (In the
1390	   case of nested SEAL encapsulations, the transcribed PTB message will
1391	   itself be an SCMP PTB message).  If the inner packet is fragmentable,
1392	   however, the ITE instead reduces its inner fragmentation THRESH
1393	   estimate to a size no larger than S_MSS for this ETE (see: Section
1394	   4.4.3) and does not send a transcribed PTB.  In that case, some
1395	   fragmentable packets may be silently discarded but future
1396	   fragmentable packets will subsequently undergo inner fragmentation
1397	   based on this new THRESH estimate.
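The handling of the inner packet for Code=1 and Code=3 PTB messages can be sketched in Python as follows. This is a non-normative illustration; the function name, its parameters and its return convention are assumptions of the sketch.

```python
from typing import Optional, Tuple


def handle_inner_packet(inner_len: int, fragmentable: bool,
                        s_mru: int, s_mss: int, hlen: int,
                        thresh: int) -> Tuple[Optional[int], int]:
    """Return (transcribed_ptb_mtu, new_thresh).

    transcribed_ptb_mtu is None when no transcribed PTB message is sent
    to the original source.
    """
    if fragmentable:
        # Reduce THRESH to a size no larger than S_MSS; no PTB is sent.
        return None, min(thresh, s_mss)
    limit = max(s_mru, s_mss) - hlen
    if inner_len > limit:
        return limit, thresh        # PTB to the source with MTU = limit
    return None, thresh
```

For example, with S_MRU=0, S_MSS=1400 and HLEN=40, an unfragmentable 1500-byte inner packet results in a transcribed PTB message with MTU 1360.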

1399	   The ITE may alternatively ignore the S_MSS and S_MRU values, thus
1400	   disabling SEAL-layer segmentation.  In that case, the ITE sends all
1401	   SEAL-encapsulated packets as single segments and implements stateless
1402	   MTU discovery.  In that case, if the ITE receives an SCMP PTB message
1403	   from the ETE with Code=1 and with a too-small length value in the
1404	   outer IP header, it can send a translated PTB message back to the
1405	   source listing a slightly smaller MTU size than the length value in
1406	   the inner IP header.  For example, if the ITE receives an SCMP PTB
1407	   message with Code=1, outer IP length 256 and inner IP length 1500, it
1408	   can send a PTB message listing an MTU of 1400 back to the source.  If
1409	   the ITE subsequently receives an SCMP PTB message with Code=1, outer
1410	   IP length 256 and inner IP length 1400, it can send a PTB message
1411	   listing an MTU of 1300 back to the source, etc.

1413	   Actual plateau table values for this "step-down" MTU determination
1414	   procedure are up to the implementation, which may consult Section 7
1415	   of [RFC1191] for non-normative example guidance.
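The "step-down" example above (1500 to 1400 to 1300) can be reproduced with the following non-normative Python sketch. The fixed `step` of 100 bytes matches only that worked example, and the `floor` of 68 bytes is an assumption of the sketch; an implementation may use plateau table values instead.

```python
def step_down_mtu(inner_ip_length: int, step: int = 100,
                  floor: int = 68) -> int:
    """MTU to list in the translated PTB message sent to the source.

    step and floor are hypothetical tuning parameters.
    """
    return max(inner_ip_length - step, floor)
```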

1417	4.6.2.2.  Processing SCMP Neighbor Discovery Messages

1419	   An ETE may receive SNS/SRS messages from an ITE as the initial leg in
1420	   a neighbor discovery exchange.  An ITE may also receive both
1421	   solicited and unsolicited SNA/SRA messages from an ETE.

1423	   The TE processes SNS/SRS and SNA/SRA messages the same as described
1424	   for the corresponding IPv6 Neighbor Discovery (ND) messages (see:
1425	   [RFC4861]).

1427	4.6.2.3.  Processing SCMP Redirect Messages

1429	   An ITE may receive SCMP redirect messages after sending a SEAL data
1430	   packet with R=1 in the SEAL header to an ETE.  The ITE processes any
1431	   RIO options in the SCMP redirect message and updates its Forwarding
1432	   Information Base (FIB) accordingly.

1434	4.6.2.4.  Processing Other SCMP Messages

1436	   An ITE may receive an SCMP "Destination Unreachable - Communication
1437	   with Destination Administratively Prohibited" message after it sends
1438	   a SEAL data packet.  The ITE processes the message as an indication
1439	   that it needs to (re)synchronize with the ETE (see: Section 4.7).

1441	   An ITE may receive an SCMP "Destination Unreachable" message with an
1442	   appropriate code under the same circumstances that an IPv6 node would
1443	   receive an ICMPv6 Destination Unreachable message.  The ITE processes
1444	   the message the same as for the corresponding ICMPv6 Destination
1445	   Unreachable messages.

1447	   An ITE may receive an SCMP "Parameter Problem" message when the ETE
1448	   receives a SEAL packet with an incorrect value in the SEAL header.
1449	   The ITE should examine the incorrect SEAL header field setting to
1450	   determine whether a different setting should be used in subsequent
1451	   packets.

1453	   An ITE may receive an SCMP "Time Exceeded" message when the ETE
1454	   garbage collects an incomplete SEAL data packet reassembly.  The ITE
1455	   should consider the message as an indication of congestion.

1457	   Processing of all other SCMP message types is outside the scope of
1458	   this document.

1460	4.7.  Tunnel Endpoint Synchronization

1462	   By default, the SEAL ITE retains per-ETE soft state, but the ETE does
1463	   not retain per-ITE soft state.  In that case, the tunnel neighbor
1464	   relationship between the ITE and ETE is said to be "unidirectional",
1465	   and the ETE unconditionally accepts any packets coming from the ITE.
1466	   When peer TEs need to establish a closer coordination with one
1467	   another, however, they can establish a bidirectional tunnel neighbor
1468	   relationship to establish both ITE and ETE soft state within both
1469	   TEs.

1471	   In order to establish a bidirectional tunnel neighbor relationship,
1472	   the initiating TE (call it "A") initiates a short transaction with
1473	   the responding TE (call it "B") carried by a reliable transport
1474	   protocol such as TCP.  The protocol details of the transaction are
1475	   out of scope for this document, and indeed need not be standardized
1476	   as long as both TEs observe the same specifications.

1478	   In the transaction, "A" and "B" first authenticate themselves to each
1479	   other.  "A" then selects randomly-generated NONCE(A) and SEAL_ID(A)
1480	   values and registers them with "B", while "B" in turn selects
1481	   randomly-generated NONCE(B) and SEAL_ID(B) values and registers them
1482	   with "A".  Both TEs then further select one or more randomly-
1483	   generated LINK_IDs (e.g., LINK_ID(A1), LINK_ID(A2), etc.), where each
1484	   LINK_ID represents a different underlying link over which the ITE
1485	   function of "A" will send tunneled packets to the ETE function of "B"
1486	   (and vice-versa).  Both TEs then use each such (LINK_ID(i), NONCE,
1487	   SEAL_ID)-tuple to establish the appropriate bidirectional tunnel
1488	   neighbor soft state (see Sections 4.4.2 and 4.5.2).

1490	   Following this bidirectional tunnel neighbor establishment, the
1491	   reliable transport transaction between the TEs concludes since the
1492	   status of the underlying links is opaque to the transport protocol
1493	   and the transport protocol therefore has no means for selecting
1494	   alternate underlying links should the path through the primary
1495	   underlying link fail.  The soft state is then kept alive by the
1496	   continued flow of SEAL data packets and/or SCMP messages between the
1497	   TEs rather than by higher-layer keepalives of the transport protocol.

1499	   Outbound and inbound traffic engineering between bidirectional tunnel
1500	   neighbors is therefore coordinated by SCMP from within the tunnel
1501	   interface and can remain continuous even if the paths through one or
1502	   more of the underlying links have failed.  When one TE detects that
1503	   most/all underlying link paths to the other TE have failed, however,
1504	   it schedules the bidirectional state for garbage collection.

1506	   This bidirectional tunnel neighbor establishment is most commonly
1507	   initiated by a client TE in establishing a "connection" with a
1508	   serving TE, e.g., when a customer router within a home network
1509	   establishes a connection with a serving router in a provider network.

1511	5.  Link Requirements

1513	   Subnetwork designers are expected to follow the recommendations in
1514	   Section 2 of [RFC3819] when configuring link MTUs.

1516	6.  End System Requirements

1518	   SEAL provides robust mechanisms for returning PTB messages; however,
1519	   end systems that send unfragmentable IP packets larger than 1500
1520	   bytes are strongly encouraged to implement their own end-to-end MTU
1521	   assurance, e.g., using Packetization Layer Path MTU Discovery per
1522	   [RFC4821].

1524	7.  Router Requirements

1526	   IPv4 routers within the subnetwork are strongly encouraged to
1527	   implement IPv4 fragmentation such that the first fragment is the
1528	   largest and approximately the size of the underlying link MTU, i.e.,
1529	   they should avoid generating runt first fragments.

1531	   IPv6 routers within the subnetwork are required to generate the
1532	   necessary PTB messages when they drop outer IPv6 packets due to an
1533	   MTU restriction.

1535	8.  IANA Considerations

1537	   The IANA is instructed to allocate an IP protocol number for
1538	   'SEAL_PROTO' in the 'protocol-numbers' registry.

1540	   The IANA is instructed to allocate a Well-Known Port number for
1541	   'SEAL_PORT' in the 'port-numbers' registry.

1543	   The IANA is instructed to establish a "SEAL Protocol" registry to
1544	   record SEAL Version values.  This registry should be initialized to
1545	   include the initial SEAL Version number, i.e., Version 0.

1547	9.  Security Considerations

1549	   Unlike IPv4 fragmentation, overlapping fragment attacks are not
1550	   possible due to the requirement that SEAL segments be non-
1551	   overlapping.  This condition is naturally enforced due to the fact
1552	   that each consecutive SEAL segment begins at offset 0 with respect to
1553	   the previous SEAL segment.

1555	   An amplification/reflection attack is possible when an attacker sends
1556	   IP first fragments with spoofed source addresses to an ETE, resulting
1557	   in a stream of SCMP messages returned to a victim ITE.  The (LINK_ID,
1558	   NONCE, SEAL_ID)-tuple in the encapsulated segment of the spoofed IP
1559	   first fragment provides mitigation for the ITE to detect and discard
1560	   spurious SCMP messages.

1562	   The SEAL header is sent in-the-clear (outside of any IPsec/ESP
1563	   encapsulations) the same as for the outer IP and other outer headers.
1564	   In this respect, the threat model is no different than for IPv6
1565	   extension headers.  As for IPv6 extension headers, the SEAL header is
1566	   protected only by L2 integrity checks and is not covered under any L3
1567	   integrity checks.

1569	   SCMP messages carry the (LINK_ID, NONCE, SEAL_ID)-tuple of the
1570	   packet-in-error.  Therefore, when an ITE receives an SCMP message it
1571	   can unambiguously associate it with the SEAL data packet that
1572	   triggered the error.  When the TEs are synchronized, the ETE can also
1573	   detect off-path spoofing attacks.

1575	   Security issues that apply to tunneling in general are discussed in
1576	   [I-D.ietf-v6ops-tunnel-security-concerns].

1578	10.  Related Work

1580	   Section 3.1.7 of [RFC2764] provides a high-level sketch for
1581	   supporting large tunnel MTUs via a tunnel-level segmentation and
1582	   reassembly capability to avoid IP level fragmentation, which is in
1583	   part the same approach used by SEAL.  SEAL could therefore be
1584	   considered as a fully functional manifestation of the method
1585	   postulated by that informational reference.

1587	   Section 3 of [RFC4459] describes inner and outer fragmentation at the
1588	   tunnel endpoints as alternatives for accommodating the tunnel MTU;
1589	   however, the SEAL protocol specifies a mid-layer segmentation and
1590	   reassembly capability that is distinct from both inner and outer
1591	   fragmentation.

1593	   Section 4 of [RFC2460] specifies a method for inserting and
1594	   processing extension headers between the base IPv6 header and
1595	   transport layer protocol data.  The SEAL header is inserted and
1596	   processed in exactly the same manner.

1598	   The concepts of path MTU determination through the report of
1599	   fragmentation and extending the IP Identification field were first
1600	   proposed in deliberations of the TCP-IP mailing list and the Path MTU
1601	   Discovery Working Group (MTUDWG) during the late 1980's and early
1602	   1990's.  SEAL supports a report fragmentation capability using bits
1603	   in an extension header (the original proposal used a spare bit in the
1604	   IP header) and supports ID extension through a 16-bit field in an
1605	   extension header (the original proposal used a new IP option).  A
1606	   historical analysis of the evolution of these concepts, as well as
1607	   the development of the eventual path MTU discovery mechanism for IP,
1608	   appears in Appendix D of this document.

1610	11.  SEAL Advantages over Classical Methods

1612	   The SEAL approach offers a number of distinct advantages over the
1613	   classical path MTU discovery methods [RFC1191] [RFC1981]:

1615	   1.  Classical path MTU discovery always results in packet loss when
1616	       an MTU restriction is encountered.  Using SEAL, IP fragmentation
1617	       provides a short-term interim mechanism for ensuring that packets
1618	       are delivered while SEAL adjusts its packet sizing parameters.

1620	   2.  Classical path MTU discovery may require several iterations of
1621	       dropping packets and returning PTB messages until an acceptable
1622	       path MTU value is determined.  Under normal circumstances, SEAL
1623	       determines the correct packet sizing parameters in one iteration.

1625	   3.  Using SEAL, ordinary packets serve as implicit probes without
1626	       exposing data to unnecessary loss.  SEAL also provides an
1627	       explicit probing mode not available in the classic methods.

1629	   4.  Using SEAL, ETEs encapsulate SCMP error messages in outer and
1630	       mid-layer headers such that packet-filtering network middleboxes
1631	       will not filter them the same as for "raw" ICMP messages that may
1632	       be generated by an attacker.

1634	   5.  The SEAL approach ensures that the tunnel either delivers or
1635	       deterministically drops packets according to their size, which is
1636	       a required characteristic of any IP link.

1638	   6.  Most importantly, all SEAL packets have an Identification field
1639	       that is sufficiently long to be used for duplicate packet
1640	       detection purposes and to associate ICMP error messages with
1641	       actual packets sent without requiring per-packet state; hence,
1642	       SEAL avoids certain denial-of-service attack vectors open to the
1643	       classical methods.

1645	12.  Acknowledgments

1647	   The following individuals are acknowledged for helpful comments and
1648	   suggestions: Jari Arkko, Fred Baker, Iljitsch van Beijnum, Oliver
1649	   Bonaventure, Teco Boot, Bob Braden, Brian Carpenter, Steve Casner,
1650	   Ian Chakeres, Noel Chiappa, Remi Denis-Courmont, Remi Despres, Ralph
1651	   Droms, Aurnaud Ebalard, Gorry Fairhurst, Washam Fan, Dino Farinacci,
1652	   Joel Halpern, Sam Hartman, John Heffner, Thomas Henderson, Bob
1653	   Hinden, Christian Huitema, Eliot Lear, Darrel Lewis, Joe Macker, Matt
1654	   Mathis, Erik Nordmark, Dan Romascanu, Dave Thaler, Joe Touch, Mark
1655	   Townsley, Ole Troan, Margaret Wasserman, Magnus Westerlund, Robin
1656	   Whittle, James Woodyatt, and members of the Boeing Research &
1657	   Technology NST DC&NT group.

1659	   Path MTU determination through the report of fragmentation was first
1660	   proposed by Charles Lynn on the TCP-IP mailing list in 1987.
1661	   Extending the IP identification field was first proposed by Steve
1662	   Deering on the MTUDWG mailing list in 1989.

1664	13.  References

1666	13.1.  Normative References

1668	   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
1669	              September 1981.

1671	   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
1672	              RFC 792, September 1981.

1674	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1675	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1677	   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
1678	              (IPv6) Specification", RFC 2460, December 1998.

1680	   [RFC3971]  Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure
1681	              Neighbor Discovery (SEND)", RFC 3971, March 2005.

1683	   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
1684	              Message Protocol (ICMPv6) for the Internet Protocol
1685	              Version 6 (IPv6) Specification", RFC 4443, March 2006.

1687	   [RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
1688	              "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
1689	              September 2007.

1691	13.2.  Informative References

1693	   [FOLK]     Shannon, C., Moore, D., and k. claffy, "Beyond Folklore:
1694	              Observations on Fragmented Traffic", December 2002.

1696	   [FRAG]     Kent, C. and J. Mogul, "Fragmentation Considered Harmful",
1697	              October 1987.

1699	   [I-D.ietf-intarea-ipv4-id-update]
1700	              Touch, J., "Updated Specification of the IPv4 ID Field",
1701	              draft-ietf-intarea-ipv4-id-update-01 (work in progress),
1702	              October 2010.

1704	   [I-D.ietf-v6ops-tunnel-security-concerns]
1705	              Krishnan, S., Thaler, D., and J. Hoagland, "Security
1706	              Concerns With IP Tunneling",
1707	              draft-ietf-v6ops-tunnel-security-concerns-04 (work in
1708	              progress), October 2010.

1710	   [I-D.russert-rangers]
1711	              Russert, S., Fleischman, E., and F. Templin, "RANGER
1712	              Scenarios", draft-russert-rangers-05 (work in progress),
1713	              July 2010.

1715	   [I-D.templin-intarea-vet]
1716	              Templin, F., "Virtual Enterprise Traversal (VET)",
1717	              draft-templin-intarea-vet-16 (work in progress),
1718	              July 2010.

1720	   [I-D.templin-iron]
1721	              Templin, F., "The Internet Routing Overlay Network
1722	              (IRON)", draft-templin-iron-13 (work in progress),
1723	              October 2010.

1725	   [MTUDWG]   "IETF MTU Discovery Working Group mailing list,
1726	              gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log, November
1727	              1989 - February 1995.".

1729	   [RFC1063]  Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP
1730	              MTU discovery options", RFC 1063, July 1988.

1732	   [RFC1070]  Hagens, R., Hall, N., and M. Rose, "Use of the Internet as
1733	              a subnetwork for experimentation with the OSI network
1734	              layer", RFC 1070, February 1989.

1736	   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
1737	              November 1990.

1739	   [RFC1812]  Baker, F., "Requirements for IP Version 4 Routers",
1740	              RFC 1812, June 1995.

1742	   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
1743	              for IP version 6", RFC 1981, August 1996.

1745	   [RFC2003]  Perkins, C., "IP Encapsulation within IP", RFC 2003,
1746	              October 1996.

1748	   [RFC2473]  Conta, A. and S. Deering, "Generic Packet Tunneling in
1749	              IPv6 Specification", RFC 2473, December 1998.

1751	   [RFC2675]  Borman, D., Deering, S., and R. Hinden, "IPv6 Jumbograms",
1752	              RFC 2675, August 1999.

1754	   [RFC2764]  Gleeson, B., Heinanen, J., Lin, A., Armitage, G., and A.
1755	              Malis, "A Framework for IP Based Virtual Private
1756	              Networks", RFC 2764, February 2000.

1758	   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
1759	              RFC 2923, September 2000.

1761	   [RFC3232]  Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by
1762	              an On-line Database", RFC 3232, January 2002.

1764	   [RFC3366]  Fairhurst, G. and L. Wood, "Advice to link designers on
1765	              link Automatic Repeat reQuest (ARQ)", BCP 62, RFC 3366,
1766	              August 2002.

1768	   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
1769	              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
1770	              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
1771	              RFC 3819, July 2004.

1773	   [RFC4191]  Draves, R. and D. Thaler, "Default Router Preferences and
1774	              More-Specific Routes", RFC 4191, November 2005.

1776	   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
1777	              for IPv6 Hosts and Routers", RFC 4213, October 2005.

1779	   [RFC4380]  Huitema, C., "Teredo: Tunneling IPv6 over UDP through
1780	              Network Address Translations (NATs)", RFC 4380,
1781	              February 2006.

1783	   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
1784	              Network Tunneling", RFC 4459, April 2006.

1786	   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
1787	              Discovery", RFC 4821, March 2007.

1789	   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
1790	              Errors at High Data Rates", RFC 4963, July 2007.

1792	   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
1793	              Mitigations", RFC 4987, August 2007.

1795	   [RFC5445]  Watson, M., "Basic Forward Error Correction (FEC)
1796	              Schemes", RFC 5445, March 2009.

1798	   [RFC5720]  Templin, F., "Routing and Addressing in Networks with
1799	              Global Enterprise Recursion (RANGER)", RFC 5720,
1800	              February 2010.

1802	   [RFC5927]  Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010.

1804	   [SIGCOMM]  Luckie, M. and B. Stasiewicz, "Measuring Path MTU
1805	              Discovery Behavior", November 2010.

1807	   [TBIT]     Medina, A., Allman, M., and S. Floyd, "Measuring
1808	              Interactions Between Transport Protocols and Middleboxes",
1809	              October 2004.

1811	   [TCP-IP]   "Archive/Hypermail of Early TCP-IP Mail List,
1812	              http://www-mice.cs.ucl.ac.uk/multimedia/misc/tcp_ip/, May
1813	              1987 - May 1990.".

1815	   [WAND]     Luckie, M., Cho, K., and B. Owens, "Inferring and
1816	              Debugging Path MTU Discovery Failures", October 2005.

1818	Appendix A.  Reliability

1820	   Although a SEAL tunnel may span an arbitrarily-large subnetwork
1821	   expanse, the IP layer sees the tunnel as a simple link that supports
1822	   the IP service model.  Since SEAL supports segmentation at a layer
1823	   below IP, SEAL therefore presents a case in which the link unit of
1824	   loss (i.e., a SEAL segment) is smaller than the end-to-end
1825	   retransmission unit (e.g., a TCP segment).

1827	   Links with high bit error rates (BERs) (e.g., IEEE 802.11) use
1828	   Automatic Repeat-ReQuest (ARQ) mechanisms [RFC3366] to increase
1829	   packet delivery ratios, while links with much lower BERs typically
1830	   omit such mechanisms.  Since SEAL tunnels may traverse arbitrarily-
1831	   long paths over links of various types that are already either
1832	   performing or omitting ARQ as appropriate, it would therefore often
1833	   be inefficient to also require the tunnel to perform ARQ.

1835	   When the SEAL ITE has knowledge that the tunnel will traverse a
1836	   subnetwork with non-negligible loss due to, e.g., interference, link
1837	   errors, congestion, etc., it can solicit Segment Reports from the ETE
1838	   periodically to discover missing segments for retransmission within a
1839	   single round-trip time.  However, retransmission of missing segments
1840	   may require the ITE to maintain considerable state and may also
1841	   result in considerable delay variance and packet reordering.

1843	   SEAL may also use alternate reliability mechanisms such as Forward
1844	   Error Correction (FEC).  A simple FEC mechanism may merely entail
1845	   gratuitous retransmissions of duplicate data; however, more efficient
1846	   alternatives are also possible.  Basic FEC schemes are discussed in
1848	   [RFC5445].
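
   As an illustration only, one "more efficient alternative" to plain
   duplication is a single XOR parity segment computed over a group of
   equal-length segments, which lets a receiver rebuild any one lost
   segment in the group.  The sketch below is not part of SEAL and is
   not one of the schemes in [RFC5445]; the function names and the
   equal-length assumption are hypothetical choices made for the
   example.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(segments: List[bytes]) -> bytes:
    """Build one parity segment over a non-empty group of
    equal-length data segments (hypothetical helper)."""
    return reduce(xor_bytes, segments)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing segment (marked None) using parity.

    A single parity segment can repair only one loss per group;
    two or more losses are reported as an error.
    """
    missing = [i for i, seg in enumerate(received) if seg is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost segment")
    repaired = list(received)
    if missing:
        present = [seg for seg in received if seg is not None]
        # XOR of the parity with every surviving segment yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


if __name__ == "__main__":
    group = [b"abcd", b"efgh", b"ijkl"]
    parity = make_parity(group)
    print(recover([b"abcd", None, b"ijkl"], parity))
```

   In this sketch the overhead is one extra segment per group rather
   than one duplicate per segment, at the cost of tolerating only a
   single loss per group.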

1850	   The use of ARQ and FEC mechanisms for improved reliability is for
1851	   further study.

1853	Appendix B.  Integrity

1855	   Each link in the path over which a SEAL tunnel is configured is
1856	   responsible for link layer integrity verification for packets that
1857	   traverse the link.  As such, when a multi-segment SEAL packet with N
1858	   segments is reassembled, its segments will have been inspected by N
1859	   independent link layer integrity check streams instead of a single
1860	   stream that a single segment SEAL packet of the same size would have
1861	   received.  Intuitively, a reassembled packet subjected to N
1862	   independent integrity check streams of shorter-length segments would
1863	   seem to have integrity assurance that is no worse than a single-
1864	   segment packet subjected to only a single integrity check stream,
1865	   since integrity check strength diminishes as segment length
1866	   increases.  In any case, the link-layer integrity assurance
1867	   for a multi-segment SEAL packet is no different than for a multi-
1868	   fragment IPv6 packet.

1870	   Fragmentation and reassembly schemes must also consider packet-
1871	   splicing errors, e.g., when two segments from the same packet are
1872	   concatenated incorrectly, when a segment from packet X is reassembled
1873	   with segments from packet Y, etc.  The primary sources of such errors
1874	   include implementation bugs and wrapping IP ID fields.  In terms of
1875	   implementation bugs, the SEAL segmentation and reassembly algorithm
1876	   is much simpler than IP fragmentation, resulting in simplified
1877	   implementations.  In terms of wrapping ID fields, when IPv4 is used
1878	   as the outer IP protocol, the 16-bit IP ID field can wrap with only
1879	   64K packets with the same (src, dst, protocol)-tuple alive in the
1880	   system at a given time [RFC4963], increasing the likelihood of
1881	   reassembly mis-associations.  However, SEAL ensures that any outer
1882	   IPv4 fragmentation and reassembly will be short-lived and tuned out
1883	   as soon as the ITE receives a Reassembly Report, and SEAL
1884	   segmentation and reassembly uses a much longer ID field.  Therefore,
1885	   reassembly mis-associations of both IP fragments and SEAL segments
1886	   should be exceedingly rare.
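
   The effect of ID field length on wrap time can be made concrete with
   a small calculation.  The sketch below is illustrative only: the
   helper names, the 1 Gbps link rate, the 1500-byte packet size, and
   the 32-bit width used to stand in for "a much longer ID field" are
   all assumptions chosen for the example, not values taken from this
   appendix.

```python
def packets_per_second(link_bps: float, packet_bytes: int) -> float:
    """Packet rate of a fully loaded link with fixed-size packets."""
    return link_bps / (packet_bytes * 8)


def id_wrap_seconds(id_bits: int, pps: float) -> float:
    """Time for an ID counter of id_bits bits to wrap at pps packets/s,
    assuming the ID is incremented once per packet for a single
    (src, dst, protocol) tuple."""
    return (2 ** id_bits) / pps


if __name__ == "__main__":
    # Assumed workload: 1 Gbps of 1500-byte packets between one pair.
    rate = packets_per_second(1e9, 1500)
    for bits in (16, 32):
        print(f"{bits}-bit ID wraps after {id_wrap_seconds(bits, rate):.2f} s")
```

   Under these assumptions the 16-bit field wraps in well under one
   second (about 0.79 s), which is shorter than typical reassembly
   timeouts, while a 32-bit field would take many hours to wrap at the
   same rate.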

1888	Appendix C.  Transport Mode

1890	   SEAL can also be used in "transport-mode", e.g., when the inner layer
1891	   comprises upper-layer protocol data rather than an encapsulated IP
1892	   packet.  For instance, TCP peers can negotiate the use of SEAL for
1893	   the carriage of protocol data encapsulated as IPv4/SEAL/TCP.  In this
1894	   sense, the "subnetwork" becomes the entire end-to-end path between
1895	   the TCP peers and may potentially span the entire Internet.

1897	   Section specifies the operation of SEAL in "tunnel mode", i.e., when
1898	   there are both an inner and outer IP layer with a SEAL encapsulation
1899	   layer between.  However, the SEAL protocol can also be used in a
1900	   "transport mode" of operation within a subnetwork region in which the
1901	   inner-layer corresponds to a transport layer protocol (e.g., UDP,
1902	   TCP, etc.) instead of an inner IP layer.

1904	   For example, two TCP endpoints connected to the same subnetwork
1905	   region can negotiate the use of transport-mode SEAL for a connection
1906	   by inserting a 'SEAL_OPTION' TCP option during the connection
1907	   establishment phase.  If both TCPs agree on the use of SEAL, their
1908	   protocol messages will be carried as IPv4/SEAL/TCP and the connection
1909	   will be serviced by the SEAL protocol using TCP (instead of an
1910	   encapsulating tunnel endpoint) as the transport layer protocol.  The
1911	   SEAL protocol for transport mode otherwise observes the same
1912	   specifications as for Section 4.

1914	Appendix D.  Historic Evolution of PMTUD

1916	   The topic of Path MTU discovery (PMTUD) saw a flurry of discussion
1917	   and numerous proposals in the late 1980's through early 1990.  The
1918	   initial problem was posed by Art Berggreen on May 22, 1987 in a
1919	   message to the TCP-IP discussion group [TCP-IP].  The discussion that
1920	   followed provided significant reference material for [FRAG].  An IETF
1921	   Path MTU Discovery Working Group [MTUDWG] was formed in late 1989
1922	   with a charter to produce an RFC.  Several variations on a very few
1923	   basic proposals were entertained, including:

1925	   1.  Routers record the PMTU estimate in ICMP-like path probe
1926	       messages (proposed in [FRAG] and later [RFC1063])

1928	   2.  The destination reports any fragmentation that occurs for packets
1929	       received with the "RF" (Report Fragmentation) bit set (Steve
1930	       Deering's 1989 adaptation of Charles Lynn's Nov. 1987 proposal)

1932	   3.  A hybrid combination of 1) and Charles Lynn's Nov. 1987 proposal
1933	       (straw RFC draft by McCloghrie, Fox and Mogul on Jan 12, 1990)

1935	   4.  Combination of the Lynn proposal with TCP (Fred Bohle, Jan 30,
1936	       1990)

1938	   5.  Fragmentation avoidance by setting "IP_DF" flag on all packets
1939	       and retransmitting if ICMPv4 "fragmentation needed" messages
1940	       occur (Geof Cooper's 1987 proposal; later adapted into [RFC1191]
1941	       by Mogul and Deering).

1943	   Option 1) seemed attractive to the group at the time, since it was
1944	   believed that routers would migrate more quickly than hosts.  Option
1945	   2) was a strong contender, but repeated attempts to secure an "RF"
1946	   bit in the IPv4 header from the IESG failed and the proponents became
1947	   discouraged.  Option 3) was abandoned because it was perceived as too
1948	   complicated, and 4) never received any apparent serious
1949	   consideration.  Proposal 5) was a late entry into the discussion from
1950	   Steve Deering on Feb. 24th, 1990.  The discussion group soon
1951	   thereafter seemingly lost track of all other proposals and adopted
1952	   5), which eventually evolved into [RFC1191] and later [RFC1981].

1954	   In retrospect, the "RF" bit postulated in 2) is not needed if a
1955	   "contract" is first established between the peers, as in proposal 4)
1956	   and a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU on
1957	   Feb. 19, 1990.  These proposals saw little discussion or rebuttal,
1958	   and were dismissed based on the following assertions:

1960	   o  routers upgrade their software faster than hosts

1962	   o  PCs could not reassemble fragmented packets

1964	   o  Proteon and Wellfleet routers did not reproduce the "RF" bit
1965	      properly in fragmented packets

1967	   o  Ethernet-FDDI bridges would need to perform fragmentation (i.e.,
1968	      "translucent" not "transparent" bridging)

1970	   o  the 16-bit IP_ID field could wrap around and disrupt reassembly at
1971	      high packet arrival rates

1973	   The first four assertions, although perhaps valid at the time, have
1974	   been overcome by historical events.  The final assertion is addressed
1975	   by the mechanisms specified in SEAL.

1977	Author's Address

1979	   Fred L. Templin (editor)
1980	   Boeing Research & Technology
1981	   P.O. Box 3707
1982	   Seattle, WA  98124
1983	   USA

1985	   Email: fltemplin@acm.org