Network Working Group                                    F. Templin, Ed.
Internet-Draft                                      Boeing Phantom Works
Intended status: Informational                         February 13, 2008
Expires: August 16, 2008


             Subnetwork Encapsulation and Adaptation Layer
                       draft-templin-seal-02.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.
   Note that other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 16, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   Subnetworks connect routers within a bounded region, and may also
   connect to other networks including the Internet.  These routers
   forward unicast and multicast packets over paths that span multiple
   IP- and/or sub-IP layer forwarding hops which may cross links with
   diverse Maximum Transmission Units (MTUs) and introduce packet
   duplication.  This document specifies a Subnetwork Encapsulation and
   Adaptation Layer (SEAL) that supports simplified duplicate packet
   detection and accommodates links with diverse MTUs.

Table of Contents

   1.  Introduction
   2.  Terminology and Requirements
   3.  Applicability Statement
   4.  SEAL Protocol Specification
     4.1.  Model of Operation
     4.2.  Packetization
       4.2.1.  Packet Size Considerations
       4.2.2.  Inner IPv4 Fragmentation
       4.2.3.  SEAL Segmentation and Encapsulation
       4.2.4.  Setting DF and Sending Packets
     4.3.  Reassembly
       4.3.1.  Reassembly Buffer Requirements
       4.3.2.  IPv4-Layer Reassembly
       4.3.3.  SEAL-Layer Reassembly
     4.4.  Generating Fragmentation Reports
     4.5.  Receiving Fragmentation Reports
     4.6.  S-MSS Probing
     4.7.  Processing ICMP PTBs
   5.  Link Requirements
   6.  End System Requirements
   7.  Router Requirements
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgments
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Historic Evolution of PMTUD (written 10/30/2002)
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   Mobile Ad-hoc Networks (MANETs) and other subnetworks connect routers
   on links with asymmetric reachability characteristics, and may also
   connect to other networks including the Internet.  These routers
   forward unicast and multicast packets over paths that span multiple
   IP- and/or sub-IP layer forwarding hops, which may traverse links
   with diverse Maximum Transmission Units (MTUs) and may also introduce
   packet duplication due to temporal or persistent routing loops.  It
   is also expected that these routers will support operation of the
   Internet protocols [RFC0791][RFC2460].

   The use of IPv4 encapsulation has long been considered as an
   alternative for introducing a well-behaved identification field
   useful for duplicate packet detection, such as required for
   Simplified Multicast Forwarding [I-D.ietf-manet-smf].  However, the
   16-bit ID field in the outer IPv4 header supports only 2^16 distinct
   identification values and therefore does not provide sufficient space
   for robust duplicate packet detection over modern link technologies.

   Additionally, the insertion of an outer IPv4 header reduces the
   effective path MTU as seen by the IP layer.  This reduced MTU can be
   accommodated through the use of IPv4 fragmentation, but unmitigated
   in-the-network fragmentation has been shown to be harmful through
   operational experience and studies conducted over the course of many
   years [FRAG][RFC2923][RFC4459][RFC4963].

   This document proposes a Subnetwork Encapsulation and Adaptation
   Layer (SEAL) for the operation of IP over subnetworks that connect
   Ingress and Egress Tunnel Endpoints (ITEs/ETEs).  SEAL supports
   simple and robust duplicate packet detection, and accommodates links
   with diverse MTUs.  SEAL additionally supports multiprotocol
   operation and provides extended quality of service for the protocols
   that use it.  The SEAL protocol is specified in the following
   sections.

2.  Terminology and Requirements

   The terminology of [RFC3819][RFC2501][I-D.ietf-autoconf-manetarch] is
   used in this document.
   The following abbreviations correspond to terms used within this
   document and elsewhere in common Internetworking nomenclature:

      MANET - Mobile Ad-hoc Network

      Subnetwork - a MANET or other network that connects (and is
      bounded by) ITEs and ETEs

      SEAL - Subnetwork Encapsulation and Adaptation Layer

      VET - Virtual EThernet

      ITE - Ingress Tunnel Endpoint

      ETE - Egress Tunnel Endpoint

      MTU - Maximum Transmission Unit

      S-MSS - SEAL Maximum Segment Size

      EMTU_R - Effective MTU to Receive

      PTB - an ICMPv6 "Packet Too Big" or an ICMPv4 "fragmentation
      needed" message

      DF - the IPv4 header Don't Fragment flag

      ENCAPS - the size of the outer encapsulating SEAL/*/IPv4 headers

      FRAGREP - a Fragmentation Report message

      SEAL packet - a segment of an inner packet encapsulated in outer
      SEAL/*/IPv4 headers

      SEAL ID - a 32-bit Identification value that is randomly
      initialized and monotonically incremented for each SEAL packet
      sent to an ETE

      Unfragmentable - an IPv4 packet with DF=1, or an IPv6 packet

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

3.  Applicability Statement

   SEAL inserts an additional mid-layer encapsulation when IP/*/IPv4
   encapsulation is used, and appears as a subnetwork encapsulation as
   seen by inner layers.

   While the SEAL approach was motivated by the specific use case of
   duplicate packet detection in MANETs, the domain of applicability is
   not limited to the MANET problem space and extends to other
   subnetwork uses such as tunneling across enterprise networks, the
   interdomain routing core, etc.

   For further study, SEAL may also be useful for "transport-mode"
   applications, e.g., when the inner packet encapsulates ordinary
   protocol data rather than an IP packet.

4.  SEAL Protocol Specification

4.1.  Model of Operation

   Ingress Tunnel Endpoints (ITEs) insert a SEAL header in the IP/*/
   IPv4-encapsulated packets they inject into a subnetwork, where the
   outermost IPv4 header contains the source and destination addresses
   of the ITE/ETE subnetwork entry/exit points, respectively.  SEAL
   defines a new IP protocol type and a new mid-layer encapsulation for
   both unicast and multicast inner packets.  The ITE inserts a SEAL
   header during encapsulation as shown in Figure 1:

                                     +-------------------------+
                                     |                         |
                                     ~   Outer */IPv4 headers  ~
                                     |                         |
                                     +-------------------------+
                                     +--     SEAL Header     --+
   +-------------------------+       +-------------------------+
   |                         |       |                         |
   ~ Any mid-layer * headers ~       ~ Any mid-layer * headers ~
   |                         |       |                         |
   +-------------------------+       +-------------------------+
   |                         |       |                         |
   ~         Inner IP        ~ --->  ~         Inner IP        ~
   ~          Packet         ~ --->  ~          Packet         ~
   |                         |       |                         |
   +-------------------------+       +-------------------------+
   | Any mid-layer trailers  |       | Any mid-layer trailers  |
   +-------------------------+       +-------------------------+
                                     |    Any outer trailers   |
                                     +-------------------------+

                       Figure 1: SEAL Encapsulation

   where the SEAL header is inserted as follows:

   o  For simple IP/IPv4 encapsulations (e.g.,
      [RFC2003][RFC2004][RFC4213]), the SEAL header is inserted between
      the inner IP and outer IPv4 headers as: IP/SEAL/IPv4.

   o  For tunnel-mode IPsec/ESP encapsulations over IPv4
      [RFC4301][RFC4303], the SEAL header is inserted between the ESP
      and outer IPv4 headers as: IP/*/ESP/SEAL/IPv4.

   o  For IP encapsulations over transports such as UDP (e.g.,
      [RFC4380][I-D.farinacci-lisp]), the SEAL header is embedded in any
      middle- and outer-'*' encapsulations within the transport layer,
      e.g., as IP/*/SEAL/*/UDP/IPv4.

   Encapsulation and tunneling establish a virtual point-to-multipoint
   interface abstraction of the subnetwork.
   From a logical viewpoint, this interface appears as a Virtual
   EThernet (VET) [I-D.templin-autoconf-dhcp] that connects the ITE to
   all ETEs in the subnetwork as single-hop neighbors.  From a physical
   perspective, however, packets sent over the VET interface may be
   forwarded across many IPv4 and/or sub-IPv4 layer subnetwork hops.

   SEAL-encapsulated packets include a 32-bit SEAL-ID formed from the
   concatenation of the 16-bit ID Extension field in the SEAL header as
   the most-significant bits and with the 16-bit ID value in the outer
   IPv4 header as the least-significant bits.  Routers use the SEAL-ID
   for both duplicate packet detection within the subnetwork and also
   for multi-level segmentation and reassembly of large packets.

   SEAL enables a multi-level segmentation and reassembly capability.
   First, the ITE can use inner IPv4 fragmentation for fragmentable
   inner IPv4 packets before encapsulation to avoid lower-level
   segmentation and reassembly.  Secondly, the SEAL layer itself
   provides a simple mid-layer cutting-and-pasting of inner packets
   without incurring IPv4 fragmentation on the outer packet.  Finally,
   ordinary IPv4 fragmentation for the outer IPv4 packet after SEAL
   encapsulation is also permitted under certain limited and carefully
   managed circumstances.

4.2.  Packetization

4.2.1.  Packet Size Considerations

   Due to the ubiquitous deployment of standard Ethernet and similar
   networking gear, the nominal Internet cell size has become 1500
   bytes; this is the de facto size that end systems have come to expect
   will be delivered by the network without loss due to an MTU
   restriction on the path, or a suitable ICMP PTB message returned.
   However, PTB messages are not delivered reliably, and any PTBs coming
   from within the subnetwork could be erroneous or maliciously
   fabricated.
   The ITE therefore requires a means for conveying 1500 byte (or
   smaller) original packets over the VET interface without loss due to
   link MTU restrictions and/or triggering PTB messages from within the
   subnetwork.

   In common deployments, there may be many forwarding hops between the
   source and the ITE.  Within those hops, there may be additional
   encapsulations (IPsec, L2TP, etc.) such that a 1500 byte original
   packet might grow to a larger size by the time it reaches the ITE.
   In order to preserve the end system expectation of delivery for 1500
   byte and smaller packets, the ITE therefore requires a means for
   conveying this larger packet over the VET interface even though there
   may be subnetwork links that configure a smaller MTU.

   The ITE upholds the 1500-byte-and-smaller packet delivery expectation
   by instituting a SEAL Maximum Segment Size (S-MSS) variable, set to
   1KB by default and configurable within the range of [128 - 2KB].  The
   ITE also institutes a [S-MSS - 2KB] segmentation region such that all
   inner packets within this size range are segmented into multiple SEAL
   packets.  For 1500 byte and smaller inner packets/fragments, the 2KB
   upper bound allows for ~500 bytes of additional subnetwork
   encapsulation overhead on the path from the original source to the
   ITE.  Similarly, the default 1KB lower bound allows ~500 bytes of
   additional encapsulation on the path between the ITE and ETE to
   accommodate each SEAL packet while avoiding IPv4 fragmentation along
   most paths within subnetworks that deploy 1500 byte links.

   The ITE additionally admits all inner packets larger than 2KB into
   the VET interface as single-segment SEAL packets under the assumption
   that original sources that send packets larger than 1500 bytes are
   using an end-to-end MTU determination capability such as specified in
   [RFC4821].

4.2.2.  Inner IPv4 Fragmentation

   The IP layer fragments inner IPv4 packets larger than 2KB and with
   the IPv4 Don't Fragment (DF) bit set to 0 into IPv4 fragments no
   larger than 2KB before any mid-layer '*' encapsulations.  (It is also
   recommended that the fragment size be chosen small enough so as to
   avoid any SEAL segmentation and/or outer IPv4 fragmentation if
   possible.)  The IP layer then submits each inner IPv4 fragment to the
   ITE as an independent IP packet for encapsulation.  Note that inner
   fragmentation may not be available for certain ITE types, e.g., for
   tunnel-mode IPsec.

   Any inner IPv4 fragments created in this fashion will be reassembled
   by the final destination.

4.2.3.  SEAL Segmentation and Encapsulation

   After inner IPv4 fragmentation, the ITE encapsulates the IPv4 packet/
   fragment in any mid-layer '*' headers, then performs segmentation on
   this inner packet based on a segment size that is likely to avoid
   IPv4 fragmentation within the subnetwork.  The ITE maintains a SEAL
   Maximum Segment Size (S-MSS) variable for each ETE as per-ETE IPv4
   destination cache soft state, including IPv4 multicast destinations.
   S-MSS SHOULD be initialized to 1KB by default, and MAY be changed to
   different values in the range [128, 2KB] based on static
   configuration and/or dynamic segment size probing.

   The ITE MUST NOT break unfragmentable inner packets larger than 2KB
   into smaller segments, but rather MUST encapsulate them as a single
   segment SEAL packet.  The ITE breaks inner packets no larger than 2KB
   into N segments (N <= 16) that are no larger than S-MSS bytes each,
   i.e., even if the inner packet is unfragmentable.  Each segment
   except the final one MUST be of equal length, while the final segment
   MAY be of different length.  The first byte of each segment MUST
   begin immediately after the final byte of the previous segment, i.e.,
   the segments MUST NOT overlap.
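
   The segmentation rule above can be illustrated with a minimal sketch.
   It is not part of the specification: the function name
   seal_segments, the reading of "2KB" as 2048 bytes, and the choice to
   make every non-final segment exactly S-MSS bytes long (the rule only
   requires "no larger than S-MSS" and equal lengths) are all
   assumptions made for illustration.

```python
TWO_KB = 2048  # assumption: the draft's "2KB" is read as 2048 bytes


def seal_segments(inner: bytes, s_mss: int = 1024) -> list[bytes]:
    """Split an inner packet into SEAL segments (hypothetical helper)."""
    if not 128 <= s_mss <= TWO_KB:
        raise ValueError("S-MSS must lie in the range [128, 2KB]")
    if len(inner) > TWO_KB:
        # Packets larger than 2KB are carried as one single-segment
        # SEAL packet and are never broken up at the SEAL layer.
        return [inner]
    # Contiguous, non-overlapping slices: every segment except the
    # final one has the same length (here exactly s_mss bytes).
    segments = [inner[i:i + s_mss] for i in range(0, len(inner), s_mss)]
    segments = segments or [inner]  # an empty packet is still one segment
    assert len(segments) <= 16      # follows from s_mss >= 128
    return segments


# A 1500-byte inner packet with the default S-MSS yields two segments.
print([len(s) for s in seal_segments(bytes(1500))])
```

   With S-MSS at its minimum of 128 bytes, a 2KB packet produces exactly
   16 segments, which is consistent with the 4-bit Segment field used to
   number them.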

   The ITE encapsulates each segment in a SEAL header formatted in
   either minimal or extended format according to the following
   figures:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          ID Extension         |R|M|CTL|Segment| Next Header A |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 2: Minimal SEAL Header Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          ID Extension         |R|M|CTL|Segment|       0       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | RSVD  |              Flow Label               | Next Header B |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 3: Extended SEAL Header Format

   where the header fields are defined as follows:

   ID Extension (16)
      a 16-bit extension of the 16-bit ID field in the outer IPv4
      header; encodes the most-significant 16 bits of a 32-bit SEAL-ID
      value.

   R (1)
      Reserved, and must be zero.

   M (1)
      the "More Segments" bit.  Set to 1 if this SEAL packet contains a
      non-final segment of a multi-segment inner packet.

   CTL (2)
      a 2-bit "Control" field that identifies the type of SEAL packet as
      follows:

      '00' - an ordinary SEAL packet.

      '01' - a Fragmentation Report (FRAGREP).

      '10' - an implicit probe.

      '11' - an explicit probe.

   Segment (4)
      a 4-bit Segment number.  Encodes a segment number between 0 - 15.

   Next Header A (8)
      an 8-bit field that encodes either an IP protocol number the same
      as for the IPv4 protocol and IPv6 next header fields, or the value
      zero.  When Next Header A is non-zero, the SEAL header is in
      minimal format; otherwise, the SEAL header is in extended format.

   RSVD (4)
      a 4-bit Reserved field, present only in extended format.  Must be
      zero.

   Flow Label (20)
      a 20-bit flow label field, present only in extended format.
      Contains a 20-bit value corresponding to the inner packet during
      SEAL encapsulation.

   Next Header B (8)
      an 8-bit field, present only in extended format, that encodes an
      IP protocol number the same as for the IPv4 protocol and IPv6 next
      header fields.

   For N-segment inner packets (N <= 16), the ITE selects a SEAL header
   format (minimal or extended) and encapsulates each segment in a
   header of the same format with (M=1; Segment=0) for the first
   segment, (M=1; Segment=1) for the second segment, etc., with the
   final segment setting (M=0; Segment=N-1).  Note that single-segment
   inner packets instead set (M=0; Segment=0).

   During encapsulation, the ITE also sets CTL='00' in the SEAL header
   of each segment if this segment is not to be used as an explicit or
   implicit probe.  Otherwise, the ITE sets CTL='10' or '11' according
   to the type of probe desired (see: Section 4.6).

   The ITE next writes either the IP protocol number corresponding to
   the inner packet (minimal format) or the value zero (extended format)
   in 'Next Header A' in the SEAL header of each segment.  When extended
   format is used, the ITE also writes a 20-bit flow label value
   corresponding to the inner packet into the Flow Label field and
   writes the IP protocol number corresponding to the inner packet in
   'Next Header B'.  The ITE then encapsulates the segment in the
   requisite */IPv4 outer headers.

   The ITE maintains a 32-bit SEAL-ID value as per-ETE soft state in the
   IPv4 destination cache.  The value is randomly initialized when the
   soft state is created and monotonically incremented (modulo 2^32) for
   each successive SEAL packet sent to the ETE.  For each SEAL packet,
   the ITE writes the least-significant 16 bits of the SEAL-ID value in
   the ID field in the outer IPv4 header, and writes the most-
   significant 16 bits in the ID Extension field in the SEAL header.
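
   As a rough illustration of the header layouts and the SEAL-ID split
   described above, the following sketch packs a SEAL header in either
   format.  The helper names split_seal_id and pack_seal_header are
   hypothetical, and network (big-endian) byte order is assumed because
   the text does not state it explicitly.

```python
import struct


def split_seal_id(seal_id: int) -> tuple[int, int]:
    """Return (ID Extension, outer IPv4 ID) for a 32-bit SEAL-ID."""
    seal_id &= 0xFFFFFFFF
    return seal_id >> 16, seal_id & 0xFFFF


def pack_seal_header(seal_id, segment, more, next_header,
                     ctl=0b00, flow_label=None) -> bytes:
    """Pack a minimal header, or an extended one if flow_label is given."""
    if not 0 <= segment <= 15:
        raise ValueError("Segment is a 4-bit field (0 - 15)")
    id_ext, _ipv4_id = split_seal_id(seal_id)
    # Third octet: R (always 0) | M | CTL (2 bits) | Segment (4 bits).
    flags = (int(bool(more)) << 6) | ((ctl & 0b11) << 4) | segment
    if flow_label is None:
        if next_header == 0:
            raise ValueError("Next Header A = 0 signals extended format")
        return struct.pack("!HBB", id_ext, flags, next_header)
    # Extended format: Next Header A is zero; the second word carries
    # RSVD (0) | 20-bit Flow Label | Next Header B.
    word2 = ((flow_label & 0xFFFFF) << 8) | (next_header & 0xFF)
    return struct.pack("!HBBI", id_ext, flags, 0, word2)


# Second segment (M=1, Segment=1) of an IPv4-in-IPv4 inner packet.
print(pack_seal_header(0x12345678, 1, True, 4).hex())
```

   The per-ETE counter itself would advance as (seal_id + 1) mod 2^32
   after each SEAL packet, so the low 16 bits written to the outer IPv4
   ID field wrap while the ID Extension carries the upper half.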

   The ITE finally sets other fields in the outer */IPv4 headers
   according to the specific encapsulation format (e.g., [RFC2003],
   [RFC4213], etc.).

4.2.4.  Setting DF and Sending Packets

   For inner packets larger than 2KB, the ITE determines whether the
   size of the packet plus the size of the SEAL/*/IPv4 encapsulation
   headers is larger than the MTU of the underlying interface over which
   the tunnel is configured.  If the packet is too large, the ITE
   discards it and sends an ICMP PTB message back to the original source
   with an MTU value taken from the underlying interface minus the size
   of the encapsulation headers.  Otherwise, the ITE sets the Don't
   Fragment (DF) bit in the outer IPv4 header to DF=1.

   For inner packets that were no larger than 2KB before segmentation,
   the ITE sets DF=0 or DF=1 in the outer IPv4 header of each SEAL
   packet according to the desired behavior as follows:

   o  if the ITE is probing the path to the ETE, it MUST set DF=0 to
      allow the ETE to sense and report fragmentation.

   o  if S-MSS=128, the ITE MUST set DF=0 in case any unavoidable in-
      the-network IPv4 fragmentation is required.

   o  if the ITE has recently probed the path to the ETE, it MAY set
      DF=1 in subsequent SEAL packets until the next probing cycle.

   After setting the DF bits, the ITE SHOULD send all SEAL packets that
   encapsulate segments of the same inner packet into the VET interface
   in canonical order, i.e., Segment 0 first, then Segment 1, etc.

4.3.  Reassembly

4.3.1.  Reassembly Buffer Requirements

   ETEs MUST be capable of using IPv4-layer reassembly to reassemble
   SEAL packets of at least (2KB+ENCAPS) bytes, i.e., ETEs MUST
   configure an IPv4 Effective MTU to Receive (EMTU_R) of at least
   (2KB+ENCAPS).

   ETEs MUST also be capable of using SEAL-layer reassembly to
   reassemble inner packets of at least 2KB, i.e., ETEs MUST configure a
   SEAL EMTU_R of at least 2KB.

4.3.2.  IPv4-Layer Reassembly

   The ETE performs IPv4 reassembly as normal, and maintains a
   conservative high- and low-water mark for the number of outstanding
   reassemblies pending for each ITE as is common for widely deployed
   implementations.  When the size of the reassembly buffer exceeds this
   high-water mark, the ETE actively discards incomplete reassemblies
   (e.g., using an Active Queue Management (AQM) strategy such as drop-
   eldest, Random Early Drop (RED), etc.) until the size falls below the
   low-water mark.

   Note that in the limiting case the ETE may choose to discard all
   reassemblies for packets that set CTL='1X' in the SEAL header and
   only perform reassembly for packets that set CTL='0X' in the SEAL
   header (see: Section 4.4).

4.3.3.  SEAL-Layer Reassembly

   After any IPv4-layer reassembly, the ETE performs SEAL-layer
   reassembly for N-segment inner packets through simple in-order
   concatenation of the encapsulated segments from N consecutive SEAL
   packets.  These packets contain Segment numbers 0 through N-1, and
   with consecutive SEAL-ID values encoded in the 32-bit concatenation
   of the ID Extension field in the SEAL header and the ID field in the
   IPv4 header.  That is, for an N-segment inner packet, inner packet
   reassembly entails the concatenation of the segments from SEAL
   packets with (Segment 0, SEAL-ID i), followed by (Segment 1, SEAL-ID
   ((i + 1) mod 2^32)), etc. up to (Segment N-1, SEAL-ID ((i + N-1) mod
   2^32)).  This requires the ETE to maintain a cache of recently
   received SEAL packets for a hold time that would allow for reasonable
   inter-segment delays.

   Rather than set an absolute hold time, the ETE must actively discard
   any pending reassemblies that appear to have no opportunity for
   completion, e.g., when a considerable number of SEAL packets have
   been received before a packet that completes the pending reassembly
   has arrived.
   This assumes that any packet reordering within the subnetwork will
   be on the order of a small number of positions and that any gross
   reordering will be short-lived in nature.

4.4.  Generating Fragmentation Reports

   When the ETE has received at least the leading 128 bytes (or up to
   the end) of a SEAL packet that was delivered as multiple IPv4
   fragments and with CTL='1X' in the SEAL header, it generates a
   Fragmentation Report (FRAGREP) message to send back over the VET
   interface to the original source.  The ETE also generates a FRAGREP
   for any SEAL packet with CTL='11' in the SEAL header (see: Section
   4.6), i.e., even if the packet was not fragmented.

   The ETE prepares the FRAGREP message by encapsulating the leading 128
   bytes of the fragmented SEAL packet in an outer SEAL/*/IPv4 header.
   The ETE sets the IPv4 length field in the encapsulated packet to the
   length of the largest IPv4 fragment received, i.e., even if the
   largest fragment received was not the first fragment.

   The ETE next sets CTL='01' and Segment=0 in the SEAL header and sets
   the fields of the IPv4 header according to the specific encapsulation
   type.  In particular, the ETE sets the destination address of the
   FRAGREP to the source address that was included in the IPv4 first
   fragment, and sets the source address of the FRAGREP to the
   destination address that was included in the IPv4 first fragment.  If
   the destination address in the first fragment was multicast, the ETE
   instead sets the source address of the FRAGREP to an address assigned
   to the underlying IPv4 interface.
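
   The FRAGREP trigger conditions and addressing rules above can be
   sketched as follows.  This is an illustration only: the function
   names and the string-based address handling are assumptions, and the
   sketch leaves out the 128-byte capture and the rewriting of the
   encapsulated IPv4 length field.

```python
import ipaddress

CTL_IMPLICIT_PROBE = 0b10
CTL_EXPLICIT_PROBE = 0b11


def should_send_fragrep(ctl: int, was_fragmented: bool) -> bool:
    """Decide whether the ETE generates a FRAGREP (hypothetical helper)."""
    if ctl == CTL_EXPLICIT_PROBE:
        return True  # reported even if the packet was not fragmented
    # CTL='10' is reported only when IPv4 fragmentation was observed;
    # CTL='0X' packets never trigger a report.
    return ctl == CTL_IMPLICIT_PROBE and was_fragmented


def fragrep_addresses(frag_src: str, frag_dst: str,
                      local_addr: str) -> tuple[str, str]:
    """Return (source, destination) addresses for the FRAGREP.

    frag_src and frag_dst come from the first IPv4 fragment; local_addr
    is an address assigned to the underlying IPv4 interface.
    """
    if ipaddress.ip_address(frag_dst).is_multicast:
        source = local_addr  # never source a report from a group address
    else:
        source = frag_dst
    return source, frag_src


print(fragrep_addresses("192.0.2.1", "224.0.0.5", "198.51.100.9"))
```

   Swapping the addresses of the first fragment returns the report to
   the ITE that sent the SEAL packet, while the multicast case falls
   back to a unicast address on the receiving interface.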

   The FRAGREP message has the following format:

   +-------------------------+
   |                         |
   ~   Outer */IPv4 headers  ~
   |                         |
   +-------------------------+
   |       SEAL Header       |
   |  (CTL='01'; Segment=0)  |
   +-------------------------+
   |                         |
   ~ Up to 128 bytes of pkt, ~
   ~  with IPv4 len set to   ~
   | len of largest fragment |
   |                         |
   +-------------------------+

             Figure 4: Fragmentation Report (FRAGREP) Message

4.5.  Receiving Fragmentation Reports

   When the ITE receives a potential FRAGREP message, it first verifies
   that the message was formatted correctly by the ETE (per Section 4.4)
   and confirms that the FRAGREP corresponds to one of the SEAL packets
   that it actually sent to the ETE by examining the encapsulated IPv4
   fragment.

   For a valid FRAGREP, if the length field in the encapsulated IPv4
   fragment contains a value larger than (128+ENCAPS), the ITE sets
   S-MSS for this ETE to this length minus ENCAPS; otherwise, it sets
   S-MSS = MAX(S-MSS/2, 128).  This limited halving procedure accounts
   for the possibility that the ETE received the leading 128 bytes of
   the fragmented SEAL packet in IPv4 fragments that were significantly
   smaller than the path MTU.  In that case, convergence to an
   acceptable S-MSS size may require multiple iterations of sending SEAL
   packets and receiving FRAGREP messages in a manner that parallels
   classical path MTU discovery [RFC1191], albeit with all path MTU
   feedback coming from the ETE and not a network middlebox.  However,
   the limited halving procedure ensures that convergence will occur
   quickly even in extreme cases, while the correct MTU will normally be
   determined in a single iteration since routers that use IPv4
   fragmentation are recommended to produce the minimum number of
   fragments [RFC1812].

4.6.  S-MSS Probing

   When S-MSS is larger than 128, the ITE MUST probe the path to the ETE
   periodically to detect and dampen any in-the-network IPv4
   fragmentation.  The ITE performs implicit probing of the path by
   setting CTL='10' in the SEAL header and DF=0 in the IPv4 header of
   all SEAL packets containing segments of the same inner packet used
   for probing.  If any in-the-network fragmentation occurs, the ITE
   will receive verifiable FRAGREP messages from the ETE.

   The ITE can also send explicit probes to periodically probe for
   larger S-MSS values (to a maximum of 2KB) by sending single-segment
   SEAL packets with CTL='11' in the SEAL header and DF=0 in the IPv4
   header.  The ETE will return a FRAGREP message whether or not any in-
   the-network fragmentation occurs, which the ITE will process exactly
   as for any FRAGREP per Section 4.5.  The ITE MAY pad the length of
   SEAL packets used for explicit probing (to a maximum size of
   2KB+ENCAPS) if permitted by the specific */IPv4 encapsulation method.

   The ITE can optionally send intervening SEAL packets between probing
   intervals as passive probes by setting DF=0, or as non-probes by
   setting DF=1.

   When S-MSS=128, the ITE MUST set CTL='00' in the SEAL header of each
   SEAL packet that is not being used as an explicit probe such that the
   ETE will not generate FRAGREPs for unavoidable in-the-network
   fragmentation.

4.7.  Processing ICMP PTBs

   The ITE may receive ICMP PTB messages in response to any packets that
   were admitted into the VET interface with DF=1.  The ITE may
   optionally ignore, log, or honor the messages according to the
   subnetwork trust basis.  For example, ITEs connected to subnetworks
   managed under a single administrative domain may be configured to
   honor ICMP PTBs while ITEs connected to the global interdomain
   routing core may be configured to ignore/log them.
627 When ICMP PTBs are honored, the ITE: 629 o SHOULD send translated ICMP PTB messages back to the original 630 source (if possible) for ICMP PTBs that correspond to SEAL packets 631 that encapsulate a segment larger than 2KB. 633 o SHOULD treat ICMP PTBs that correspond to SEAL packets that 634 encapsulate segments no larger than 2KB as an indication to resume 635 probing. 637 5. Link Requirements 639 Subnetwork designers are strongly encouraged to follow the 640 recommendations in [RFC3819] when configuring link MTUs. 642 6. End System Requirements 644 SEAL is a router-to-router encapsulation protocol and therefore makes 645 no requirements for end systems. However, end systems that send 646 unfragmentable IP packets of 1501 bytes or larger are strongly 647 encouraged to use Packetization Layer Path MTU Discovery per 648 [RFC4821], since the network may not always be able to return useful 649 ICMP PTB messages. 651 7. Router Requirements 653 IPv4 routers observe the requirements in [RFC1812]. 655 8. IANA Considerations 657 A new IP protocol number for the SEAL protocol is requested. 659 A new IPv4 site-scoped ALL_MANET_ROUTERS multicast group is 660 requested. 662 9. Security Considerations 664 Unlike IPv4 fragmentation, overlapping fragment attacks are not 665 possible due to the requirement that SEAL segments be non- 666 overlapping. 668 An amplification/reflection attack is possible when an attacker sends 669 spoofed IPv4 fragments to an ETE with CTL='1X' in the SEAL header, 670 resulting in a stream of FRAGREP messages returned to a victim ITE. 671 The encapsulated segment of the spoofed IPv4 fragment provides 672 mitigation for the ITE to detect and discard spurious FRAGREPs. 674 The SEAL header is sent in-the-clear (outside of any IPsec/ESP 675 encapsulations) the same as for the IPv4 header. As for IPv6 676 extension headers, the SEAL header is protected only by L2 integrity 677 checks, and is not covered under any L3 integrity checks. 679 10. 
Acknowledgments 681 Path MTU determination through the report of fragmentation 682 experienced by the final destination was first proposed by Charles 683 Lynn of BBN on the TCP-IP mailing list in May 1987. An historical 684 analysis of the evolution of path MTU discovery appears in 685 http://www.tools.ietf.org/html/draft-templin-v6v4-ndisc-01 and is 686 reproduced in Appendix A of this document. 688 This work was inspired in part by discussions on the IETF MANET and 689 IRTF RRG mailing lists in the 12/07 - 01/08 timeframe, and the author 690 acknowledges those who participated in the discussions. The work 691 also draws on the earlier investigations of [I-D.templin-inetmtu] 692 which acknowledges many who contributed to the effort. 694 The extended SEAL header format was inspired by recent discussions. 696 11. References 698 11.1. Normative References 700 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 701 September 1981. 703 [RFC1812] Baker, F., "Requirements for IP Version 4 Routers", 704 RFC 1812, June 1995. 706 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 707 Requirement Levels", BCP 14, RFC 2119, March 1997. 709 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 710 (IPv6) Specification", RFC 2460, December 1998. 712 11.2. Informative References 714 [FOLK] C, C., D, D., and k. k, "Beyond Folklore: Observations on 715 Fragmented Traffic", December 2002. 717 [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", 718 October 1987. 720 [I-D.farinacci-lisp] 721 Farinacci, D., "Locator/ID Separation Protocol (LISP)", 722 draft-farinacci-lisp-05 (work in progress), November 2007. 724 [I-D.ietf-autoconf-manetarch] 725 Chakeres, I., Macker, J., and T. Clausen, "Mobile Ad hoc 726 Network Architecture", draft-ietf-autoconf-manetarch-07 727 (work in progress), November 2007. 729 [I-D.ietf-manet-smf] 730 Macker, J. and S. 
Team, "Simplified Multicast Forwarding 731 for MANET", draft-ietf-manet-smf-06 (work in progress), 732 November 2007. 734 [I-D.templin-autoconf-dhcp] 735 Templin, F., Russert, S., and S. Yi, "MANET 736 Autoconfiguration", draft-templin-autoconf-dhcp-11 (work 737 in progress), February 2008. 739 [I-D.templin-inetmtu] 740 Templin, F., "Simple Protocol for Robust IP/*/IP Tunnel 741 Endpoint MTU Determination (sprite-mtu)", 742 draft-templin-inetmtu-06 (work in progress), 743 November 2007. 745 [MTUDWG] "IETF MTU Discovery Working Group mailing list, 746 gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log, November 747 1989 - February 1995.". 749 [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP 750 MTU discovery options", RFC 1063, July 1988. 752 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 753 November 1990. 755 [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery 756 for IP version 6", RFC 1981, August 1996. 758 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, 759 October 1996. 761 [RFC2004] Perkins, C., "Minimal Encapsulation within IP", RFC 2004, 762 October 1996. 764 [RFC2501] Corson, M. and J. Macker, "Mobile Ad hoc Networking 765 (MANET): Routing Protocol Performance Issues and 766 Evaluation Considerations", RFC 2501, January 1999. 768 [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", 769 RFC 2923, September 2000. 771 [RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D., 772 Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. 773 Wood, "Advice for Internet Subnetwork Designers", BCP 89, 774 RFC 3819, July 2004. 776 [RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms 777 for IPv6 Hosts and Routers", RFC 4213, October 2005. 779 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 780 Internet Protocol", RFC 4301, December 2005. 782 [RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)", 783 RFC 4303, December 2005. 
785 [RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through 786 Network Address Translations (NATs)", RFC 4380, 787 February 2006. 789 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 790 Network Tunneling", RFC 4459, April 2006. 792 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 793 Discovery", RFC 4821, March 2007. 795 [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly 796 Errors at High Data Rates", RFC 4963, July 2007. 798 [TCP-IP] "TCP-IP mailing list archives, 799 http://www-mice.cs.ucl.ac.uk/multimedia/mist/tcpip, May 800 1987 - May 1990.". 802 Appendix A. Historic Evolution of PMTUD (written 10/30/2002) 804 The topic of Path MTU discovery (PMTUD) saw a flurry of discussion 805 and numerous proposals in the late 1980's through early 1990. The 806 initial problem was posed by Art Berggreen on May 22, 1987 in a 807 message to the TCP-IP discussion group [TCP-IP]. The discussion that 808 followed provided significant reference material for [FRAG]. An IETF 809 Path MTU Discovery Working Group [MTUDWG] was formed in late 1989 810 with a charter to produce an RFC. Several variations on a very few 811 basic proposals were entertained, including: 813 1. Routers record the PMTUD estimate in ICMP-like path probe 814 messages (proposed in [FRAG] and later [RFC1063]) 816 2. The destination reports any fragmentation that occurs for packets 817 received with the "RF" (Report Fragmentation) bit set (Steve 818 Deering's 1989 adaptation of Charles Lynn's Nov. 1987 proposal) 820 3. A hybrid combination of 1) and Charles Lynn's Nov. 1987 proposal 821 (straw RFC draft by McCloghrie, Fox and Mogul on Jan 12, 1990) 823 4. Combination of the Lynn proposal with TCP (Fred Bohle, Jan 30, 824 1990) 826 5. Fragmentation avoidance by setting "IP_DF" flag on all packets 827 and retransmitting if ICMPv4 "fragmentation needed" messages 828 occur (Geof Cooper's 1987 proposal; later adapted into [RFC1191] 829 by Mogul and Deering).
831 Option 1) seemed attractive to the group at the time, since it was 832 believed that routers would migrate more quickly than hosts. Option 833 2) was a strong contender, but repeated attempts to secure an "RF" 834 bit in the IPv4 header from the IESG failed and the proponents became 835 discouraged. 3) was abandoned because it was perceived as too 836 complicated, and 4) never received any apparent serious 837 consideration. Proposal 5) was a late entry into the discussion from 838 Steve Deering on Feb. 24th, 1990. The discussion group soon 839 thereafter seemingly lost track of all other proposals and adopted 840 5), which eventually evolved into [RFC1191] and later [RFC1981]. 842 In retrospect, the "RF" bit postulated in 2) is not needed if a 843 "contract" is first established between the peers, as in proposal 4) 844 and a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU on 845 Feb 19, 1990. These proposals saw little discussion or rebuttal, and 846 were dismissed based on the following assertions: 848 o routers upgrade their software faster than hosts 850 o PCs could not reassemble fragmented packets 852 o Proteon and Wellfleet routers did not reproduce the "RF" bit 853 properly in fragmented packets 855 o Ethernet-FDDI bridges would need to perform fragmentation (i.e., 856 "translucent" not "transparent" bridging) 858 o the 16-bit IP_ID field could wrap around and disrupt reassembly at 859 high packet arrival rates 861 The first four assertions, although perhaps valid at the time, have 862 been overcome by historical events leaving only the final to 863 consider. But, [FOLK] has shown that IP_ID wraparound simply does 864 not occur within several orders of magnitude of the reassembly timeout 865 window on high-bandwidth networks. 867 (Author's 2/11/08 note: this final point was based on a loose 868 interpretation of [FOLK], and is more accurately addressed in 869 [RFC4963].) 871 Author's Address 873 Fred L.
Templin (editor) 874 Boeing Phantom Works 875 P.O. Box 3707 876 Seattle, WA 98124 877 USA 879 Email: fltemplin@acm.org 881 Full Copyright Statement 883 Copyright (C) The IETF Trust (2008). 885 This document is subject to the rights, licenses and restrictions 886 contained in BCP 78, and except as set forth therein, the authors 887 retain all their rights. 889 This document and the information contained herein are provided on an 890 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 891 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 892 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 893 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 894 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 895 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 897 Intellectual Property 899 The IETF takes no position regarding the validity or scope of any 900 Intellectual Property Rights or other rights that might be claimed to 901 pertain to the implementation or use of the technology described in 902 this document or the extent to which any license under such rights 903 might or might not be available; nor does it represent that it has 904 made any independent effort to identify any such rights. Information 905 on the procedures with respect to rights in RFC documents can be 906 found in BCP 78 and BCP 79. 908 Copies of IPR disclosures made to the IETF Secretariat and any 909 assurances of licenses to be made available, or the result of an 910 attempt made to obtain a general license or permission for the use of 911 such proprietary rights by implementers or users of this 912 specification can be obtained from the IETF on-line IPR repository at 913 http://www.ietf.org/ipr. 
915 The IETF invites any interested party to bring to its attention any 916 copyrights, patents or patent applications, or other proprietary 917 rights that may cover technology that may be required to implement 918 this standard. Please address the information to the IETF at 919 ietf-ipr@ietf.org. 921 Acknowledgment 923 Funding for the RFC Editor function is provided by the IETF 924 Administrative Support Activity (IASA).