Network Working Group                                    F. Templin, Ed.
Internet-Draft                                      Boeing Phantom Works
Intended status: Informational                         February 11, 2008
Expires: August 14, 2008


             Subnetwork Encapsulation and Adaptation Layer
                       draft-templin-seal-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 14, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   Subnetworks connect routers within a bounded region, and may also
   connect to other networks including the Internet.  These routers
   forward unicast and multicast packets over paths that span multiple
   IP- and/or sub-IP layer forwarding hops which may configure diverse
   Maximum Transmission Units (MTUs) and introduce packet duplication.
   This document specifies a Subnetwork Encapsulation and Adaptation
   Layer (SEAL) that supports simplified duplicate packet detection and
   accommodates links with diverse MTUs.

Table of Contents

   1.  Introduction
   2.  Terminology and Requirements
   3.  Applicability Statement
   4.  SEAL Protocol Specification
     4.1.  Model of Operation
     4.2.  Packetization
       4.2.1.  Packet Size Considerations
       4.2.2.  Inner IPv4 Fragmentation
       4.2.3.  SEAL Segmentation and Encapsulation
       4.2.4.  Sending Packets
     4.3.  Reassembly
       4.3.1.  Reassembly Buffer Requirements
       4.3.2.  IPv4 Reassembly
       4.3.3.  Inner Packet Reassembly
     4.4.  Generating Fragmentation Reports
     4.5.  Receiving Fragmentation Reports
     4.6.  Probing for Larger S-MSS Values
     4.7.  Processing ICMP PTBs
   5.  Link Requirements
   6.  End System Requirements
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgments
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Historic Evolution of PMTUD (written 10/30/2003)
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   Mobile Ad-hoc Networks (MANETs) and other subnetworks connect routers
   on links with asymmetric reachability characteristics, and may also
   connect to other networks including the Internet.  These routers
   forward unicast and multicast packets over paths that span multiple
   IP- and/or sub-IP layer forwarding hops, which may traverse links
   with diverse Maximum Transmission Units (MTUs) and may also introduce
   packet duplication due to temporal or persistent routing loops.  It
   is also expected that these routers will support operation of the
   Internet protocols [RFC0791][RFC2460].

   The use of IPv4 encapsulation has long been considered as an
   alternative for introducing a well-behaved identification field
   useful for duplicate packet detection, such as that required for
   Simplified Multicast Forwarding [I-D.ietf-manet-smf].  However, the
   16-bit ID field in the outer IPv4 header supports only 2^16 distinct
   identification values and therefore does not provide sufficient space
   for robust duplicate packet detection over modern link technologies.

   Additionally, the insertion of an outer IPv4 header reduces the
   effective path MTU as seen by the IP layer.  This reduced MTU can be
   accommodated through the use of IPv4 fragmentation, but unmitigated
   in-the-network fragmentation has been shown to be harmful through
   operational experience and studies conducted over the course of many
   years [FRAG][RFC2923][RFC4459][RFC4963].

   This document proposes a Subnetwork Encapsulation and Adaptation
   Layer (SEAL) for the operation of IP over subnetworks (such as
   MANETs) that connect Ingress and Egress Tunnel Endpoints (ITEs/
   ETEs).  SEAL supports simple and robust duplicate packet detection,
   and accommodates links with diverse MTUs.  SEAL additionally supports
   multiprotocol operation and provides extended quality of service for
   the protocols that use it.  The SEAL protocol is specified in the
   following sections.

2.  Terminology and Requirements

   The terminology of [RFC3819][RFC2501][I-D.ietf-autoconf-manetarch] is
   used in this document.
   The following abbreviations correspond to terms used within this
   document and elsewhere in common Internetworking nomenclature:

   MANET - Mobile Ad-hoc Network

   Subnetwork - a MANET or other network that connects (and is
      bounded by) ITEs and ETEs

   SEAL - Subnetwork Encapsulation and Adaptation Layer

   VET - Virtual EThernet

   ITE - Ingress Tunnel Endpoint

   ETE - Egress Tunnel Endpoint

   MTU - Maximum Transmission Unit

   S-MSS - SEAL Maximum Segment Size

   EMTU_R - Effective MTU to Receive

   PTB - an ICMPv6 "Packet Too Big" or an ICMPv4 "fragmentation
      needed" message

   DF - the IPv4 header Don't Fragment flag

   ENCAPS - the size of the outer encapsulating SEAL/*/IPv4 headers

   FRAGREP - a Fragmentation Report message

   SEAL packet - a segment of an inner packet encapsulated in outer
      SEAL/*/IPv4 headers

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

3.  Applicability Statement

   SEAL inserts an additional mid-layer encapsulation when IP/*/IPv4
   encapsulation is used, and appears as a subnetwork encapsulation as
   seen by inner layers.

   While the SEAL approach was motivated by the specific use case of
   duplicate packet detection in MANETs, the domain of applicability is
   not limited to the MANET problem space and extends to other
   subnetwork uses such as tunneling across enterprise networks, the
   interdomain routing core, etc.

   For further study, SEAL may also be useful for "transport-mode"
   applications, e.g., when the inner packet encapsulates ordinary
   protocol data rather than an IP packet.

4.  SEAL Protocol Specification

4.1.  Model of Operation

   Ingress Tunnel Endpoints (ITEs) insert a SEAL header in the IP/*/
   IPv4-encapsulated packets they inject into a subnetwork, where the
   outermost IPv4 header contains the source and destination addresses
   of the ITE/ETE subnetwork entry/exit points, respectively.  SEAL
   defines a new IP protocol type and a new mid-layer encapsulation for
   both unicast and multicast inner packets.  The ITE inserts a SEAL
   header during encapsulation as shown in Figure 1:

                                         +-------------------------+
                                         |                         |
                                         ~   Outer */IPv4 headers  ~
                                         |                         |
                                         +-------------------------+
                                         +--     SEAL Header     --+
   +-------------------------+           +-------------------------+
   |                         |           |                         |
   ~ Any mid-layer * headers ~           ~ Any mid-layer * headers ~
   |                         |           |                         |
   +-------------------------+           +-------------------------+
   |                         |           |                         |
   ~        Inner IP         ~   --->    ~        Inner IP         ~
   ~         Packet          ~   --->    ~         Packet          ~
   |                         |           |                         |
   +-------------------------+           +-------------------------+
   | Any mid-layer trailers  |           | Any mid-layer trailers  |
   +-------------------------+           +-------------------------+
                                         |   Any outer trailers    |
                                         +-------------------------+

                       Figure 1: SEAL Encapsulation

   where the SEAL header is inserted as follows:

   o  For simple IP/IPv4 encapsulations (e.g.,
      [RFC2003][RFC2004][RFC4213]), the SEAL header is inserted between
      the inner IP and outer IPv4 headers as: IP/SEAL/IPv4.

   o  For tunnel-mode IPsec/ESP encapsulations over IPv4
      [RFC4301][RFC4303], the SEAL header is inserted between the ESP
      and outer IPv4 headers as: IP/*/ESP/SEAL/IPv4.

   o  For IP encapsulations over transports such as UDP (e.g.,
      [RFC4380][I-D.farinacci-lisp]), the SEAL header is embedded in any
      middle- and outer-'*' encapsulations within the transport layer,
      e.g., as IP/*/SEAL/*/UDP/IPv4.

   Encapsulation and tunneling establish a virtual point-to-multipoint
   interface abstraction of the subnetwork.
   From a logical viewpoint, this interface appears as a Virtual
   EThernet (VET) [I-D.templin-autoconf-dhcp] that connects the ITE to
   all ETEs in the subnetwork as single-hop neighbors.  From a physical
   perspective, however, packets sent over the VET interface may be
   forwarded across many IPv4 and/or sub-IPv4 layer subnetwork hops.

   SEAL-encapsulated packets include a 16-bit ID in the outer IPv4
   header and a separate 30-bit ID in the SEAL header.  Together, the
   two ID values are used both for duplicate packet detection within
   the subnetwork and for multi-level segmentation and reassembly of
   large packets.

   SEAL enables a multi-level segmentation and reassembly capability.
   First, the ITE can use inner IPv4 fragmentation for fragmentable
   inner IPv4 packets before encapsulation to avoid lower-level
   segmentation and reassembly.  Secondly, the SEAL layer itself
   provides a simple mid-layer cutting-and-pasting of inner packets
   without incurring IPv4 fragmentation on the outer packet.  Finally,
   ordinary IPv4 fragmentation of the outer IPv4 packet after SEAL
   encapsulation is also permitted under certain limited and carefully
   managed circumstances, and is useful for probing the path MTU.

4.2.  Packetization

4.2.1.  Packet Size Considerations

   Due to the ubiquitous deployment of standard Ethernet and similar
   networking gear, the nominal Internet cell size has become 1500
   bytes; this is the de facto size that end systems have come to expect
   will either be delivered by the network without loss due to an MTU
   restriction on the path, or result in a suitable ICMP PTB message
   being returned.  However, PTB messages are not delivered reliably,
   and any PTBs coming from within the subnetwork could be erroneous or
   maliciously fabricated.
   The ITE therefore requires a means for conveying 1500 byte (or
   smaller) original packets over the VET interface without loss due to
   link MTU restrictions and/or triggering PTB messages from within the
   subnetwork.

   In common deployments, there may be many forwarding hops between the
   source and the ITE.  Within those hops, there may be additional
   encapsulations (IPsec, L2TP, etc.) such that a 1500 byte original
   packet might grow to a larger size by the time it reaches the ITE.
   In order to preserve the end system expectation of delivery for 1500
   byte and smaller packets, the ITE therefore requires a means for
   conveying this larger packet over the VET interface even though there
   may be subnetwork links that configure a smaller MTU.

   The ITE upholds the 1500-byte-and-smaller packet delivery expectation
   by instituting a SEAL Maximum Segment Size (S-MSS) variable (set to
   1KB by default) and a (S-MSS, 2KB] segmentation region such that all
   inner packets within this size range are segmented into multiple SEAL
   packets.  For 1500 byte and smaller inner packets/fragments, the 2KB
   upper bound allows for ~500 bytes of additional subnetwork
   encapsulation overhead on the path from the original source to the
   ITE.  Similarly, the default 1KB lower bound allows ~500 bytes of
   additional encapsulation on the path between the ITE and ETE to
   accommodate each SEAL packet while avoiding IPv4 fragmentation along
   most paths within subnetworks that deploy 1500 byte links.

   The ITE additionally admits all inner packets larger than 2KB into
   the VET interface as single-segment SEAL packets under the assumption
   that original sources that send packets larger than 1500 bytes are
   using an end-to-end MTU determination capability such as specified in
   [RFC4821].

4.2.2.  Inner IPv4 Fragmentation

   The IP layer fragments inner IPv4 packets larger than 2KB and with
   the IPv4 Don't Fragment (DF) bit set to 0 into IPv4 fragments no
   larger than 2KB before any mid-layer '*' encapsulations.  The IP
   layer then submits each inner IPv4 fragment to the ITE as an
   independent IP packet for encapsulation.  Note that inner
   fragmentation may not be available for certain ITE types, e.g., for
   tunnel-mode IPsec.

   Any inner IPv4 fragments created in this fashion will be reassembled
   by the final destination.

4.2.3.  SEAL Segmentation and Encapsulation

   After inner IPv4 fragmentation, the ITE adds any mid-layer '*'
   encapsulations to the packet/fragment, then uses SEAL segmentation
   based on a segment size that is likely to avoid IPv4 fragmentation
   within the subnetwork.  The ITE maintains a SEAL Maximum Segment Size
   (S-MSS) variable for each ETE as per-ETE IPv4 destination cache soft
   state, including IPv4 multicast destinations.  S-MSS SHOULD be
   initialized to 1KB by default, and MAY change to different values
   based on static configuration and/or dynamic segment size probing.

   The ITE MUST NOT break inner packets larger than 2KB into smaller
   segments, but rather MUST encapsulate them as a single segment SEAL
   packet.  The ITE breaks inner packets no larger than 2KB into N
   segments (N <= 16) that are no larger than S-MSS bytes each.  Each
   segment except the final one MUST be of equal length, while the final
   segment MAY be of different length.  The first byte of each segment
   MUST begin immediately after the final byte of the segment that
   preceded it, i.e., the segments MUST NOT overlap.

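   As an illustration only, the segmentation rules above might be
   sketched in Python as follows.  The function name seal_segment, the
   shape of its return value, and the decision to raise an error when
   more than 16 segments would be needed are assumptions of this sketch
   and are not specified by this document.

```python
from typing import List, Tuple

S_MSS_DEFAULT = 1024  # default S-MSS of 1KB, as stated in the text
SEG_LIMIT = 2048      # 2KB upper bound of the segmentation region
MAX_SEGMENTS = 16     # N <= 16


def seal_segment(inner: bytes,
                 s_mss: int = S_MSS_DEFAULT) -> List[Tuple[int, int, bytes]]:
    """Split an inner packet into (segment_number, M_bit, payload) tuples.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    if s_mss <= 0:
        raise ValueError("S-MSS must be positive")
    # Packets larger than 2KB MUST NOT be segmented; packets that already
    # fit within S-MSS need only one segment (M=0; Segment=0).
    if len(inner) > SEG_LIMIT or len(inner) <= s_mss:
        return [(0, 0, inner)]
    n = -(-len(inner) // s_mss)  # ceiling division
    if n > MAX_SEGMENTS:
        # Assumption: the text does not say what to do here, so refuse.
        raise ValueError("S-MSS too small: more than 16 segments needed")
    # Equal-length, non-overlapping slices; only the final one may differ.
    pieces = [inner[i:i + s_mss] for i in range(0, len(inner), s_mss)]
    return [(i, 1 if i < n - 1 else 0, p) for i, p in enumerate(pieces)]
```

   For example, a 1500 byte inner packet with the default S-MSS yields
   two segments of 1024 and 476 bytes, while a 3000 byte inner packet is
   returned unchanged as a single segment.
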
   For each segment, the ITE inserts a SEAL header formatted according
   to the following figure:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Identification                       |M|R|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Segment|               Flow Label              |  Next Header  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 2: SEAL Header Format

   where the header fields are defined as follows:

   Identification (30)
      a 30-bit ID value that identifies the segments of the same packet.

   M (1)
      the "More Segments" bit.  If set, this is a non-final segment of a
      segmented packet.

   R (1)
      the "Report Fragmentation" bit.  If set, the ETE must report any
      fragmentation experienced by this SEAL packet.

   Segment (4)
      a 4-bit Segment number.

   Flow Label (20)
      a 20-bit flow label field.  Contains a 20-bit value corresponding
      to the inner packet during SEAL encapsulation.

   Next Header (8)
      an 8-bit field that encodes an IP protocol number the same as for
      the IPv4 protocol and IPv6 next header fields.

   For N-segment inner packets (N <= 16), the ITE encapsulates each
   segment in a SEAL header with (M=1; Segment=0) for the first segment,
   (M=1; Segment=1) for the second segment, etc., with the final segment
   setting (M=0; Segment=N-1).  Note that single-segment inner packets
   instead set (M=0; Segment=0).

   During encapsulation, the ITE also sets R=0 in the SEAL header of
   each segment if *no* segments are longer than 128 bytes.  If *any*
   segments are longer than 128 bytes, the ITE instead sets R=1 in the
   SEAL header of each segment.

   The ITE next writes the IP protocol number corresponding to the inner
   packet in 'Next Header' in the SEAL header of each segment and writes
   a 20-bit flow label value corresponding to the inner packet into the
   Flow Label field.
   The ITE then encapsulates the segment in the requisite */IPv4 outer
   headers.

   The ITE maintains a 30-bit monotonically-increasing SEAL ID value
   initialized to 0 for the first inner packet and incremented by 1
   (modulo 2^30) for each successive inner packet; the ITE also
   maintains a 16-bit randomly-initialized IPv4 ID value that is
   randomly modulated for each successive SEAL packet.  The ITE writes
   the same SEAL ID value in each SEAL packet belonging to the same
   inner packet, and writes a different modulated IPv4 ID value in the
   ID field in the outer IPv4 header of each SEAL packet.  The ITE
   finally sets other fields in the outer */IPv4 headers according to
   the specific encapsulation format (e.g., [RFC2003], [RFC4213], etc.).

4.2.4.  Sending Packets

   For inner packets larger than 2KB, the ITE determines whether the
   size of the packet plus the size of the SEAL/*/IPv4 encapsulation
   headers is larger than the MTU of the underlying interface over which
   the tunnel is configured.  If the packet is too large, the ITE
   discards it and sends an ICMP PTB message back to the original source
   with an MTU value taken from the underlying interface minus the size
   of the encapsulation headers.  Otherwise, the ITE sets DF=1 in the
   outer IPv4 header and sends the packet into the VET interface.

   For inner packets which were no larger than 2KB before segmentation,
   the ITE sets the Don't Fragment (DF) bit in the outer IPv4 header of
   each segment to 0 and sends the segment into the VET interface.

   The ITE should send all SEAL packets that encapsulate segments of the
   same inner packet in canonical order, i.e., Segment 0 first, then
   Segment 1, etc.

4.3.  Reassembly

4.3.1.  Reassembly Buffer Requirements

   ETEs MUST be capable of using IPv4 reassembly to reassemble SEAL
   packets of at least (2KB+ENCAPS) bytes, i.e., ETEs MUST configure an
   Effective MTU to Receive (EMTU_R) of at least (2KB+ENCAPS).  ETEs
   MUST also support a minimum 2KB reassembly size for reassembling the
   decapsulated segments of inner packets.

4.3.2.  IPv4 Reassembly

   The ETE may receive IPv4 fragments of a fragmented SEAL packet.  The
   receipt of a first IPv4 fragment of a fragmented SEAL packet (i.e.,
   one with MF=1 and Offset=0) that encapsulates an inner packet segment
   with R=1 in the SEAL header serves as indication to the ETE that
   excessive IPv4 fragmentation is occurring in the subnetwork.

   The ETE maintains a conservative high- and low-water mark for the
   number of outstanding reassemblies pending for each ITE.  When the
   size of the reassembly buffer exceeds this high-water mark, the ETE
   actively discards incomplete reassemblies (e.g., using an Active
   Queue Management (AQM) strategy such as drop-eldest, Random Early
   Drop (RED), etc.) until the size falls below the low-water mark.  The
   ETE otherwise performs IPv4 reassembly as normal.

   Note that in the limiting case the ETE may choose to discard all
   reassemblies for packets that set R=1 in the SEAL header and only
   perform reassembly for packets that set R=0 in the SEAL header.

   For each IPv4 first fragment that sets R=1 in the SEAL header, the
   ETE also sends a Fragmentation Report message (see Section 4.4) to
   the ITE to report the size of the largest fragment received, subject
   to rate limiting.

4.3.3.  Inner Packet Reassembly

   The ETE reassembles inner packets through simple in-order
   concatenation of the encapsulated segments from SEAL packets that
   contain the same ID value.
   That is, for all SEAL packets of an N-segment inner packet that
   include the same SEAL ID value, inner packet reassembly entails the
   concatenation of Segment 0 followed by Segment 1 followed by ...
   followed by Segment N-1.  This requires the ETE to maintain a cache
   of recently received SEAL packets for a hold time that would allow
   for reasonable inter-segment delays.

   Rather than set an absolute hold time, the ETE must actively discard
   any pending reassemblies that appear to have no opportunity for
   completion, e.g., when a considerable number of SEAL packets have
   been received before a packet that completes the pending reassembly
   has arrived.  This assumes that any packet reordering within the
   subnetwork will be on the order of a small number of positions and
   that any gross reordering will be short-lived in nature.

4.4.  Generating Fragmentation Reports

   When the ETE receives an IPv4 first fragment of a fragmented SEAL
   packet with (R=1; Next Header != 0) in the SEAL header, it prepares a
   Fragmentation Report (FRAGREP) message to send back over the VET
   interface to the ITE that sent the SEAL packet.  The FRAGREP message
   consists of an outer SEAL/*/IPv4 header with (R=0; Next Header=0) in
   the SEAL header.  The message body contains the first N bytes of the
   IPv4 first fragment, where ENCAPS <= N <= 128 bytes.

   The ETE sets the destination address of the FRAGREP to the source
   address that was included in the IPv4 first fragment, and sets the
   source address of the FRAGREP to the destination address that was
   included in the first fragment.  If the destination address in the
   first fragment was multicast, the ETE instead sets the source address
   of the FRAGREP to an address assigned to the outgoing interface.  The
   ETE sets DF=0 in the outer IPv4 header.

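   A minimal Python sketch of the FRAGREP construction described above
   is given below for illustration.  The Fragrep record, the
   build_fragrep function, and the local_addr parameter (standing in
   for "an address assigned to the outgoing interface") are hypothetical
   names introduced here.  The choice N = MIN(fragment length, 128) is
   one value that satisfies ENCAPS <= N <= 128 and is an assumption of
   this sketch rather than a requirement of this document.

```python
import ipaddress
from dataclasses import dataclass

MAX_BODY = 128  # upper bound on N, the number of bytes echoed back


@dataclass
class Fragrep:
    """Illustrative container for the FRAGREP fields named in the text."""
    src: str
    dst: str
    df: int
    r_bit: int
    next_header: int
    body: bytes


def build_fragrep(first_fragment: bytes, frag_src: str, frag_dst: str,
                  local_addr: str, encaps: int) -> Fragrep:
    """Build a FRAGREP for an IPv4 first fragment (hypothetical helper)."""
    if not 0 <= encaps <= MAX_BODY:
        raise ValueError("ENCAPS must be between 0 and 128 bytes")
    n = min(len(first_fragment), MAX_BODY)
    if n < encaps:
        raise ValueError("first fragment is shorter than ENCAPS")
    # Swap source and destination, except that a multicast destination
    # is replaced by an address of the outgoing interface.
    if ipaddress.ip_address(frag_dst).is_multicast:
        src = local_addr
    else:
        src = frag_dst
    return Fragrep(src=src, dst=frag_src, df=0, r_bit=0, next_header=0,
                   body=first_fragment[:n])
```

   With a unicast first fragment sent from 192.0.2.1 to 192.0.2.2, the
   sketch returns a report addressed from 192.0.2.2 to 192.0.2.1 whose
   body holds at most the leading 128 bytes of the fragment.
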
   The FRAGREP message has the following format:

                                         +-------------------------+
                                         |                         |
                                         ~   Outer */IPv4 headers  ~
                                         |                         |
                                         +-------------------------+
                                         |       SEAL Header       |
                                         |  (R=0; Next Header=0)   |
   +-------------------------+           +-------------------------+
   |                         |           |                         |
   ~   IPv4 first fragment   ~   --->    ~ Leading N bytes of IPv4 ~
   ~  (R=1; Next Header!=0)  ~   --->    ~     first fragment      ~
   |                         |           |                         |
   +-------------------------+           +-------------------------+

            Figure 3: Fragmentation Report (FRAGREP) Message

   The ETE additionally generates a FRAGREP in response to an ITE's
   explicit probe whether or not the probe was fragmented by IPv4
   fragmentation.  In particular, when the SEAL header in the first
   fragment of an (un)fragmented SEAL packet includes (M=1, R=1,
   Segment=16), the ETE generates a FRAGREP message exactly as specified
   above (see also Section 4.6).

4.5.  Receiving Fragmentation Reports

   When the ITE receives a potential FRAGREP message, it first verifies
   that the message was formatted correctly by the ETE per Section 4.4.
   Next, it confirms that the FRAGREP corresponds to one of the SEAL
   packets that it actually sent into the VET interface by examining the
   source, destination, IPv4 ID, SEAL ID, etc.  The ITE discards any
   invalid FRAGREP messages without further processing.

   Next, if the IPv4 length ('LEN') minus ENCAPS is 128 or larger, the
   ITE sets S-MSS to (LEN-ENCAPS).  Otherwise, the ITE performs S-MSS
   reduction by setting S-MSS = MAX(S-MSS/2, 128).  This limited halving
   procedure accounts for the possibility that the ETE received IPv4
   first fragments that were significantly smaller than the path MTU.
   In that case, convergence to an acceptable S-MSS size may require
   multiple iterations of sending SEAL packets and receiving FRAGREP
   messages, i.e., the same as for classical path MTU discovery
   [RFC1191].
   However, the limited halving procedure ensures that convergence will
   occur quickly even in extreme cases, while the correct MTU will be
   determined in a single iteration under normal circumstances in which
   routers produce large first fragments.

   Note that multiple FRAGREP messages may be received for SEAL packets
   that encapsulate segments of the same inner packet.  In that case,
   the ITE should set S-MSS to the minimum length reported in all
   FRAGREP messages.  If multiple FRAGREP messages report an MTU of 128
   bytes or smaller, however, the ITE should only halve the current
   S-MSS once, not multiple times.

4.6.  Probing for Larger S-MSS Values

   The ITE may periodically probe for larger S-MSS values (to a maximum
   of 2KB) by sending one or more large single-segment SEAL packets,
   i.e., by temporarily suspending S-MSS when preparing an inner packet.
   The ITE sets (R=1, M=1, Segment=16) in the SEAL header to indicate to
   the ETE that this is a single-segment probe.

   The ETE will return a FRAGREP message whether fragmentation is
   occurring or not, which the ITE will process exactly as for any
   FRAGREP per Section 4.5.

4.7.  Processing ICMP PTBs

   The ITE may receive ICMP PTB messages in response to any packets that
   were admitted into the VET interface with DF=1.  The ITE may
   optionally ignore, log, or honor the messages according to the
   subnetwork trust basis.  For example, ITEs connected to managed
   subnetworks may be configured to honor ICMP PTBs while ITEs connected
   to the global interdomain routing core may be configured to ignore/
   log them.

5.  Link Requirements

   Subnetwork designers are strongly encouraged to follow the
   recommendations in [RFC3819] when configuring link MTUs.

6.  End System Requirements

   SEAL is a router-to-router protocol and therefore makes no
   requirements for end systems.
   However, end systems that send unfragmentable IP packets of 1501
   bytes or larger are strongly encouraged to use Packetization Layer
   Path MTU Discovery per [RFC4821], since the network may not always be
   able to return useful ICMP PTB messages.

7.  IANA Considerations

   A new IP protocol number for the SEAL protocol is requested.

   A new IPv4 site-scoped ALL_MANET_ROUTERS multicast group is
   requested.

8.  Security Considerations

   Unlike IPv4 fragmentation, overlapping fragment attacks are not
   possible due to the requirement that SEAL segments be non-
   overlapping.

   The SEAL header is sent in-the-clear (outside of any IPsec/ESP
   encapsulations) the same as for the IPv4 header.  As for IPv6
   extension headers, the SEAL header is protected only by L2 integrity
   checks, and is not covered under any L3 integrity checks.

9.  Acknowledgments

   Path MTU determination through the report of fragmentation
   experienced by the final destination was first proposed by Charles
   Lynn of BBN on the TCP-IP mailing list in May 1987.  An historical
   analysis of the evolution of path MTU discovery appears in
   http://www.tools.ietf.org/html/draft-templin-v6v4-ndisc-01 and is
   reproduced in Appendix A of this document.

   This work was inspired in part by discussions on the IETF MANET and
   IRTF RRG mailing lists in the 12/07 - 01/08 timeframe, and the author
   acknowledges those who participated in the discussions.  The work
   also draws on the earlier investigations of [I-D.templin-inetmtu]
   which acknowledges many who contributed to the effort.

10.  References

10.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2460]  Deering, S. and R.
              Hinden, "Internet Protocol, Version 6 (IPv6)
              Specification", RFC 2460, December 1998.

10.2.  Informative References

   [FOLK]     Shannon, C., Moore, D., and k. claffy, "Beyond Folklore:
              Observations on Fragmented Traffic", December 2002.

   [FRAG]     Kent, C. and J. Mogul, "Fragmentation Considered Harmful",
              October 1987.

   [I-D.farinacci-lisp]
              Farinacci, D., "Locator/ID Separation Protocol (LISP)",
              draft-farinacci-lisp-05 (work in progress), November 2007.

   [I-D.ietf-autoconf-manetarch]
              Chakeres, I., Macker, J., and T. Clausen, "Mobile Ad hoc
              Network Architecture", draft-ietf-autoconf-manetarch-07
              (work in progress), November 2007.

   [I-D.ietf-manet-smf]
              Macker, J. and S. Team, "Simplified Multicast Forwarding
              for MANET", draft-ietf-manet-smf-06 (work in progress),
              November 2007.

   [I-D.templin-autoconf-dhcp]
              Templin, F., Russert, S., and S. Yi, "MANET
              Autoconfiguration", draft-templin-autoconf-dhcp-11 (work
              in progress), February 2008.

   [I-D.templin-inetmtu]
              Templin, F., "Simple Protocol for Robust IP/*/IP Tunnel
              Endpoint MTU Determination (sprite-mtu)",
              draft-templin-inetmtu-06 (work in progress),
              November 2007.

   [MTUDWG]   "IETF MTU Discovery Working Group mailing list,
              gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log, November
              1989 - February 1995.".

   [RFC1063]  Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP
              MTU discovery options", RFC 1063, July 1988.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2003]  Perkins, C., "IP Encapsulation within IP", RFC 2003,
              October 1996.

   [RFC2004]  Perkins, C., "Minimal Encapsulation within IP", RFC 2004,
              October 1996.

   [RFC2501]  Corson, M. and J.
Macker, "Mobile Ad hoc Networking
              (MANET): Routing Protocol Performance Issues and
              Evaluation Considerations", RFC 2501, January 1999.

   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
              RFC 2923, September 2000.

   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
              RFC 3819, July 2004.

   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
              for IPv6 Hosts and Routers", RFC 4213, October 2005.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
              RFC 4303, December 2005.

   [RFC4380]  Huitema, C., "Teredo: Tunneling IPv6 over UDP through
              Network Address Translations (NATs)", RFC 4380,
              February 2006.

   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
              Network Tunneling", RFC 4459, April 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, March 2007.

   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
              Errors at High Data Rates", RFC 4963, July 2007.

   [TCP-IP]   "TCP-IP mailing list archives",
              http://www-mice.cs.ucl.ac.uk/multimedia/mist/tcpip,
              May 1987 - May 1990.

Appendix A.  Historic Evolution of PMTUD (written 10/30/2003)

   The topic of Path MTU discovery (PMTUD) saw a flurry of discussion
   and numerous proposals in the late 1980's through early 1990. The
   initial problem was posed by Art Berggreen on May 22, 1987 in a
   message to the TCP-IP discussion group [TCP-IP]. The discussion that
   followed provided significant reference material for [FRAG]. An IETF
   Path MTU Discovery Working Group [MTUDWG] was formed in late 1989
   with a charter to produce an RFC.
Several variations on a very few
   basic proposals were entertained, including:

   1.  Routers record the PMTUD estimate in ICMP-like path probe
       messages (proposed in [FRAG] and later [RFC1063])

   2.  The destination reports any fragmentation that occurs for packets
       received with the "RF" (Report Fragmentation) bit set (Steve
       Deering's 1989 adaptation of Charles Lynn's Nov. 1987 proposal)

   3.  A hybrid combination of 1) and Charles Lynn's Nov. 1987 proposal
       (straw RFC draft by McCloghrie, Fox and Mogul on Jan 12, 1990)

   4.  Combination of the Lynn proposal with TCP (Fred Bohle, Jan 30,
       1990)

   5.  Fragmentation avoidance by setting "IP_DF" flag on all packets
       and retransmitting if ICMPv4 "fragmentation needed" messages
       occur (Geof Cooper's 1987 proposal; later adapted into [RFC1191]
       by Mogul and Deering).

   Option 1) seemed attractive to the group at the time, since it was
   believed that routers would migrate more quickly than hosts. Option
   2) was a strong contender, but repeated attempts to secure an "RF"
   bit in the IPv4 header from the IESG failed and the proponents became
   discouraged. Option 3) was abandoned because it was perceived as too
   complicated, and option 4) never received any apparent serious
   consideration. Proposal 5) was a late entry into the discussion from
   Steve Deering on Feb. 24th, 1990. The discussion group soon
   thereafter seemingly lost track of all other proposals and adopted
   5), which eventually evolved into [RFC1191] and later [RFC1981].

   In retrospect, the "RF" bit postulated in 2) is not needed if a
   "contract" is first established between the peers, as in proposal 4)
   and a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU on
   Feb. 19, 1990.
These proposals saw little discussion or rebuttal, and
   were dismissed based on the following assertions:

   o  routers upgrade their software faster than hosts

   o  PCs could not reassemble fragmented packets

   o  Proteon and Wellfleet routers did not reproduce the "RF" bit
      properly in fragmented packets

   o  Ethernet-FDDI bridges would need to perform fragmentation (i.e.,
      "translucent" not "transparent" bridging)

   o  the 16-bit IP_ID field could wrap around and disrupt reassembly at
      high packet arrival rates

   The first four assertions, although perhaps valid at the time, have
   been overcome by historical events, leaving only the final one to
   consider. But [FOLK] has shown that IP_ID wraparound simply does
   not occur within several orders of magnitude of the reassembly
   timeout window on high-bandwidth networks.

   (Author's 2/11/08 note: this final point was based on a loose
   interpretation of [FOLK], and is more accurately addressed in
   [RFC4963].)

Author's Address

   Fred L. Templin (editor)
   Boeing Phantom Works
   P.O. Box 3707
   Seattle, WA 98124
   USA

   Email: fltemplin@acm.org

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).