Network Working Group                                    F. Templin, Ed.
Internet-Draft                                      Boeing Phantom Works
Intended status: Informational                         February 12, 2008
Expires: August 15, 2008

             Subnetwork Encapsulation and Adaptation Layer
                       draft-templin-seal-01.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.
   Note that other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 15, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   Subnetworks connect routers within a bounded region, and may also
   connect to other networks including the Internet. These routers
   forward unicast and multicast packets over paths that span multiple
   IP- and/or sub-IP layer forwarding hops which may cross links with
   diverse Maximum Transmission Units (MTUs) and introduce packet
   duplication. This document specifies a Subnetwork Encapsulation and
   Adaptation Layer (SEAL) that supports simplified duplicate packet
   detection and accommodates links with diverse MTUs.

Table of Contents

   1.  Introduction
   2.  Terminology and Requirements
   3.  Applicability Statement
   4.  SEAL Protocol Specification
     4.1.  Model of Operation
     4.2.  Packetization
       4.2.1.  Packet Size Considerations
       4.2.2.  Inner IPv4 Fragmentation
       4.2.3.  SEAL Segmentation and Encapsulation
       4.2.4.  Setting DF and Sending Packets
     4.3.  Reassembly
       4.3.1.  Reassembly Buffer Requirements
       4.3.2.  IPv4-Layer Reassembly
       4.3.3.  SEAL-Layer Reassembly
     4.4.  Generating Fragmentation Reports
     4.5.  Receiving Fragmentation Reports
     4.6.  S-MSS Probing
     4.7.  Processing ICMP PTBs
   5.  Link Requirements
   6.  End System Requirements
   7.  Router Requirements
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgments
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Historic Evolution of PMTUD (written 10/30/2002)
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   Mobile Ad-hoc Networks (MANETs) and other subnetworks connect routers
   on links with asymmetric reachability characteristics, and may also
   connect to other networks including the Internet. These routers
   forward unicast and multicast packets over paths that span multiple
   IP- and/or sub-IP layer forwarding hops, which may traverse links
   with diverse Maximum Transmission Units (MTUs) and may also introduce
   packet duplication due to temporal or persistent routing loops. It
   is also expected that these routers will support operation of the
   Internet protocols [RFC0791][RFC2460].

   The use of IPv4 encapsulation has long been considered as an
   alternative for introducing a well-behaved identification field
   useful for duplicate packet detection, such as required for
   Simplified Multicast Forwarding [I-D.ietf-manet-smf]. However, the
   16-bit ID field in the outer IPv4 header supports only 2^16 distinct
   identification values and therefore does not provide sufficient space
   for robust duplicate packet detection over modern link technologies.

   Additionally, the insertion of an outer IPv4 header reduces the
   effective path MTU as seen by the IP layer. This reduced MTU can be
   accommodated through the use of IPv4 fragmentation, but unmitigated
   in-the-network fragmentation has been shown to be harmful through
   operational experience and studies conducted over the course of many
   years [FRAG][RFC2923][RFC4459][RFC4963].

   This document proposes a Subnetwork Encapsulation and Adaptation
   Layer (SEAL) for the operation of IP over subnetworks that connect
   Ingress and Egress Tunnel Endpoints (ITEs/ETEs). SEAL supports
   simple and robust duplicate packet detection, and accommodates links
   with diverse MTUs. SEAL additionally supports multiprotocol
   operation and provides extended quality of service for the protocols
   that use it. The SEAL protocol is specified in the following
   sections.

2.  Terminology and Requirements

   The terminology of [RFC3819][RFC2501][I-D.ietf-autoconf-manetarch] is
   used in this document.
   The following abbreviations correspond to terms used within this
   document and elsewhere in common Internetworking nomenclature:

      MANET - Mobile Ad-hoc Network

      Subnetwork - a MANET or other network that connects (and is
      bounded by) ITEs and ETEs

      SEAL - Subnetwork Encapsulation and Adaptation Layer

      VET - Virtual EThernet

      ITE - Ingress Tunnel Endpoint

      ETE - Egress Tunnel Endpoint

      MTU - Maximum Transmission Unit

      S-MSS - SEAL Maximum Segment Size

      EMTU_R - Effective MTU to Receive

      PTB - an ICMPv6 "Packet Too Big" or an ICMPv4 "fragmentation
      needed" message

      DF - the IPv4 header Don't Fragment flag

      ENCAPS - the size of the outer encapsulating SEAL/*/IPv4 headers

      FRAGREP - a Fragmentation Report message

      SEAL packet - a segment of an inner packet encapsulated in outer
      SEAL/*/IPv4 headers

      SEAL-ID - a 32-bit Identification value that is randomly
      initialized and monotonically incremented for each SEAL packet
      sent to an ETE

      Unfragmentable - an IPv4 packet with DF=1, or an IPv6 packet

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

3.  Applicability Statement

   SEAL inserts an additional mid-layer encapsulation when IP/*/IPv4
   encapsulation is used, and appears as a subnetwork encapsulation as
   seen by inner layers.

   While the SEAL approach was motivated by the specific use case of
   duplicate packet detection in MANETs, the domain of applicability is
   not limited to the MANET problem space and extends to other
   subnetwork uses such as tunneling across enterprise networks, the
   interdomain routing core, etc.

   For further study, SEAL may also be useful for "transport-mode"
   applications, e.g., when the inner packet encapsulates ordinary
   protocol data rather than an IP packet.

4.  SEAL Protocol Specification

4.1.  Model of Operation

   Ingress Tunnel Endpoints (ITEs) insert a SEAL header in the IP/*/
   IPv4-encapsulated packets they inject into a subnetwork, where the
   outermost IPv4 header contains the source and destination addresses
   of the ITE/ETE subnetwork entry/exit points, respectively. SEAL
   defines a new IP protocol type and a new mid-layer encapsulation for
   both unicast and multicast inner packets. The ITE inserts a SEAL
   header during encapsulation as shown in Figure 1:

                                         +-------------------------+
                                         |                         |
                                         ~   Outer */IPv4 headers  ~
                                         |                         |
                                         +-------------------------+
                                         +--     SEAL Header     --+
      +-------------------------+        +-------------------------+
      |                         |        |                         |
      ~ Any mid-layer * headers ~        ~ Any mid-layer * headers ~
      |                         |        |                         |
      +-------------------------+        +-------------------------+
      |                         |        |                         |
      ~         Inner IP        ~  --->  ~         Inner IP        ~
      ~          Packet         ~  --->  ~          Packet         ~
      |                         |        |                         |
      +-------------------------+        +-------------------------+
      | Any mid-layer trailers  |        | Any mid-layer trailers  |
      +-------------------------+        +-------------------------+
                                         |    Any outer trailers   |
                                         +-------------------------+

                       Figure 1: SEAL Encapsulation

   where the SEAL header is inserted as follows:

   o  For simple IP/IPv4 encapsulations (e.g.,
      [RFC2003][RFC2004][RFC4213]), the SEAL header is inserted between
      the inner IP and outer IPv4 headers as: IP/SEAL/IPv4.

   o  For tunnel-mode IPsec/ESP encapsulations over IPv4
      [RFC4301][RFC4303], the SEAL header is inserted between the ESP
      and outer IPv4 headers as: IP/*/ESP/SEAL/IPv4.

   o  For IP encapsulations over transports such as UDP (e.g.,
      [RFC4380][I-D.farinacci-lisp]), the SEAL header is embedded in any
      middle- and outer-'*' encapsulations within the transport layer,
      e.g., as IP/*/SEAL/*/UDP/IPv4.

   Encapsulation and tunneling establish a virtual point-to-multipoint
   interface abstraction of the subnetwork.
   From a logical viewpoint, this interface appears as a Virtual
   EThernet (VET) [I-D.templin-autoconf-dhcp] that connects the ITE to
   all ETEs in the subnetwork as single-hop neighbors. From a physical
   perspective, however, packets sent over the VET interface may be
   forwarded across many IPv4 and/or sub-IPv4 layer subnetwork hops.

   SEAL-encapsulated packets include a 32-bit SEAL-ID formed by
   concatenating the 16-bit ID Extension field in the SEAL header (the
   most-significant bits) with the 16-bit ID value in the outer IPv4
   header (the least-significant bits). Routers use the SEAL-ID both
   for duplicate packet detection within the subnetwork and for
   multi-level segmentation and reassembly of large packets.

   SEAL enables a multi-level segmentation and reassembly capability.
   First, the ITE can use inner IPv4 fragmentation for fragmentable
   inner IPv4 packets before encapsulation to avoid lower-level
   segmentation and reassembly. Second, the SEAL layer itself
   provides a simple mid-layer cutting-and-pasting of inner packets
   without incurring IPv4 fragmentation on the outer packet. Finally,
   ordinary IPv4 fragmentation for the outer IPv4 packet after SEAL
   encapsulation is also permitted under certain limited and carefully
   managed circumstances.

4.2.  Packetization

4.2.1.  Packet Size Considerations

   Due to the ubiquitous deployment of standard Ethernet and similar
   networking gear, the nominal Internet cell size has become 1500
   bytes; this is the de facto size that end systems have come to expect
   will be delivered by the network without loss due to an MTU
   restriction on the path, or a suitable ICMP PTB message returned.
   However, PTB messages are not delivered reliably, and any PTBs coming
   from within the subnetwork could be erroneous or maliciously
   fabricated.
   The ITE therefore requires a means for conveying 1500 byte (or
   smaller) original packets over the VET interface without loss due to
   link MTU restrictions and/or triggering PTB messages from within the
   subnetwork.

   In common deployments, there may be many forwarding hops between the
   source and the ITE. Within those hops, there may be additional
   encapsulations (IPsec, L2TP, etc.) such that a 1500 byte original
   packet might grow to a larger size by the time it reaches the ITE.
   In order to preserve the end system expectation of delivery for 1500
   byte and smaller packets, the ITE therefore requires a means for
   conveying this larger packet over the VET interface even though there
   may be subnetwork links that configure a smaller MTU.

   The ITE upholds the 1500-byte-and-smaller packet delivery expectation
   by instituting a SEAL Maximum Segment Size (S-MSS) variable, set to
   1KB by default and configurable within the range of [128 - 2KB]. The
   ITE also institutes a [S-MSS - 2KB] segmentation region such that all
   inner packets within this size range are segmented into multiple SEAL
   packets. For 1500 byte and smaller inner packets/fragments, the 2KB
   upper bound allows for ~500 bytes of additional subnetwork
   encapsulation overhead on the path from the original source to the
   ITE. Similarly, the default 1KB lower bound allows ~500 bytes of
   additional encapsulation on the path between the ITE and ETE to
   accommodate each SEAL packet while avoiding IPv4 fragmentation along
   most paths within subnetworks that deploy 1500 byte links.

   The ITE additionally admits all inner packets larger than 2KB into
   the VET interface as single-segment SEAL packets under the assumption
   that original sources that send packets larger than 1500 bytes are
   using an end-to-end MTU determination capability such as specified in
   [RFC4821].

4.2.2.  Inner IPv4 Fragmentation

   The IP layer fragments inner IPv4 packets larger than 2KB and with
   the IPv4 Don't Fragment (DF) bit set to 0 into IPv4 fragments no
   larger than 2KB before any mid-layer '*' encapsulations. The IP
   layer then submits each inner IPv4 fragment to the ITE as an
   independent IP packet for encapsulation. Note that inner
   fragmentation may not be available for certain ITE types, e.g., for
   tunnel-mode IPsec.

   Any inner IPv4 fragments created in this fashion will be reassembled
   by the final destination.

4.2.3.  SEAL Segmentation and Encapsulation

   After inner IPv4 fragmentation, the ITE encapsulates the IPv4 packet/
   fragment in any mid-layer '*' headers, then performs segmentation on
   this inner packet based on a segment size that is likely to avoid
   IPv4 fragmentation within the subnetwork. The ITE maintains a SEAL
   Maximum Segment Size (S-MSS) variable for each ETE as per-ETE IPv4
   destination cache soft state, including IPv4 multicast destinations.
   S-MSS SHOULD be initialized to 1KB by default, and MAY be changed to
   different values in the range [128, 2KB] based on static
   configuration and/or dynamic segment size probing.

   The ITE MUST NOT break unfragmentable inner packets larger than 2KB
   into smaller segments, but rather MUST encapsulate them as a single
   segment SEAL packet. The ITE breaks inner packets no larger than 2KB
   into N segments (N <= 16) that are no larger than S-MSS bytes each,
   i.e., even if the inner packet is unfragmentable. Each segment
   except the final one MUST be of equal length, while the final segment
   MAY be of different length. The first byte of each segment MUST
   begin immediately after the final byte of the previous segment, i.e.,
   the segments MUST NOT overlap.
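
   The segmentation rule above can be illustrated with a minimal Python
   sketch. It is not part of the specification: the function name
   seal_segment and the constant names are hypothetical, and the choice
   to cut every non-final segment at exactly S-MSS bytes is only one way
   to satisfy the equal-length requirement. Inner IPv4 fragmentation
   (Section 4.2.2) is assumed to have been applied already, so any input
   larger than 2KB is treated as unfragmentable.

```python
from typing import List

S_MSS_DEFAULT = 1024   # default S-MSS (1KB)
SEG_LIMIT = 2048       # 2KB upper bound of the segmentation region
MAX_SEGMENTS = 16      # N <= 16


def seal_segment(inner: bytes, s_mss: int = S_MSS_DEFAULT) -> List[bytes]:
    """Split an inner packet into contiguous, non-overlapping segments."""
    if not 128 <= s_mss <= SEG_LIMIT:
        raise ValueError("S-MSS must lie in the range [128, 2KB]")
    if len(inner) > SEG_LIMIT:
        # Larger than 2KB: carried whole as a single-segment SEAL packet.
        return [inner]
    # Assumption: non-final segments are exactly S-MSS bytes long; the
    # final segment carries the remainder.
    segments = [inner[i:i + s_mss] for i in range(0, len(inner), s_mss)]
    if not segments:
        segments = [inner]          # degenerate empty packet
    assert len(segments) <= MAX_SEGMENTS
    return segments
```

   With the default S-MSS, a 1500 byte inner packet yields two segments
   of 1024 and 476 bytes, and a 2KB packet with S-MSS=128 yields the
   maximum of 16 segments.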

   The ITE encapsulates each segment in a SEAL header in either minimal
   or extended format according to the following figures:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          ID Extension         |R|M|CTL|Segment| Next Header A |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 2: Minimal SEAL Header Format

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          ID Extension         |R|M|CTL|Segment|       0       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | RSVD  |              Flow Label               | Next Header B |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 3: Extended SEAL Header Format

   where the header fields are defined as follows:

   ID Extension (16)
      a 16-bit extension of the 16-bit ID field in the outer IPv4
      header; encodes the most-significant 16 bits of a 32-bit SEAL-ID
      value.

   R (1)
      Reserved, and must be zero.

   M (1)
      the "More Segments" bit. Set to 1 if this SEAL packet contains a
      non-final segment of a multi-segment inner packet.

   CTL (2)
      a 2-bit "Control" field that identifies the type of SEAL packet as
      follows:

      '00' - an ordinary SEAL packet.

      '01' - a Fragmentation Report (FRAGREP).

      '10' - an implicit probe.

      '11' - an explicit probe.

   Segment (4)
      a 4-bit Segment number. Encodes a segment number between 0 - 15.

   Next Header A (8)
      an 8-bit field that encodes either an IP protocol number the same
      as for the IPv4 protocol and IPv6 next header fields, or the value
      zero. When Next Header A is non-zero, the SEAL header is in
      minimal format; otherwise, the SEAL header is in extended format.

   RSVD (4)
      a 4-bit Reserved field, present only in extended format. Must be
      zero.

   Flow Label (20)
      a 20-bit flow label field, present only in extended format.
      Contains a 20-bit value corresponding to the inner packet during
      SEAL encapsulation.

   Next Header B (8)
      an 8-bit field that encodes an IP protocol number the same as for
      the IPv4 protocol and IPv6 next header fields.

   For N-segment inner packets (N <= 16), the ITE selects a SEAL header
   format (minimal or extended) and encapsulates each segment in a
   header of the same format with (M=1; Segment=0) for the first
   segment, (M=1; Segment=1) for the second segment, etc., with the
   final segment setting (M=0; Segment=N-1). Note that single-segment
   inner packets instead set (M=0; Segment=0).

   During encapsulation, the ITE also sets CTL='00' in the SEAL header
   of each segment if this segment is not to be used as an explicit or
   implicit probe. Otherwise, the ITE sets CTL='10' or '11' according
   to the type of probe desired (see: Section 4.6).

   The ITE next writes either the IP protocol number corresponding to
   the inner packet (minimal format) or the value zero (extended format)
   in 'Next Header A' in the SEAL header of each segment. When extended
   format is used, the ITE also writes a 20-bit flow label value
   corresponding to the inner packet into the Flow Label field and
   writes the IP protocol number corresponding to the inner packet in
   'Next Header B'. The ITE then encapsulates the segment in the
   requisite */IPv4 outer headers.

   The ITE maintains a 32-bit SEAL-ID value as per-ETE soft state in the
   IPv4 destination cache. The value is randomly initialized when the
   soft state is created and monotonically incremented (modulo 2^32) for
   each successive SEAL packet sent to the ETE. For each SEAL packet,
   the ITE writes the least-significant 16 bits of the SEAL-ID value in
   the ID field in the outer IPv4 header, and writes the most-
   significant 16 bits in the ID Extension field in the SEAL header.
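
   As an illustration of the header layout in Figure 2 and the SEAL-ID
   split described above, the following Python sketch packs minimal-
   format headers for the segments of one inner packet. The function
   names are hypothetical, the bit positions are read directly off
   Figure 2 (R in the high-order bit of the third octet), and only the
   minimal format is covered; it is a sketch under those assumptions
   rather than a reference encoder.

```python
import struct
from typing import List, Tuple


def split_seal_id(seal_id: int) -> Tuple[int, int]:
    """Return (ID Extension, outer IPv4 ID) for a 32-bit SEAL-ID."""
    seal_id &= 0xFFFFFFFF                 # SEAL-ID arithmetic is modulo 2^32
    return seal_id >> 16, seal_id & 0xFFFF


def pack_minimal_header(id_ext: int, more: int, ctl: int,
                        segment: int, next_header: int) -> bytes:
    """Pack the 4-octet minimal-format SEAL header; R is left as zero."""
    if next_header == 0:
        raise ValueError("Next Header A = 0 selects the extended format")
    # Third octet: R(1) | M(1) | CTL(2) | Segment(4)
    flags = ((more & 0x1) << 6) | ((ctl & 0x3) << 4) | (segment & 0xF)
    return struct.pack("!HBB", id_ext, flags, next_header)


def headers_for_segments(first_seal_id: int, n: int, next_header: int,
                         ctl: int = 0) -> List[Tuple[int, bytes]]:
    """Return (outer IPv4 ID, SEAL header) for each of n segments."""
    out = []
    for seg in range(n):
        id_ext, ipv4_id = split_seal_id(first_seal_id + seg)
        more = 1 if seg < n - 1 else 0    # M=0 only on the final segment
        out.append((ipv4_id,
                    pack_minimal_header(id_ext, more, ctl, seg, next_header)))
    return out
```

   For example, headers_for_segments(0x0001FFFF, 2, 4) shows the carry
   from the outer IPv4 ID into the ID Extension field: the first packet
   uses ID 0xFFFF with ID Extension 1, and the second uses ID 0 with ID
   Extension 2. The value 4 (IPv4-in-IP) is used here only as an example
   protocol number.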

   The ITE finally sets other fields in the outer */IPv4 headers
   according to the specific encapsulation format (e.g., [RFC2003],
   [RFC4213], etc.).

4.2.4.  Setting DF and Sending Packets

   For inner packets larger than 2KB, the ITE determines whether the
   size of the packet plus the size of the SEAL/*/IPv4 encapsulation
   headers is larger than the MTU of the underlying interface over which
   the tunnel is configured. If the packet is too large, the ITE
   discards it and sends an ICMP PTB message back to the original source
   with an MTU value taken from the underlying interface minus the size
   of the encapsulation headers. Otherwise, the ITE sets the Don't
   Fragment (DF) bit in the outer IPv4 header to DF=1.

   For inner packets that were no larger than 2KB before segmentation,
   the ITE sets DF=0 or DF=1 in the outer IPv4 header of each SEAL
   packet according to the desired behavior as follows:

   o  if the ITE is probing the path to the ETE, it MUST set DF=0 to
      allow the ETE to sense and report fragmentation.

   o  if S-MSS=128, the ITE MUST set DF=0 in case any unavoidable in-
      the-network IPv4 fragmentation is required.

   o  if the ITE has recently probed the path to the ETE, it MAY set
      DF=1 in subsequent SEAL packets until the next probing cycle.

   After setting the DF bits, the ITE SHOULD send all SEAL packets that
   encapsulate segments of the same inner packet into the VET interface
   in canonical order, i.e., Segment 0 first, then Segment 1, etc.

4.3.  Reassembly

4.3.1.  Reassembly Buffer Requirements

   ETEs MUST be capable of using IPv4-layer reassembly to reassemble
   SEAL packets of at least (2KB+ENCAPS) bytes, i.e., ETEs MUST
   configure an IPv4 Effective MTU to Receive (EMTU_R) of at least (2KB+
   ENCAPS).

   ETEs MUST also be capable of using SEAL-layer reassembly to
   reassemble inner packets of at least 2KB, i.e., ETEs MUST configure a
   SEAL EMTU_R of at least 2KB.
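
   Returning briefly to the sending side, the DF and admission rules of
   Section 4.2.4 can be summarized in a short Python sketch. The
   function and parameter names are hypothetical. Two choices are
   assumptions of the sketch rather than requirements of the text: the
   MAY in the third rule is always exercised, and DF=0 is used when none
   of the listed rules applies.

```python
from typing import Optional

SEG_LIMIT = 2048   # 2KB
S_MSS_MIN = 128


def ptb_mtu_for_large_packet(inner_len: int, encaps: int,
                             link_mtu: int) -> Optional[int]:
    """For inner packets > 2KB: MTU to report in a PTB, or None if it fits."""
    if inner_len + encaps > link_mtu:
        return link_mtu - encaps      # discard and report this MTU
    return None                       # admit as a single segment with DF=1


def outer_df(inner_len: int, s_mss: int,
             probing: bool, recently_probed: bool) -> int:
    """Return the DF bit for the outer IPv4 header of a SEAL packet."""
    if inner_len > SEG_LIMIT:
        return 1                      # single-segment packet, DF=1
    if probing or s_mss == S_MSS_MIN:
        return 0                      # MUST allow fragmentation
    if recently_probed:
        return 1                      # MAY; the sketch always takes it
    return 0                          # assumption: default when no rule applies
```

   For instance, a 9000 byte inner packet with 24 bytes of encapsulation
   over a 1500 byte link would be dropped with a PTB reporting an MTU of
   1476.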

4.3.2.  IPv4-Layer Reassembly

   The ETE maintains a conservative high- and low-water mark for the
   number of outstanding reassemblies pending for each ITE. When the
   size of the reassembly buffer exceeds this high-water mark, the ETE
   actively discards incomplete reassemblies (e.g., using an Active
   Queue Management (AQM) strategy such as drop-eldest, Random Early
   Drop (RED), etc.) until the size falls below the low-water mark. The
   ETE otherwise performs IPv4 reassembly as normal.

   Note that in the limiting case the ETE may choose to discard all
   reassemblies for packets that set CTL='1X' in the SEAL header and
   only perform reassembly for packets that set CTL='0X' in the SEAL
   header (see: Section 4.4).

4.3.3.  SEAL-Layer Reassembly

   After IPv4-layer reassembly, the ETE performs SEAL-layer reassembly
   for N-segment inner packets through simple in-order concatenation of
   the encapsulated segments from N consecutive SEAL packets. These
   packets contain Segment numbers 0 through N-1 and carry consecutive
   SEAL-ID values encoded in the 32-bit concatenation of the ID
   Extension field in the SEAL header and the ID field in the IPv4
   header. That is, for an N-segment inner packet, inner packet
   reassembly entails the concatenation of the segments from SEAL
   packets with (Segment 0, SEAL-ID i), followed by (Segment 1, SEAL-ID
   ((i + 1) mod 2^32)), etc. up to (Segment N-1, SEAL-ID ((i + N-1) mod
   2^32)). This requires the ETE to maintain a cache of recently
   received SEAL packets for a hold time that would allow for reasonable
   inter-segment delays.

   Rather than set an absolute hold time, the ETE must actively discard
   any pending reassemblies that appear to have no opportunity for
   completion, e.g., when a considerable number of SEAL packets have
   been received before a packet that completes the pending reassembly
   has arrived.
   This assumes that any packet reordering within the subnetwork will be
   on the order of a small number of positions and that any gross
   reordering will be short-lived in nature.

4.4.  Generating Fragmentation Reports

   When the ETE receives an IPv4 first fragment of a fragmented SEAL
   packet with (MF=1; Offset=0) in the IPv4 header and CTL='1X' in the
   SEAL header, it generates a Fragmentation Report (FRAGREP) message to
   send back over the VET interface to the original source. The ETE
   also generates a FRAGREP for an IPv4 first fragment of a SEAL packet
   with (MF=X; Offset=0) in the IPv4 header and CTL='11' in the SEAL
   header (see: Section 4.6).

   The FRAGREP message consists of an outer SEAL/*/IPv4 header followed
   by the first 68 bytes of the IPv4 first fragment. The ETE sets
   CTL='01' and Segment=0 in the SEAL header and sets the fields of the
   IPv4 header according to the specific encapsulation type. In
   particular, the ETE sets the destination address of the FRAGREP to
   the source address that was included in the IPv4 first fragment, and
   sets the source address of the FRAGREP to the destination address
   that was included in the IPv4 first fragment. If the destination
   address in the first fragment was multicast, the ETE instead sets the
   source address of the FRAGREP to an address assigned to the
   underlying IPv4 interface.

   The FRAGREP message has the following format:

                                         +-------------------------+
                                         |                         |
                                         ~   Outer */IPv4 headers  ~
                                         |                         |
                                         +-------------------------+
                                         |       SEAL Header       |
                                         |  (CTL='01'; Segment=0)  |
      +-------------------------+        +-------------------------+
      |                         |        |                         |
      ~   IPv4 first fragment   ~  --->  ~ First 68 bytes of IPv4  ~
      ~  (R=1; Next Header!=0)  ~  --->  ~     first fragment      ~
      |                         |        |                         |
      +-------------------------+        +-------------------------+

            Figure 4: Fragmentation Report (FRAGREP) Message

4.5.  Receiving Fragmentation Reports

   When the ITE receives a potential FRAGREP message, it first verifies
   that the message was formatted correctly by the ETE (per Section 4.4)
   and confirms that the FRAGREP corresponds to one of the SEAL packets
   that it actually sent to the ETE by examining the encapsulated 68
   byte portion of the IPv4 first fragment.

   For a valid FRAGREP, if the length of the encapsulated IPv4 fragment
   ('LEN') minus ENCAPS is significantly larger than 128, the ITE sets
   S-MSS for this ETE to (LEN-ENCAPS); otherwise, it sets
   S-MSS = MAX(S-MSS/2, 128), i.e., it halves S-MSS but never reduces it
   below the 128 byte minimum. This limited halving procedure accounts
   for the possibility that the ETE received IPv4 first fragments that
   were significantly smaller than the path MTU. In that case,
   convergence to an acceptable S-MSS size may require multiple
   iterations of sending SEAL packets and receiving FRAGREP messages in
   a manner that parallels classical path MTU discovery [RFC1191].
   However, the limited halving procedure ensures that convergence will
   occur quickly even in extreme cases, while the correct MTU will be
   determined in a single iteration under normal circumstances in which
   routers produce large first fragments.

   Note that multiple FRAGREP messages may be received for SEAL packets
   that encapsulate segments of the same inner packet. In that case,
   the ITE should set S-MSS to the minimum length reported in all
   FRAGREP messages. If multiple FRAGREP messages corresponding to the
   same inner packet report small MTUs, however, the ITE should halve
   the current S-MSS only once, not multiple times.

4.6.  S-MSS Probing

   When S-MSS is larger than 128, the ITE MUST probe the path to the ETE
   periodically to detect and dampen any in-the-network IPv4
   fragmentation. The ITE performs implicit probing of the path by
   setting CTL='10' in the SEAL header and DF=0 in the IPv4 header of
   all SEAL packets of the same inner packet(s).
If any in-the-network 592 fragmentation occurs, the ITE will receive authentic FRAGREP messages 593 from the ETE. 595 The ITE can also send explicit probes to periodically probe for 596 larger S-MSS values (to a maximum of 2KB) by sending single-segment 597 SEAL packets with CTL='11' in the SEAL header and DF=0 in the IPv4 598 header. The ETE will return a FRAGREP message whether or not any in- 599 the-network fragmentation occurs, which the ITE will process exactly 600 as for any FRAGREP per Section 4.5. The ITE MAY pad the length of 601 SEAL packets used for explicit probing (to a maximum size of 2KB+ 602 ENCAPS) if permitted by the specific */IPv4 encapsulation method. 604 The ITE can optionally send intervening SEAL packets between probing 605 intervals as passive probes by setting DF=0, or as non-probes by 606 setting DF=1. 608 When S-MSS=128, the ITE MUST set CTL='00' in the SEAL header of each 609 SEAL packet that is not being used as an explicit probe such that the 610 ETE will not generate FRAGREPs for unavoidable in-the-network 611 fragmentation. 613 4.7. Processing ICMP PTBs 615 The ITE may receive ICMP PTB messages in response to any packets that 616 were admitted into the VET interface with DF=1. The ITE may 617 optionally ignore, log, or honor the messages according to the 618 subnetwork trust basis. For example, ITEs connected to managed 619 subnetworks may be configured to honor ICMP PTBs while ITEs connected 620 to the global interdomain routing core may be configured to ignore/ 621 log them. 623 When ICMP PTBs are honored, the ITE: 625 o SHOULD send translated ICMP PTB messages back to the original 626 source for ICMP PTBs that correspond to SEAL packets that 627 encapsulate a segment larger than 2KB. 629 o SHOULD treat ICMP PTBs that correspond to SEAL packets that 630 encapsulate segments no larger than 2KB as an indication to resume 631 probing. 633 5. 
Link Requirements

   Subnetwork designers are strongly encouraged to follow the
   recommendations in [RFC3819] when configuring link MTUs.

6.  End System Requirements

   SEAL is a router-to-router protocol and therefore places no
   requirements on end systems.  However, end systems that send
   unfragmentable IP packets of 1501 bytes or larger are strongly
   encouraged to use Packetization Layer Path MTU Discovery per
   [RFC4821], since the network may not always be able to return useful
   ICMP PTB messages.

7.  Router Requirements

   IPv4 routers observe the requirements in [RFC1812].  However, when a
   router fragments an IPv4 datagram that may contain an encapsulated
   SEAL packet, the fragmentation MUST cause the first fragment to be at
   least 68 bytes (i.e., the minimum IPv4 MTU [RFC0791]).

   This document therefore updates [RFC1812].

8.  IANA Considerations

   A new IP protocol number for the SEAL protocol is requested.

   A new IPv4 site-scoped ALL_MANET_ROUTERS multicast group is
   requested.

9.  Security Considerations

   Unlike with IPv4 fragmentation, overlapping fragment attacks are not
   possible with SEAL because SEAL segments are required to be
   non-overlapping.

   An amplification/reflection attack is possible when an attacker sends
   spoofed IPv4 fragments to an ETE with CTL='1X' in the SEAL header,
   resulting in a stream of FRAGREP messages returned to a victim ITE.
   The 68-byte portion of the spoofed IPv4 fragment that is encapsulated
   in each FRAGREP allows the ITE to detect and discard spurious
   FRAGREPs.

   The SEAL header is sent in the clear (outside of any IPsec/ESP
   encapsulations), as is the IPv4 header.  As with IPv6 extension
   headers, the SEAL header is protected only by L2 integrity checks,
   and is not covered under any L3 integrity checks.

10.
Acknowledgments

   Path MTU determination through the report of fragmentation
   experienced by the final destination was first proposed by Charles
   Lynn of BBN on the TCP-IP mailing list in May 1987.  A historical
   analysis of the evolution of path MTU discovery appears in
   http://www.tools.ietf.org/html/draft-templin-v6v4-ndisc-01 and is
   reproduced in Appendix A of this document.

   This work was inspired in part by discussions on the IETF MANET and
   IRTF RRG mailing lists in the 12/07 - 01/08 timeframe, and the author
   acknowledges those who participated in the discussions.  The work
   also draws on the earlier investigations of [I-D.templin-inetmtu],
   which acknowledges many who contributed to the effort.

   The extended SEAL header format was inspired by recent discussions.

11.  References

11.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC1812]  Baker, F., "Requirements for IP Version 4 Routers",
              RFC 1812, June 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

11.2.  Informative References

   [FOLK]     Shannon, C., Moore, D., and k. claffy, "Beyond Folklore:
              Observations on Fragmented Traffic", December 2002.

   [FRAG]     Kent, C. and J. Mogul, "Fragmentation Considered Harmful",
              October 1987.

   [I-D.farinacci-lisp]
              Farinacci, D., "Locator/ID Separation Protocol (LISP)",
              draft-farinacci-lisp-05 (work in progress), November 2007.

   [I-D.ietf-autoconf-manetarch]
              Chakeres, I., Macker, J., and T. Clausen, "Mobile Ad hoc
              Network Architecture", draft-ietf-autoconf-manetarch-07
              (work in progress), November 2007.

   [I-D.ietf-manet-smf]
              Macker, J. and S.
              Team, "Simplified Multicast Forwarding
              for MANET", draft-ietf-manet-smf-06 (work in progress),
              November 2007.

   [I-D.templin-autoconf-dhcp]
              Templin, F., Russert, S., and S. Yi, "MANET
              Autoconfiguration", draft-templin-autoconf-dhcp-11 (work
              in progress), February 2008.

   [I-D.templin-inetmtu]
              Templin, F., "Simple Protocol for Robust IP/*/IP Tunnel
              Endpoint MTU Determination (sprite-mtu)",
              draft-templin-inetmtu-06 (work in progress),
              November 2007.

   [MTUDWG]   "IETF MTU Discovery Working Group mailing list,
              gatekeeper.dec.com/pub/DEC/WRL/mogul/mtudwg-log, November
              1989 - February 1995.".

   [RFC1063]  Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP
              MTU discovery options", RFC 1063, July 1988.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2003]  Perkins, C., "IP Encapsulation within IP", RFC 2003,
              October 1996.

   [RFC2004]  Perkins, C., "Minimal Encapsulation within IP", RFC 2004,
              October 1996.

   [RFC2501]  Corson, M. and J. Macker, "Mobile Ad hoc Networking
              (MANET): Routing Protocol Performance Issues and
              Evaluation Considerations", RFC 2501, January 1999.

   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
              RFC 2923, September 2000.

   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
              RFC 3819, July 2004.

   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
              for IPv6 Hosts and Routers", RFC 4213, October 2005.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
              RFC 4303, December 2005.

   [RFC4380]  Huitema, C., "Teredo: Tunneling IPv6 over UDP through
              Network Address Translations (NATs)", RFC 4380,
              February 2006.

   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
              Network Tunneling", RFC 4459, April 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, March 2007.

   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
              Errors at High Data Rates", RFC 4963, July 2007.

   [TCP-IP]   "TCP-IP mailing list archives,
              http://www-mice.cs.ucl.ac.uk/multimedia/mist/tcpip, May
              1987 - May 1990.".

Appendix A.  Historic Evolution of PMTUD (written 10/30/2002)

   The topic of Path MTU discovery (PMTUD) saw a flurry of discussion
   and numerous proposals in the late 1980s through early 1990.  The
   initial problem was posed by Art Berggreen on May 22, 1987 in a
   message to the TCP-IP discussion group [TCP-IP].  The discussion that
   followed provided significant reference material for [FRAG].  An IETF
   Path MTU Discovery Working Group [MTUDWG] was formed in late 1989
   with a charter to produce an RFC.  Several variations on a very few
   basic proposals were entertained, including:

   1.  Routers record the PMTUD estimate in ICMP-like path probe
       messages (proposed in [FRAG] and later [RFC1063])

   2.  The destination reports any fragmentation that occurs for packets
       received with the "RF" (Report Fragmentation) bit set (Steve
       Deering's 1989 adaptation of Charles Lynn's Nov. 1987 proposal)

   3.  A hybrid combination of 1) and Charles Lynn's Nov. 1987 proposal
       (straw RFC draft by McCloghrie, Fox and Mogul on Jan 12, 1990)

   4.  Combination of the Lynn proposal with TCP (Fred Bohle, Jan 30,
       1990)

   5.  Fragmentation avoidance by setting "IP_DF" flag on all packets
       and retransmitting if ICMPv4 "fragmentation needed" messages
       occur (Geof Cooper's 1987 proposal; later adapted into [RFC1191]
       by Mogul and Deering).

   Option 1) seemed attractive to the group at the time, since it was
   believed that routers would migrate more quickly than hosts.  Option
   2) was a strong contender, but repeated attempts to secure an "RF"
   bit in the IPv4 header from the IESG failed and the proponents became
   discouraged.  Option 3) was abandoned because it was perceived as too
   complicated, and option 4) apparently never received serious
   consideration.  Proposal 5) was a late entry into the discussion from
   Steve Deering on Feb. 24th, 1990.  The discussion group soon
   thereafter seemingly lost track of all other proposals and adopted
   5), which eventually evolved into [RFC1191] and later [RFC1981].

   In retrospect, the "RF" bit postulated in 2) is not needed if a
   "contract" is first established between the peers, as in proposal 4)
   and in a message to the MTUDWG mailing list from jrd@PTT.LCS.MIT.EDU
   on Feb 19, 1990.  These proposals saw little discussion or rebuttal,
   and were dismissed based on the following assertions:

   o  routers upgrade their software faster than hosts

   o  PCs could not reassemble fragmented packets

   o  Proteon and Wellfleet routers did not reproduce the "RF" bit
      properly in fragmented packets

   o  Ethernet-FDDI bridges would need to perform fragmentation (i.e.,
      "translucent" not "transparent" bridging)

   o  the 16-bit IP_ID field could wrap around and disrupt reassembly at
      high packet arrival rates

   The first four assertions, although perhaps valid at the time, have
   been overcome by historical events, leaving only the final one to
   consider.  But, [FOLK] has shown that IP_ID wraparound simply does
   not occur within several orders of magnitude of the reassembly
   timeout window on high-bandwidth networks.

   (Author's 2/11/08 note: this final point was based on a loose
   interpretation of [FOLK], and is more accurately addressed in
   [RFC4963].)

Author's Address

   Fred L.
   Templin (editor)
   Boeing Phantom Works
   P.O. Box 3707
   Seattle, WA  98124
   USA

   Email: fltemplin@acm.org

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).