| < draft-templin-intarea-parcels-09.txt | draft-templin-intarea-parcels-10.txt > | |||
|---|---|---|---|---|
| Network Working Group F. L. Templin, Ed. | Network Working Group F. L. Templin, Ed. | |||
| Internet-Draft Boeing Research & Technology | Internet-Draft Boeing Research & Technology | |||
| Updates: RFC2675 (if approved) 10 February 2022 | Updates: RFC2675 (if approved) 29 March 2022 | |||
| Intended status: Standards Track | Intended status: Standards Track | |||
| Expires: 14 August 2022 | Expires: 30 September 2022 | |||
| IP Parcels | IP Parcels | |||
| draft-templin-intarea-parcels-09 | draft-templin-intarea-parcels-10 | |||
| Abstract | Abstract | |||
| IP packets (both IPv4 and IPv6) are understood to contain a unit of | IP packets (both IPv4 and IPv6) are understood to contain a unit of | |||
| data which becomes the retransmission unit in case of loss. Upper | data which becomes the retransmission unit in case of loss. Upper | |||
| layer protocols such as the Transmission Control Protocol (TCP) | layer protocols such as the Transmission Control Protocol (TCP) | |||
| prepare data units known as "segments", with traditional arrangements | prepare data units known as "segments", with traditional arrangements | |||
| including a single segment per packet. This document presents a new | including a single segment per packet. This document presents a new | |||
| construct known as the "IP Parcel" which permits a single packet to | construct known as the "IP Parcel" which permits a single packet to | |||
| carry multiple segments, essentially creating a "packet-of-packets". | carry multiple segments, essentially creating a "packet-of-packets". | |||
| skipping to change at page 1, line 40 | skipping to change at page 1, line 40 | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on 14 August 2022. | This Internet-Draft will expire on 30 September 2022. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2022 IETF Trust and the persons identified as the | Copyright (c) 2022 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
| license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
| skipping to change at page 2, line 19 | skipping to change at page 2, line 19 | |||
| provided without warranty as described in the Revised BSD License. | provided without warranty as described in the Revised BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 3. Background and Motivation . . . . . . . . . . . . . . . . . . 4 | 3. Background and Motivation . . . . . . . . . . . . . . . . . . 4 | |||
| 4. IP Parcel Formation . . . . . . . . . . . . . . . . . . . . . 5 | 4. IP Parcel Formation . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 5. Transmission of IP Parcels . . . . . . . . . . . . . . . . . 8 | 5. Transmission of IP Parcels . . . . . . . . . . . . . . . . . 8 | |||
| 6. Parcel Path Qualification . . . . . . . . . . . . . . . . . . 10 | 6. Parcel Path Qualification . . . . . . . . . . . . . . . . . . 10 | |||
| 7. Integrity . . . . . . . . . . . . . . . . . . . . . . . . . . 13 | 7. Integrity . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 8. RFC2675 Updates . . . . . . . . . . . . . . . . . . . . . . . 14 | 8. RFC2675 Updates . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 9. IPv4 Jumbograms . . . . . . . . . . . . . . . . . . . . . . . 14 | 9. IPv4 Jumbograms . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 10. Implementation Status . . . . . . . . . . . . . . . . . . . . 14 | 10. Implementation Status . . . . . . . . . . . . . . . . . . . . 15 | |||
| 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 | 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 12. Security Considerations . . . . . . . . . . . . . . . . . . . 15 | 12. Security Considerations . . . . . . . . . . . . . . . . . . . 15 | |||
| 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 15 | 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 | 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
| 14.1. Normative References . . . . . . . . . . . . . . . . . . 15 | 14.1. Normative References . . . . . . . . . . . . . . . . . . 16 | |||
| 14.2. Informative References . . . . . . . . . . . . . . . . . 16 | 14.2. Informative References . . . . . . . . . . . . . . . . . 16 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 18 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 1. Introduction | 1. Introduction | |||
| IP packets (both IPv4 [RFC0791] and IPv6 [RFC8200]) are understood to | IP packets (both IPv4 [RFC0791] and IPv6 [RFC8200]) are understood to | |||
| contain a unit of data which becomes the retransmission unit in case | contain a unit of data which becomes the retransmission unit in case | |||
| of loss. Upper layer protocols such as the Transmission Control | of loss. Upper layer protocols such as the Transmission Control | |||
| Protocol (TCP) [RFC0793], QUIC [RFC9000], LTP [RFC5326] and others | Protocol (TCP) [RFC0793], QUIC [RFC9000], LTP [RFC5326] and others | |||
| prepare data units known as "segments", with traditional arrangements | prepare data units known as "segments", with traditional arrangements | |||
| including a single segment per packet. This document presents a new | including a single segment per packet. This document presents a new | |||
| construct known as the "IP Parcel" which permits a single packet to | construct known as the "IP Parcel" which permits a single packet to | |||
| carry multiple segments. This essentially creates a "packet-of- | carry multiple segments. This essentially creates a "packet-of- | |||
| packets" with the IP layer headers appearing only once but with | packets" with the IP layer headers appearing only once but with | |||
| possibly multiple upper layer protocol segments. | possibly multiple upper layer protocol segments included. | |||
| Parcels are formed when an upper layer protocol entity (identified by | Parcels are formed when an upper layer protocol entity identified by | |||
| the "5-tuple" source IP address/port number, destination IP address/ | the "5-tuple" (source IP, source port, destination IP, destination | |||
| port number and protocol number) prepares a buffer of data with the | port, protocol number) prepares a data buffer with the concatenation | |||
| concatenation of up to 64 properly-formed segments that can be broken | of up to 64 properly-formed segments that can be broken out into | |||
| out into smaller parcels using a copy of the IP header. All segments | smaller parcels using a copy of the IP header. All segments except | |||
| except the final segment must be equal in size and no larger than | the final segment must be equal in size and no larger than 65535 | |||
| 65535 octets (minus headers), while the final segment must be no | octets (minus headers), while the final segment must not be larger | |||
| larger than the others but may be smaller. The upper layer protocol | than the others but may be smaller. The upper layer protocol entity | |||
| entity then delivers the buffer and non-final segment size to the IP | then delivers the buffer and non-final segment size to the IP layer, | |||
| layer, which appends the necessary IP headers to identify this as a | which appends the necessary IP header plus extensions to identify | |||
| parcel and not an ordinary packet. | this as a parcel and not an ordinary packet. | |||
| Each original parcel can traverse arbitrarily many parcel-capable IP | Parcels can be forwarded over consecutive parcel-capable IP links in | |||
| links in the path until arriving at a parcel-capable ingress | the path until arriving at an ingress middlebox at the edge of an | |||
| middlebox at the edge of a wide area Internetwork. The ingress | intermediate Internetwork. The ingress middlebox may break the | |||
| middlebox may break the parcel out into smaller (sub-)parcels and | parcel out into smaller (sub-)parcels and encapsulate them in headers | |||
| encapsulate them in headers suitable for traversing the Internetwork. | suitable for traversing the Internetwork. These smaller parcels may | |||
| These smaller parcels may then be rejoined into one or more larger | then be rejoined into one or more larger parcels at an egress | |||
| parcels at an egress middlebox which forwards them further over | middlebox which either delivers them locally or forwards them further | |||
| parcel-capable IP links toward the final destination. Repackaging of | over parcel-capable IP links toward the final destination. Middlebox | |||
| parcels is therefore commonplace, while reordering of segments within | repackaging of parcels is therefore possible, making reordering and | |||
| a parcel or even loss of individual segments is possible but not | even loss of individual segments possible. But, what matters is that | |||
| desirable. But, what matters is that the number of parcels delivered | the number of parcels delivered to the final destination should be | |||
| to the final destination should be kept to a minimum, and that loss | kept to a minimum, and that loss or receipt of individual segments | |||
| or receipt of individual segments (and not parcel size) determines | (and not parcel size) determines the retransmission unit. | |||
| the retransmission unit. | ||||
| The following sections discuss rationale for creating and shipping | The following sections discuss rationale for creating and shipping | |||
| parcels as well as the actual protocol constructs and procedures | parcels as well as the actual protocol constructs and procedures | |||
| involved. IP parcels provide an essential building block for | involved. IP parcels provide an essential building block for | |||
| accommodating larger Maximum Transmission Units (MTUs) in the | accommodating larger Maximum Transmission Units (MTUs) in the | |||
| Internet. It is further expected that the parcel concept may drive | Internet. It is further expected that the parcel concept may drive | |||
| future innovation in applications, operating systems, network | future innovation in applications, operating systems, network | |||
| equipment and data links. | equipment and data links. | |||
| 2. Terminology | 2. Terminology | |||
| A "parcel" is defined as "a thing or collection of things wrapped in | A "parcel" is defined as "a thing or collection of things wrapped in | |||
| paper in order to be carried or sent by mail". Indeed, there are | paper in order to be carried or sent by mail". Indeed, there are | |||
| many examples of parcel delivery services worldwide that provide an | many examples of parcel delivery services worldwide that provide an | |||
| essential transit backbone for efficient business and consumer | essential transit backbone for efficient business and consumer | |||
| transactions. | transactions. | |||
| In this same spirit, an "IP parcel" is simply a collection of up to | In this same spirit, an "IP parcel" is simply a collection of up to | |||
| 64 upper layer protocol segments wrapped in an efficient package for | 64 upper layer protocol segments wrapped in an efficient package for | |||
| transmission and delivery (i.e., a "packet of packets") while a | transmission and delivery (i.e., a "packet-of-packets") while a | |||
| "singleton IP parcel" is simply a parcel that contains a single | "singleton IP parcel" is simply a parcel that contains a single | |||
| segment. IP parcels are distinguished from ordinary packets through | segment. IP parcels are distinguished from ordinary packets through | |||
| the special header constructions discussed in this document. | the special header constructions discussed in this document. | |||
| The IP parcels construct is defined for both IPv4 and IPv6. Where | ||||
| the document refers to "IPv4 header length", it means the total | ||||
| length of the base IPv4 header plus all included options, i.e., as | ||||
| determined by consulting the Internet Header Length (IHL) field. | ||||
| Where the document refers to "IPv6 header length", however, it means | ||||
| only the length of the base IPv6 header (i.e., 40 octets), while the | ||||
| length of any extension headers is referred to separately as the | ||||
| "extension header length". Finally, the term "IP header plus | ||||
| extensions" refers generically to an IPv4 header plus all included | ||||
| options or an IPv6 header plus all included extension headers. | ||||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
| "OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
| 14 [RFC2119][RFC8174] when, and only when, they appear in all | 14 [RFC2119][RFC8174] when, and only when, they appear in all | |||
| capitals, as shown here. | capitals, as shown here. | |||
| 3. Background and Motivation | 3. Background and Motivation | |||
| Studies have shown that by sending and receiving larger packets | Studies have shown that applications can realize greater performance | |||
| applications can realize greater performance due to reduced numbers | by sending and receiving larger packets due to reduced numbers of | |||
| of system calls and interrupts as well as larger atomic data copies | system calls and interrupts as well as larger atomic data copies | |||
| between kernel and user space. Large packets in the network also | between kernel and user space. Large packets also result in reduced | |||
| result in reduced numbers of device interrupts and better network | numbers of network device interrupts and better network utilization | |||
| utilization in comparison with smaller packet sizes. | in comparison with smaller packet sizes. | |||
| A first study [QUIC] involved performance enhancement of the QUIC | A first study [QUIC] involved performance enhancement of the QUIC | |||
| protocol [RFC9000] using the Linux Generic Segment/Receive Offload | protocol [RFC9000] using the Linux Generic Segment/Receive Offload | |||
| (GSO/GRO) facility. GSO/GRO provides a robust (but non-standard) | (GSO/GRO) facility. GSO/GRO provides a robust (but non-standard) | |||
| service very similar in nature to the IP parcel service described | service very similar in nature to the IP parcel service described | |||
| here, and its application has shown significant performance increases | here, and its application has shown significant performance increases | |||
| due to the increased transfer unit size between the operating system | due to the increased transfer unit size between the operating system | |||
| kernel and QUIC application. | kernel and QUIC application. | |||
| A second study [I-D.templin-dtn-ltpfrag] showed that GSO/GRO also | A second study [I-D.templin-dtn-ltpfrag] showed that GSO/GRO also | |||
| skipping to change at page 5, line 8 | skipping to change at page 5, line 4 | |||
| (single-segment) UDP datagrams even when IP fragmentation is invoked, | (single-segment) UDP datagrams even when IP fragmentation is invoked, | |||
| and LTP still follows this profile today. Moreover, LTP shows this | and LTP still follows this profile today. Moreover, LTP shows this | |||
| (single-segment) performance increase profile extending to the | (single-segment) performance increase profile extending to the | |||
| largest possible segment size which suggests that additional | largest possible segment size which suggests that additional | |||
| performance gains may be possible using (multi-segment) IP parcels | performance gains may be possible using (multi-segment) IP parcels | |||
| that exceed 65535 octets. | that exceed 65535 octets. | |||
| TCP also benefits from larger packet sizes and efforts have | TCP also benefits from larger packet sizes and efforts have | |||
| investigated TCP performance using jumbograms internally with changes | investigated TCP performance using jumbograms internally with changes | |||
| to the Linux GSO/GRO facilities [BIG-TCP]. The idea is to use the | to the Linux GSO/GRO facilities [BIG-TCP]. The idea is to use the | |||
| jumbo payload internally and to allow GSO and GRO to use buffer sizes | jumbo payload internally and to allow GSO/GRO to use buffer sizes | |||
| larger than 65535 octets, but with the understanding that links that | larger than 65535 octets, but with the understanding that links that | |||
| support jumbos natively are not yet widely available. Hence, IP | support jumbos natively are not yet widely available. Hence, IP | |||
| parcels provide a packaging that can be considered in the near term | parcels provide a packaging that can be considered in the near term | |||
| under current deployment limitations. | under current deployment limitations. | |||
| The issue with sending large packets is that they are often lost at | A limiting consideration for sending large packets is that they are | |||
| links with smaller Maximum Transmission Units (MTUs), and the | often lost at links with smaller Maximum Transmission Units (MTUs), | |||
| resulting Packet Too Big (PTB) message may be lost somewhere in the | and the resulting Packet Too Big (PTB) message may be lost somewhere | |||
| path back to the original source. This "Path MTU black hole" | in the path back to the original source. This "Path MTU black hole" | |||
| condition can degrade performance unless robust path probing | condition can degrade performance unless robust path probing | |||
| techniques are used; however, the best-case performance always occurs | techniques are used; however, the best-case performance always occurs | |||
| when no packets are lost due to size restrictions. | when no packets are lost due to size restrictions. | |||
| These considerations therefore motivate a design where the maximum | These considerations therefore motivate a design where the maximum | |||
| segment size should be no larger than 65535 octets (minus headers), | segment size should be no larger than 65535 octets (minus headers), | |||
| while parcels that carry the segments may themselves be significantly | while parcels that carry the segments may themselves be significantly | |||
| larger. Then, even if a middlebox needs to sub-divide the parcels | larger. Then, even if a middlebox needs to sub-divide the parcels | |||
| into smaller sub-parcels to forward further toward the final | into smaller sub-parcels to forward further toward the final | |||
| destination, an important performance optimization for both the | destination, an important performance optimization for the original | |||
| original source and final destination can be realized. | source, final destination and network middleboxes can be realized. | |||
| An analogy: when a consumer orders 50 small items from a major online | An analogy: when a consumer orders 50 small items from a major online | |||
| retailer, the retailer does not ship the order in 50 separate small | retailer, the retailer does not ship the order in 50 separate small | |||
| boxes. Instead, the retailer puts as many of the small boxes as | boxes. Instead, the retailer puts as many of the small items as | |||
| possible into one or a few larger boxes (or parcels) then places the | possible into one or a few larger boxes (i.e., parcels) then places | |||
| parcels on a semi-truck or airplane. The parcels arrive at a | the parcels on a semi-truck or airplane. The parcels may then pass | |||
| regional distribution center where they may be further redistributed | through one or more regional distribution centers where they may be | |||
| into slightly smaller parcels that get delivered to the consumer. | repackaged into different parcel configurations and forwarded further | |||
| But most often, the consumer will only find one or a few parcels at | until they are finally delivered to the consumer. But most often, | |||
| his doorstep and not 50 individual boxes. This greatly reduces | the consumer will only find one or a few parcels at their doorstep | |||
| handling overhead for both the retailer and consumer. | and not 50 separate small boxes. This flexible parcel delivery | |||
| service greatly reduces shipping and handling overhead for all | ||||
| including the retailer, regional distribution centers and finally the | ||||
| consumer. | ||||
| 4. IP Parcel Formation | 4. IP Parcel Formation | |||
| IP parcel formation is invoked by an upper layer protocol (identified | IP parcel formation is invoked by an upper layer protocol (identified | |||
| by the 5-tuple as above) when it emits a data buffer containing the | by the 5-tuple described above) when it prepares a data buffer | |||
| concatenation of up to 64 segments. All non-final segments MUST be | containing the concatenation of up to 64 segments. All non-final | |||
| equal in length while the final segment MUST NOT be larger but MAY be | segments MUST be equal in length while the final segment MUST NOT be | |||
| smaller. Each non-final segment MUST be no larger than 65535 octets | larger and MAY be smaller. Each non-final segment MUST NOT be larger | |||
| minus the length of the IP header plus extensions, minus the length | than 65535 octets minus the length of the IPv4 header or IPv6 | |||
| of an additional IPv6 header in case an encapsulation middlebox is | extension headers, minus the length of an additional IPv6 header in | |||
| visited on the path (see: Section 5). The upper layer protocol then | case an encapsulation middlebox is visited on the path (see: | |||
| presents the buffer and non-final segment size to the IP layer which | Section 5). The upper layer protocol then presents the buffer and | |||
| appends a single IP header (plus any extension headers) before | non-final segment size to the IP layer which appends a single IP | |||
| presenting the parcel to either the adaptation layer or the outgoing | header (plus extensions) before presenting the parcel either to an | |||
| network interface itself (see: Section 5). | adaptation layer interface or directly to an ordinary network | |||
| interface without engaging the adaptation layer (see: Section 5). | ||||
| For IPv4, the IP layer prepares the parcel by appending an IPv4 | For IPv4, the IP layer prepares the parcel by appending an IPv4 | |||
| header with a Jumbo Payload option formed as follows: | header with a Jumbo Payload option formed as follows: | |||
| +--------+--------+--------+--------+--------+--------+ | +--------+--------+--------+--------+--------+--------+ | |||
| |00001011|00000110| Jumbo Payload Length | | |Opt Type|Opt Len | Jumbo Payload Length | | |||
| +--------+--------+--------+--------+--------+--------+ | +--------+--------+--------+--------+--------+--------+ | |||
| where option type is set to '00001011' and option length is set to | The IPv4 Jumbo Payload option format is identical to that defined in | |||
| '00000110' which distinguishes the option from its former | [RFC2675], except that the IP layer sets option type to '00001011' | |||
| (deprecated) use as "IPv4 Probe MTU" by [RFC1063]. In this new | and option length to '00000110' noting that the length distinguishes | |||
| format, "Jumbo Payload Length" is a 32-bit unsigned integer value (in | this type from its deprecated use as the IPv4 "Probe MTU" option | |||
| network byte order) set to the lengths of the IPv4 header plus all | [RFC1063]. The IP layer then sets "Jumbo Payload Length" to the | |||
| concatenated segments. The IP layer next sets the IPv4 header DF bit | length of the IPv4 header plus the combined length of all | |||
| to 1, then sets the IPv4 header Total Length field to the length of | concatenated segments (i.e., as a 32-bit value in network byte | |||
| the IPv4 header plus the length of the first segment only. Note that | order). The IP layer next sets the IPv4 header DF bit to 1, then | |||
| the IP layer can form true IPv4 jumbograms (as opposed to parcels) by | sets the IPv4 header Total Length field to the length of the IPv4 | |||
| header plus the length of the first segment only. Note that the IP | ||||
| layer can form true IPv4 jumbograms (as opposed to parcels) by | ||||
| instead setting the IPv4 header Total Length field to the length of | instead setting the IPv4 header Total Length field to the length of | |||
| the IPv4 header plus options (see: Section 9). | the IPv4 header only (see: Section 9). | |||
| For IPv6, the IP layer forms a parcel by appending an IPv6 header | For IPv6, the IP layer forms a parcel by appending an IPv6 header | |||
| with a Jumbo Payload option [RFC2675] the same as for IPv4 above with | with a Hop-by-Hop Options extension header containing a Jumbo Payload | |||
| option type set to '11000010' and option length set to '00000100' | option formatted the same as for IPv4 above, but with option type set | |||
| where "Jumbo Payload Length" is set to the lengths of the IPv6 Hop- | to '11000010' and option length set to '00000100'. The IP layer then | |||
| by-Hop Options header and any other extension headers present plus | sets "Jumbo Payload Length" to the lengths of all IPv6 extension | |||
| all concatenated segments. The IP layer next sets the IPv6 header | headers present plus the combined length of all concatenated | |||
| Payload Length field to the lengths of the IPv6 Hop-by-Hop Options | segments. The IP layer next sets the IPv6 header Payload Length | |||
| header and any other extension headers present plus the length of the | field to the lengths of all IPv6 extension headers present plus the | |||
| first segment only. As with IPv4 the IP layer can form true IPv6 | length of the first segment only. Note that the IP layer can form | |||
| jumbograms (as opposed to parcels) by instead setting the IPv6 header | true IPv6 jumbograms (as opposed to parcels) by instead setting the | |||
| Payload Length field to 0 (see: [RFC2675]). | IPv6 header Payload Length field to 0 (see: [RFC2675]). | |||
| An IP parcel therefore has the following structure: | An IP parcel therefore has the following structure: | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| | | | | | | |||
| ~ Segment J (K octets) ~ | ~ Segment J (K octets) ~ | |||
| | | | | | | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| ~ ~ | ~ ~ | |||
| ~ ~ | ~ ~ | |||
| skipping to change at page 7, line 34 | skipping to change at page 7, line 34 | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| | IP Header Plus Extensions | | | IP Header Plus Extensions | | |||
| ~ {Total, Payload} Length = M ~ | ~ {Total, Payload} Length = M ~ | |||
| | Jumbo Payload Length = N | | | Jumbo Payload Length = N | | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| where J is the total number of segments (between 1 and 64), L is the | where J is the total number of segments (between 1 and 64), L is the | |||
| length of each non-final segment which MUST NOT be larger than 65535 | length of each non-final segment which MUST NOT be larger than 65535 | |||
| octets (minus headers as above) and K is the length of the final | octets (minus headers as above) and K is the length of the final | |||
| segment which MUST NOT be larger than L. The values M and N are then | segment which MUST NOT be larger than L. The values M and N are then | |||
| set to the length of the IP header plus extensions for IPv4 or to the | set to the length of the IP header for IPv4 or to the length of the | |||
| length of the extensions only for IPv6, then further calculated as | extension headers only for IPv6, then further calculated as follows: | |||
| follows: | ||||
| M = M + ((J-1) ? L : K) | M = M + ((J-1) ? L : K) | |||
| N = N + (((J-1) * L) + K) | N = N + (((J-1) * L) + K) | |||
| Using TCP [RFC0793] for example, each of the J segments would include | ||||
| its own TCP header, including Sequence Number, Checksum, etc. The | ||||
| Sequence Number plus segment length (K or L) therefore provides the | ||||
| destination with the necessary parameters for application data | ||||
| reassembly while the Checksum provides per-segment application data | ||||
| integrity. | ||||
| Note: a "singleton" parcel is one that includes only the IP header | Note: a "singleton" parcel is one that includes only the IP header | |||
| plus extensions with a single segment of length K, while a "null" | plus extensions with J=1 and a single segment of length K, while a | |||
| parcel is a singleton with K=0, i.e., a parcel consisting of only the | "null" parcel is a singleton with (J=1; K=0), i.e., a parcel | |||
| IP header plus extensions with no octets beyond. | consisting of only the IP header plus extensions with no octets | |||
| beyond. | ||||
| 5. Transmission of IP Parcels | 5. Transmission of IP Parcels | |||
| The IP layer next presents the parcel to the outgoing network | The IP layer next presents the parcel to the outgoing network | |||
| interface. For ordinary IP interfaces, the IP layer simply forwards | interface. For ordinary IP interfaces, the interface simply forwards | |||
| the parcel over the underlying link the same as for any IP packet | the parcel over the underlying link the same as for any IP packet | |||
| after which it may then be forwarded by any number of routers over | after which it may then be forwarded by any number of routers over | |||
| additional parcel-capable IP links. If any next hop IP link in the | additional consecutive parcel-capable IP links. If any next hop IP | |||
| path either does not support parcels or configures an MTU that is too | link in the path either does not support parcels or configures an MTU | |||
| small to transit the parcel without fragmentation, the router instead | that is too small to transit the parcel without fragmentation, the | |||
| opens the parcel and forwards each enclosed segment as a separate IP | router instead opens the parcel and forwards each enclosed segment as | |||
| packet (i.e., by appending a copy of the parcel's IP header to each | a separate IP packet (i.e., by appending a copy of the parcel's IP | |||
| segment but with the Jumbo Payload option removed according to the | header to each segment but with the Jumbo Payload option removed | |||
| standards [RFC0791][RFC8200]). Or, if the router does not recognize | according to the standards [RFC0791][RFC8200]). Or, if the router | |||
| parcels at all, it drops the parcel and (for IPv6) may return an ICMP | does not recognize parcels at all, it drops the parcel and may return | |||
| "Parameter Problem" message according to [RFC2675]. | an ICMP "Parameter Problem" message. | |||
| If the outgoing network interface is an OMNI interface | If the outgoing network interface is an OMNI interface | |||
| [I-D.templin-6man-omni], the OMNI Adaptation Layer (OAL) of this | [I-D.templin-6man-omni], the OMNI Adaptation Layer (OAL) of this | |||
| First Hop Segment (FHS) OAL node forwards the parcel to the next OAL | First Hop Segment (FHS) OAL source node forwards the parcel to the | |||
| hop which may be either an OAL intermediate node or the Last Hop | next OAL hop which may be either an OAL intermediate node or a Last | |||
| Segment (LHS) OAL node (which may also be the final destination | Hop Segment (LHS) OAL destination node (which may also be the final | |||
| itself). The OAL assigns a monotonically-incrementing (modulo 127) | destination itself). The OAL source assigns a monotonically- | |||
| "Parcel ID" and subdivides the parcel into sub-parcels no larger than | incrementing (modulo 127) "Parcel ID" and subdivides the parcel into | |||
| the maximum of the path MTU to the next hop or 65535 octets (minus | sub-parcels no larger than the maximum of the path MTU to the next | |||
| the length of encapsulation headers) by determining the number of | hop or 65535 octets (minus headers) by determining the number of | |||
| segments of length L that can fit into each sub-parcel under these | segments of length L that can fit into each sub-parcel under these | |||
| size constraints. For example, if the OAL determines that a sub- | size constraints. For example, if the OAL source determines that a | |||
| parcel can contain 3 segments of length L, it creates sub-parcels | sub-parcel can contain 3 segments of length L, it creates sub-parcels | |||
| with the first containing segments 1-3, the second containing | with the first containing segments 1-3, the second containing | |||
| segments 4-6, etc. and with the final containing any remaining | segments 4-6, etc. and with the final containing any remaining | |||
| segments. The OAL then appends an identical IP header plus | segments. The OAL source then appends an identical IP header plus | |||
| extensions to each sub-parcel while resetting M and N in each | extensions to each sub-parcel while resetting M and N in each | |||
| according to the above equations with J set to 3 and K set to L for | according to the above equations with J set to 3 (and K = L) for each | |||
| each non-final sub-parcel and with J set to the remaining number of | non-final sub-parcel and with J set to the remaining number of | |||
| segments for the final sub-parcel. | segments for the final sub-parcel. | |||
| The OAL next performs IP encapsulation on each sub-parcel with | The OAL source next performs IP encapsulation on each sub-parcel with | |||
| destination set to the next hop IP address then inserts an IPv6 | destination set to the next hop IP address then inserts an IPv6 | |||
| Fragment Header after the IP encapsulation header, i.e., even if the | Fragment Header after the IP encapsulation header, i.e., even if the | |||
| encapsulation header is IPv4, even if no actual fragmentation is | encapsulation header is IPv4, even if no actual fragmentation is | |||
| needed and/or even if the Jumbo Payload option is present. The OAL | needed and/or even if the Jumbo Payload option is present. The OAL | |||
| then assigns a randomly-initialized 32-bit Identification number that | source then assigns a randomly-initialized 32-bit Identification | |||
| is monotonically-incremented for each consecutive sub-parcel, then | number that is monotonically-incremented for each consecutive sub- | |||
| performs IPv6 fragmentation over the sub-parcel if necessary to | parcel, then performs IPv6 fragmentation over the sub-parcel if | |||
| create fragments small enough to traverse the path to the next OAL | necessary to create fragments small enough to traverse the path to | |||
| hop while writing the Parcel ID and setting or clearing the "Parcel | the next OAL hop while writing the Parcel ID and setting or clearing | |||
| (P)" and "(More) Sub-Parcels (S)" bits in the Fragment Header of the | the "Parcel (P)" and "(More) Sub-Parcels (S)" bits in the Fragment | |||
| first fragment (see: [I-D.templin-6man-fragrep]). (The OAL sets P to | Header of the first fragment (see: [I-D.templin-6man-fragrep]). (The | |||
| 1 for a parcel or to 0 for a non-parcel. When P is 1, the OAL next | OAL source sets P to 1 for a parcel or to 0 for a non-parcel. When P | |||
| sets S to 1 for non-final sub-parcels or to 0 if the sub-parcel | is 1, the OAL source next sets S to 1 for non-final sub-parcels or to | |||
| contains the final segment.) The OAL then forwards each IP | 0 if the sub-parcel contains the final segment.) The OAL source then | |||
| encapsulated packet/fragment to the next OAL hop. | forwards each IP encapsulated packet/fragment to the next OAL hop. | |||
| When the next OAL hop receives the encapsulated IP fragments or whole | When the next OAL hop receives the encapsulated IP fragments or whole | |||
| packets, it reassembles if necessary. If the P flag in the first | packets, it reassembles if necessary. If the P flag in the first | |||
| fragment is 0, the next hop then processes the reassembled entity as | fragment is 0, the next hop then processes the reassembled entity as | |||
| an ordinary IP packet; otherwise it continues processing as a sub- | an ordinary IP packet; otherwise it continues processing as a sub- | |||
| parcel. If the next hop is an OAL intermediate node, it retains the | parcel. If the next hop is an OAL intermediate node, it retains the | |||
| sub-parcels along with their Parcel ID and Identification values for | sub-parcels along with their Parcel ID and Identification values for | |||
| a brief time in hopes of re-combining with peer sub-parcels of the | a brief time in hopes of re-combining with peer sub-parcels of the | |||
| same original parcel identified by the 4-tuple consisting of the IP | same original parcel identified by the 4-tuple consisting of the IP | |||
| encapsulation source and destination, Identification and Parcel ID. | encapsulation source and destination, Identification and Parcel ID. | |||
| The combining entails the concatenation of the segments included in | The combining entails the concatenation of the segments included in | |||
| sub-parcels with the same Parcel ID and with Identification values | sub-parcels with the same Parcel ID and with Identification values | |||
| within 64 of one another to create a larger sub-parcel possibly even | within 64 of one another to create a larger sub-parcel possibly even | |||
| as large as the entire original parcel. Order of concatenation is | as large as the entire original parcel. Order of concatenation is | |||
| not important, with the exception that the final sub-parcel (i.e., | not important, with the exception that the final sub-parcel (i.e., | |||
| the one with S set to 0) must occur as the final concatenation before | the one with S set to 0) must occur as the final concatenation before | |||
| transmission. The OAL then appends a common IP header plus | transmission. The OAL intermediate node then appends a common IP | |||
| extensions to each re-combined sub-parcel while resetting M and N in | header plus extensions to each re-combined sub-parcel while resetting | |||
| each according to the above equations with J, K and L set | M and N in each according to the above equations with J, K and L set | |||
| accordingly. | accordingly. | |||
| This OAL intermediate node next forwards the re-combined sub- | This OAL intermediate node next forwards the re-combined sub- | |||
| parcel(s) to the next hop toward the LHS OAL node using encapsulation | parcel(s) to the next hop toward the OAL destination using | |||
| the same as specified above. (The intermediate node MUST ensure that | encapsulation the same as specified above. (The intermediate node | |||
| the S flag remains set to 0 in the sub-parcel that contains the final | MUST ensure that the S flag remains set to 0 in the sub-parcel that | |||
| segment.) When the parcel or sub-parcels arrive at the LHS OAL node, | contains the final segment.) When the sub-parcel(s) arrive at the | |||
| the OAL re-combines them into the largest possible sub-parcels while | OAL destination, the OAL destination re-combines them into the | |||
| honoring the S flag. If the LHS OAL node is also the final | largest possible sub-parcels while honoring the S flag as above. If | |||
| destination, it delivers the sub-parcels to upper layers which act on | the OAL destination is also the final destination, it delivers the | |||
| the enclosed 5-tuple information supplied by the original source. If | sub-parcels to the IP layer which acts on the enclosed 5-tuple | |||
| the LHS OAL node is not the final destination, it instead forwards | information supplied by the original source. Otherwise, the OAL | |||
| each sub-parcel the same as for an ordinary IP packet, as | destination forwards each sub-parcel toward the final destination the | |||
| discussed above. | same as for an ordinary IP packet, as discussed above. | |||
| Note: while the LHS OAL node may be tempted to re-combine the sub- | Note: while the OAL destination and/or final destination could | |||
| parcels of multiple different parcels with identical upper layer | theoretically re-combine the sub-parcels of multiple different | |||
| protocol 5-tuples and with non-final segments of identical length, | parcels with identical upper layer protocol 5-tuples and with non- | |||
| this process could become complicated when the different parcels each | final segments of identical length, this process could become | |||
| have final segments of diverse lengths. Since this might interfere | complicated when the different parcels each have final segments of | |||
| with any perceived performance advantages, the decision of whether | diverse lengths. Since this might interfere with any perceived | |||
| and how to perform inter-parcel concatenation is an implementation | performance advantages, the decision of whether and how to perform | |||
| matter. | inter-parcel concatenation is an implementation matter. | |||
| Note: some IPv6 fragmentation and reassembly implementations may | Note: some IPv6 fragmentation and reassembly implementations may | |||
| require a well-formed IPv6 header to perform their operations. When | require a well-formed IPv6 header to perform their operations. When | |||
| the encapsulation is based on IPv4, such implementations translate | the encapsulation is based on IPv4, such implementations translate | |||
| the encapsulation header into an IPv6 header with IPv4-Mapped IPv6 | the encapsulation header into an IPv6 header with IPv4-Mapped IPv6 | |||
| addresses before performing the fragmentation/reassembly operation, | addresses before performing the fragmentation/reassembly operation, | |||
| then restore the original IPv4 header before further processing. | then restore the original IPv4 header before further processing. | |||
| 6. Parcel Path Qualification | 6. Parcel Path Qualification | |||
| To determine whether parcels are supported over at least a leading | To determine whether parcels are supported over at least a leading | |||
| portion of the forward path toward the final destination, the | portion of the forward path toward the final destination, the | |||
| original source can send a singleton IP parcel formatted as a "Parcel | original source can send a singleton IP parcel formatted as a "Parcel | |||
| Probe" that contains an upper layer protocol probe segment (e.g., a | Probe" that may include an upper layer protocol probe segment (e.g., | |||
| data segment, an ICMP Echo Request message, etc.). The purpose of | a data segment, an ICMP Echo Request message, etc.). The purpose of | |||
| the probe is to elicit a "Parcel Reply" and possibly also an ordinary | the probe is to elicit a "Parcel Reply" and possibly also an ordinary | |||
| upper layer protocol probe reply from the final destination. | upper layer protocol probe reply from the final destination. | |||
| If the original source receives a positive Parcel Reply, it marks the | If the original source receives a positive Parcel Reply, it marks the | |||
| path as "parcels supported" and ignores any ICMP [RFC0792][RFC4443] | path as "parcels supported" and ignores any ICMP [RFC0792][RFC4443] | |||
| and/or Packet Too Big (PTB) messages [RFC1191][RFC8201] concerning | and/or Packet Too Big (PTB) messages [RFC1191][RFC8201] concerning | |||
| the probe. If the original source instead receives a negative Parcel | the probe. If the original source instead receives a negative Parcel | |||
| Reply or no reply, it marks the path as "parcels not supported" and | Reply or no reply, it marks the path as "parcels not supported" and | |||
| may regard any ICMP and/or PTB messages concerning the probe (or its | may regard any ICMP and/or PTB messages concerning the probe (or its | |||
| contents) as indications of a possible middlebox restriction. | contents) as indications of a possible path MTU restriction. | |||
| The original source can therefore send Parcel Probes in parallel with | The original source can therefore send Parcel Probes in parallel with | |||
| sending real data as ordinary IP packets. If the original source | sending real data as ordinary IP packets. If the original source | |||
| receives a positive Parcel Reply, it can begin using IP parcels. | receives a positive Parcel Reply, it can begin using IP parcels. | |||
| Parcel Probes use the IP parcel option type (see: Section 4) but set | Parcel Probes use the Jumbo Payload option type (see: Section 4) but | |||
| a different option length and replace the option value with control | set a different option length and replace the option value with | |||
| information plus a 4-octet "Path MTU" value into which conformant | control information plus a 4-octet "Path MTU" value into which | |||
| middleboxes write the minimum link MTU observed the same as described | conformant middleboxes write the minimum link MTU observed in a | |||
| in [RFC1063][I-D.ietf-6man-mtu-option]. Parcel Probes can also | similar fashion as described in [RFC1063][I-D.ietf-6man-mtu-option]. | |||
| include an upper layer protocol probe segment the same as described | Parcel Probes can also include an upper layer protocol probe segment, | |||
| in [RFC4821][RFC8899]. | e.g., per [RFC4821][RFC8899]. When an upper layer protocol probe | |||
| segment is included, it appears immediately after the IP header plus | ||||
| extensions. | ||||
| The original source sends Parcel Probes unidirectionally in the | The original source sends Parcel Probes unidirectionally in the | |||
| forward path toward the final destination to elicit a Parcel Reply, | forward path toward the final destination to elicit a Parcel Reply, | |||
| since it will often be the case that IP parcels are supported only in | since it will often be the case that IP parcels are supported only in | |||
| the forward path and not in the return path. Parcel Probes may be | the forward path and not in the return path. Parcel Probes may be | |||
| dropped in the forward path by any node that does not recognize IP | dropped in the forward path by any node that does not recognize IP | |||
| parcels, but a Parcel Reply must be packaged such that it will not be | parcels, but a Parcel Reply must not be dropped even if IP parcels | |||
| dropped even if IP parcels are not recognized in the return path. | are not recognized along portions of the return path. For this | |||
| reason, Parcel Probes are packaged as IPv4 (header) options or IPv6 | ||||
| | Hop-by-Hop options while Parcel Replies are always packaged as IPv6 | |||
| Destination Options (i.e., regardless of the IP protocol version). | ||||
| Original sources send Parcel Probes using the following option | Original sources send Parcel Probes and Replies that include a Jumbo | |||
| format: | Payload option coded in an alternate format as follows: | |||
| +--------+--------+ | ||||
| \| Type \| Length \| | ||||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| \| Nonce-1 \| Nonce-2 (0-1) \| | \|Opt Type\|Opt Len \| Nonce-1 \| | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| \| Nonce-2 (2-3) \|Reserved\| Check \| | \| Nonce-2 \| | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| \| PMTU \| | \| PMTU \| | |||
| +--------+--------+--------+--------+ | +--------+--------+--------+--------+ | |||
| \| Code \| Check \| | ||||
| +--------+--------+ | ||||
| For IPv4, the original source MUST set Type to '00001011' and Length | For IPv4, the original source includes the option as an IPv4 option | |||
| to '00001110' - this is the same Type as for an ordinary IPv4 parcel | with Type set to '00001011' the same as for an ordinary IPv4 parcel | |||
| (see: Section 4) but with a different Length. The original source | (see: Section 4) but with Length set to '00001110' to distinguish | |||
| then MUST set Nonce-1 to 0xffff, set Nonce-2 to a (pseudo)-random | this as a probe/reply. The original source sets Nonce-1 to 0xffff, | |||
| 32-bit value and set Reserved to 0. The original source finally MUST | sets Nonce-2 to a (pseudo)-random 32-bit value and sets PMTU to the | |||
| set Check to the same value that will appear in the TTL of the | MTU of the outgoing IPv4 interface. The original source then sets | |||
| outgoing IPv4 header and MUST set PMTU to the MTU of the outgoing | Code to 0, sets Check to the same value that will appear in the TTL | |||
| IPv4 interface. The original source finally sends the Parcel Probe | of the outgoing IPv4 header, then finally sets IPv4 Total Length to | |||
| over the outgoing IPv4 interface. According to [RFC7126], | the lengths of the IPv4 header plus the upper layer protocol probe | |||
| middleboxes (i.e., routers, security gateways, firewalls, etc.) that | segment (if any) and sends the Parcel Probe via the outgoing IPv4 | |||
| do not observe this specification SHOULD drop IP packets that contain | interface. According to [RFC7126], middleboxes (i.e., routers, | |||
| option type '00001011' ("IPv4 Probe MTU") but some might instead | security gateways, firewalls, etc.) that do not observe this | |||
| either implement [RFC1063] or ignore the option altogether. IPv4 | specification SHOULD drop IP packets that contain option type | |||
| '00001011' ("IPv4 Probe MTU") but some might instead either attempt | ||||
| to implement [RFC1063] or ignore the option altogether. IPv4 | ||||
| middleboxes that observe this specification instead MUST process the | middleboxes that observe this specification instead MUST process the | |||
| option as a Parcel Probe as specified below. | option as a Parcel Probe as specified below. | |||
| For IPv6, the original source MUST set Type to '11000010' and Length | For IPv6, the original source includes the probe option as an IPv6 | |||
| to '00001100' - this is the same Type as for an ordinary IPv6 parcel | Hop-by-Hop option with Type set to '11000010' the same as for an | |||
| (see: Section 4) but with a different Length. The original source | ordinary IPv6 parcel (see: Section 4) but with Length set to | |||
| then MUST set both Nonce-1 and Nonce-2 to a (pseudo)-random 48-bit | '00001100' to distinguish this as a probe. The original source sets | |||
| value and set Reserved to 0. The original source finally MUST set | the concatenation of Nonce-1 and Nonce-2 to a (pseudo)-random 48-bit | |||
| Check to the same value that will appear in the Hop Limit of the | value and sets PMTU to the MTU of the outgoing IPv6 interface. The | |||
| outgoing IPv6 header and MUST set PMTU to the MTU of the outgoing | original source then sets Code to 0, sets Check to the same value | |||
| IPv6 interface. The original source finally sends the Parcel Probe | that will appear in the Hop Limit of the outgoing IPv6 header, then | |||
| over the outgoing IPv6 interface. According to [RFC2675], | finally sets IPv6 Payload Length to the lengths of the IPv6 extension | |||
| middleboxes (i.e., routers, security gateways, firewalls, etc.) that | headers plus the upper layer protocol probe segment (if any) and | |||
| understand the IPv6 Jumbo Payload option are required to detect a | sends the Parcel Probe via the outgoing IPv6 interface. According to | |||
| number of possible format errors and return an ICMPv6 Parameter | [RFC2675], middleboxes (i.e., routers, security gateways, firewalls, | |||
| Problem message, but no guidance is given regarding forwarding. IPv6 | etc.) that recognize the IPv6 Jumbo Payload option but do not observe | |||
| middleboxes that observe this specification instead MUST process the | this specification SHOULD return an ICMPv6 Parameter Problem message | |||
| option as a Parcel Probe as specified below. | (and presumably also drop the packet). IPv6 middleboxes that observe | |||
| this specification instead MUST process the option as a Parcel Probe | ||||
| as specified below. | ||||
| When a middlebox that observes this specification receives a Parcel | When a middlebox that observes this specification receives a Parcel | |||
| Probe it first compares the Check value with the IP header Hop Limit/ | Probe it first compares the Check value with the IP header Hop Limit/ | |||
| TTL; if the values differ, the middlebox MUST return a negative | TTL; if the values differ, the middlebox MUST return a negative | |||
| Parcel Reply (see below) and drop the probe. | Parcel Reply (see below) and drop the probe. Otherwise, if the next | |||
| hop IP link either does not support parcels or configures an MTU that | ||||
| Next, if the next hop IP link either does not support parcels or | is too small to pass the probe, the middlebox compares the PMTU value | |||
| configures an MTU that is too small to pass the parcel, the middlebox | with the MTU of the inbound link for the probe and MUST (re)set PMTU | |||
| compares the Parcel Probe PMTU value with the MTU of the inbound link | to the lower MTU. The middlebox then MUST return a positive Parcel | |||
| Reply (see below) and convert the probe into an ordinary IP packet by | ||||
| removing the probe option according to [RFC0791] or [RFC8200]. If | ||||
| the next hop IP link configures a sufficiently large MTU to pass the | ||||
| packet, the middlebox then MUST forward the packet to the next hop; | ||||
| otherwise, it MUST drop the packet and return a suitable PTB. If the | ||||
| next hop IP link both supports parcels and configures an MTU that is | ||||
| large enough to pass the probe, the middlebox instead compares the | ||||
| probe PMTU value with the MTUs of both the inbound and outbound links | ||||
| for the probe and MUST (re)set PMTU to the lower MTU. The middlebox | for the probe and MUST (re)set PMTU to the lower MTU. The middlebox | |||
| then MUST return a positive Parcel Reply (see below) and convert the | then MUST reset Check to the same value that will appear in the TTL/ | |||
| probe into an ordinary IP packet by removing the Parcel Probe option | Hop Limit of the outgoing IP header, and MUST forward the Parcel | |||
| according to the standards [RFC0791][RFC8200]. If the next hop IP | Probe to the next hop. | |||
| link configures a sufficiently large MTU, the middlebox then MUST | ||||
| forward the ordinary IP packet to the next hop; otherwise, it MUST | ||||
| drop the packet and should return a suitable PTB. | ||||
| If the next hop IP link both supports parcels and configures an MTU | ||||
| that is large enough to pass the parcel, the middlebox instead | ||||
| compares the Parcel Probe PMTU value with the MTUs of both the | ||||
| inbound and outbound links for the probe and MUST (re)set PMTU to the | ||||
| lower MTU. The middlebox then MUST reset Check to the same value | ||||
| that will appear in the TTL/Hop Limit of the outgoing IP header, and | ||||
| MUST forward the Parcel Probe to the next hop. | ||||
| The final destination may therefore receive either an ordinary IP | The final destination may therefore receive either an ordinary IP | |||
| packet containing an upper layer protocol probe or a Parcel Probe. | packet containing an upper layer protocol probe or a Parcel Probe. | |||
| If the final destination receives an ordinary IP packet, it performs | If the final destination receives an ordinary IP packet, it performs | |||
| any necessary integrity checks then delivers the packet to upper | any necessary integrity checks then delivers the packet to upper | |||
| layers which will return an upper layer probe response. If the final | layers which will return an upper layer probe response. If the final | |||
| destination instead receives a Parcel Probe, it first compares the | destination instead receives a Parcel Probe, it first compares the | |||
| Check value with the IP header Hop Limit/TTL; if the values differ, | Check value with the IP header Hop Limit/TTL; if the values differ, | |||
| the final destination MUST drop the probe and return a negative | the final destination MUST drop the probe and return a negative | |||
| Parcel Reply (see below). Otherwise, the final destination compares | Parcel Reply (see below). Otherwise, the final destination compares | |||
| the Parcel Probe PMTU value with the MTU of the inbound link for the | the probe PMTU value with the MTU of the inbound link and MUST | |||
| probe and MUST (re)set PMTU to the lower MTU. The final destination | (re)set PMTU to the lower MTU. The final destination then MUST | |||
| then MUST return a positive Parcel Reply (see below) and convert the | return a positive Parcel Reply (see below) and convert the probe into | |||
| probe into an ordinary IP packet by removing the Parcel Probe option | an ordinary IP packet by removing the Parcel Probe option according | |||
| according to the standards [RFC0791][RFC8200]. The final destination | to the standards [RFC0791][RFC8200]. The final destination then | |||
| then performs any necessary integrity checks and delivers the packet | performs any necessary integrity checks and delivers the packet to | |||
| to upper layers. | upper layers. | |||
| When the middlebox or final destination returns a Parcel Reply, it | When the middlebox or final destination returns a Parcel Reply, it | |||
| prepares an IP header of the same protocol version that appeared in | prepares an IP header of the same protocol version that appeared in | |||
| the Parcel Probe with source and destination addresses reversed, with | the Parcel Probe with source and destination addresses reversed, with | |||
| {Protocol, Next Header} set to the value '60' (i.e., "IPv6 | {Protocol, Next Header} set to the value '60' (i.e., "IPv6 | |||
| Destination Option") and with an IPv6 Destination Option header with | Destination Option") and with an IPv6 Destination Option header with | |||
| Next Header set to the value '59' (i.e., "IPv6 No Next Header") | Next Header set to the value '59' (i.e., "IPv6 No Next Header") | |||
| [RFC8200]. The node next copies the body of the Parcel Probe option | [RFC8200]. The node next copies the body of the Parcel Probe option | |||
| as the sole Parcel Reply Destination Option (and for IPv4 resets Type | as the sole Parcel Reply Destination Option (and for IPv4 resets Type | |||
| to '11000010' and Length to '00001100') and includes no other octets | to '11000010' and Length to '00001100') and includes no other octets | |||
| beyond the end of the option. The node then MUST (re)set Check to 1 | beyond the end of the option. The node then MUST (re)set Check to 1 | |||
| for a positive or to 0 for a negative Parcel Reply, then MUST finally | for a positive or to 0 for a negative Parcel Reply, then MUST finally | |||
| set the IP header {Total, Payload} Length field according to the | set the IP header {Total, Payload} Length field according to the | |||
| length of the included Destination Option and return the Parcel Reply | length of the included Destination Option and return the Parcel Reply | |||
| to the source. (Since filtering middleboxes may drop IPv4 packets | to the source. (Since filtering middleboxes may drop IPv4 packets | |||
| with Protocol '60' the destination should wrap an IPv4 Parcel Reply | with Protocol '60' the destination MUST wrap an IPv4 Parcel Reply in | |||
| in UDP/IPv4 headers with the IPv4 source and destination addresses | UDP/IPv4 headers with the IPv4 source and destination addresses | |||
| copied from the Parcel Reply and with UDP port numbers set to the UDP | copied from the Parcel Reply and with UDP port numbers set to the UDP | |||
| port number for OMNI [I-D.templin-6man-omni].) | port number for OMNI [I-D.templin-6man-omni].) | |||
| After sending a Parcel Probe the original source may therefore | After sending a Parcel Probe the original source may therefore | |||
| receive a Parcel Reply (see above) and/or an upper layer protocol | receive a Parcel Reply (see above) and/or an upper layer protocol | |||
| probe reply. If the source receives a Parcel Reply, it first matches | probe reply. If the source receives a Parcel Reply, it first matches | |||
| Nonce-2 (and for IPv6 only also matches Nonce-1) with the values it | Nonce-2 (and for IPv6 only also matches Nonce-1) with the values it | |||
| had included in the Parcel Probe. If the values do not match, the | had included in the Parcel Probe. If the values do not match, the | |||
| source discards the Parcel Reply. Next, the source examines the | source discards the Parcel Reply. Next, the source examines the | |||
| Check value and marks the path as "parcels supported" if the value is | Check value and marks the path as "parcels supported" if the value is | |||
| 1 or "parcels not supported" otherwise. If the source marks the path | 1 or "parcels not supported" otherwise. If the source marks the path | |||
| as "parcels supported", it also records the PMTU value as the maximum | as "parcels supported", it also records the PMTU value as the maximum | |||
| parcel size for the forward path to this destination. | parcel size for the forward path to this destination. | |||
| After receiving a positive Parcel Reply, the original source can | After receiving a positive Parcel Reply, the original source can | |||
| begin sending IP parcels up to the size recorded in the PMTU to the | begin sending IP parcels addressed to the final destination up to the | |||
| final destination. Any upper layer protocol probe replies will | size recorded in the PMTU. Any upper layer protocol probe replies | |||
| determine the maximum segment size that can be included in the | will determine the maximum segment size that can be included in the | |||
| parcel, but this is an upper layer consideration. The original | parcel, but this is an upper layer consideration. The original | |||
| source should then periodically re-initiate Parcel Path Qualification | source should then periodically re-initiate Parcel Path Qualification | |||
| as long as its session with the final destination is sustained (i.e., | as long as it continues to forward parcels toward the final | |||
| in case the forward path fluctuates). If at any time performance | destination (i.e., in case the forward path fluctuates). If at any | |||
| appears to degrade, the original source should immediately re- | time performance appears to degrade, the original source should cease | |||
| initiate Parcel Path Qualification. | sending IP parcels and/or re-initiate Parcel Path Qualification. | |||
| Note: For IPv4, the original source sets the Parcel Probe Nonce-1 | ||||
| field to 0xffff on transmission and ignores the Nonce-1 field value | ||||
| in any corresponding Parcel Replys. This avoids any possible | ||||
| confusion in case an IPv4 router on the path rewrites the Nonce-1 | ||||
| field in a wayward attempt to implement [RFC1063]. | ||||
| Note: The PMTU value returned in a positive Parcel Reply determines | ||||
| only the maximum IP parcel size for the path, while the maximum upper | ||||
| layer protocol segment size may be significantly smaller. The upper | ||||
| layer protocol segment size is instead determined separately | ||||
| according to any upper layer protocol probes and must be assumed to | ||||
| be no larger than 1/64th of the maximum IP parcel size unless a | ||||
| larger size is discovered by probing. | ||||
| 7. Integrity | 7. Integrity | |||
| Each segment of a (multi-segment) IP parcel includes its own upper | Each segment of a (multi-segment) IP parcel includes its own upper | |||
| layer protocol integrity check. This allows for IP parcels to | layer protocol integrity check. This means that IP parcels can | |||
| support much stronger integrity for the same amount of upper layer | support stronger integrity for the same amount of upper layer | |||
| protocol data in comparison with an ordinary IP packet or Jumbogram | protocol data in comparison with an ordinary IP packet or Jumbogram | |||
| containing only a single segment. The integrity checks must then be | containing only a single segment. The integrity checks must then be | |||
| verified at the final destination, which accepts any segments with | verified at the final destination, which accepts any segments with | |||
| correct integrity while discarding any corrupted segments and | correct integrity while discarding all other segments and counting | |||
| counting them as a loss event. | them as a loss event. | |||
| IP parcels can range in length from as small as only the IP header | IP parcels can range in length from as small as only the IP headers | |||
| sizes to as large as the IP headers plus (64 * (65535 minus headers)) | themselves to as large as the IP headers plus (64 * (65535 minus | |||
| octets. Although link layer integrity checks provide sufficient | headers)) octets. Although link layer integrity checks provide | |||
| protection for contiguous data blocks up to approximately 9KB, | sufficient protection for contiguous data blocks up to approximately | |||
| reliance on the presence of link-layer integrity checks may not be | 9KB, reliance on the presence of link-layer integrity checks may not | |||
| possible over links such as tunnels. Moreover, the segment contents | be possible over links such as tunnels. Moreover, the segment | |||
| of a received parcel may arrive in an incomplete and/or rearranged | contents of a received parcel may arrive in an incomplete and/or | |||
| order with respect to their original packaging. | rearranged order with respect to their original packaging. | |||
| For these reasons, the OAL at each hop of an OMNI link includes an | For these reasons, the OAL at each hop of an OMNI link includes an | |||
| integrity check when it performs IP fragmentation on a sub-parcel, | integrity check when it performs IP fragmentation on a sub-parcel, | |||
| with the integrity verified during reassembly at the next hop. | with the integrity verified during reassembly at the next hop. | |||
| 8. RFC2675 Updates | 8. RFC2675 Updates | |||
| Section 3 of [RFC2675] provides a list of certain conditions to be | Section 3 of [RFC2675] provides a list of certain conditions to be | |||
| considered as errors. In particular: | considered as errors. In particular: | |||
| skipping to change at page 14, line 25 ¶ | skipping to change at page 15, line 21 ¶ | |||
| error: Jumbo Payload option present and Jumbo Payload Length < | error: Jumbo Payload option present and Jumbo Payload Length < | |||
| 65,536 | 65,536 | |||
| Implementations that obey this specification ignore these conditions | Implementations that obey this specification ignore these conditions | |||
| and do not consider them as errors. | and do not consider them as errors. | |||
| 9. IPv4 Jumbograms | 9. IPv4 Jumbograms | |||
| By defining a new IPv4 Jumbo Payload option, this document also | By defining a new IPv4 Jumbo Payload option, this document also | |||
| implicitly enables an IPv4 jumbogram service defined as an IPv4 | implicitly enables a true IPv4 jumbogram service defined as an IPv4 | |||
| packet with Total Length set to the length of the IPv4 header plus | packet with a Jumbo Payload option included and with Total Length set | |||
| extensions only, and with a Jumbo Payload option in the IPv4 | to the length of the IPv4 header only. All other aspects of IPv4 | |||
| extension headers. All other aspects of IPv4 jumbograms are the same | jumbograms are the same as for IPv6 jumbograms [RFC2675]. | |||
| as for IPv6 jumbograms [RFC2675]. | ||||
| 10. Implementation Status | 10. Implementation Status | |||
| Common widely-deployed implementations include services such as TCP | Common widely-deployed implementations include services such as TCP | |||
| Segmentation Offload (TSO) and Generic Segmentation/Receive Offload | Segmentation Offload (TSO) and Generic Segmentation/Receive Offload | |||
| (GSO/GRO). These services support a robust (but not standardized) | (GSO/GRO). These services support a robust (but not standardized) | |||
| service that has been shown to improve performance in many instances. | service that has been shown to improve performance in many instances. | |||
| Implementation of the IP parcel service is a work in progress. | Implementation of the IP parcel service is a work in progress. | |||
| 11. IANA Considerations | 11. IANA Considerations | |||
| The IANA is instructed to change the "MTUP - MTU Probe" entry in the | The IANA is instructed to change the "MTUP - MTU Probe" entry in the | |||
| 'ip option numbers' registry to the "JUMBO - IPv4 Jumbo Payload" | 'ip option numbers' registry to the "JUMBO - IPv4 Jumbo Payload" | |||
| option. The Copy and Class fields must both be set to 0, and the | option. The Copy and Class fields must both be set to 0, and the | |||
| Number and Value fields must both be set to '11'. The reference must | Number and Value fields must both be set to '11'. The reference must | |||
| be changed to this document (RFCXXXX). | be changed to this document [RFCXXXX]. | |||
| 12. Security Considerations | 12. Security Considerations | |||
| Original sources match the Jumbo Payload Length and Nonce values in | Original sources match the Nonce values in received Parcel Replies | |||
| received Parcel Replies with the Parcel Probes they send. If the | with their corresponding Parcel Probes. If the values match, the | |||
| values match, the Parcel Reply is likely an authentic response to the | Parcel Reply is likely an authentic response to the Parcel Probe. In | |||
| Parcel Probe. In environments where stronger authentication is | environments where stronger authentication is necessary, the message | |||
| necessary, the encapsulating authentication services of OMNI can be | authentication services of OMNI can be applied | |||
| used [I-D.templin-6man-omni]. | [I-D.templin-6man-omni]. | |||
| Communications networking security is necessary to preserve | Multi-layer security solutions may be necessary to ensure | |||
| confidentiality, integrity and availability. | confidentiality, integrity and availability in some environments. | |||
| 13. Acknowledgements | 13. Acknowledgements | |||
| This work was inspired by ongoing AERO/OMNI/DTN investigations. The | This work was inspired by ongoing AERO/OMNI/DTN investigations. The | |||
| concepts were further motivated through discussions on the intarea | concepts were further motivated through discussions on the intarea | |||
| and 6man lists. | and 6man lists. | |||
| A considerable body of work over recent years has produced useful | A considerable body of work over recent years has produced useful | |||
| "segmentation offload" facilities available in widely-deployed | "segmentation offload" facilities available in widely-deployed | |||
| implementations. | implementations. | |||
| skipping to change at page 16, line 19 ¶ | skipping to change at page 17, line 8 ¶ | |||
| 14.2. Informative References | 14.2. Informative References | |||
| [BIG-TCP] Dumazet, E., "BIG TCP, Netdev 0x15 Conference (virtual), | [BIG-TCP] Dumazet, E., "BIG TCP, Netdev 0x15 Conference (virtual), | |||
| https://netdevconf.info/0x15/session.html?BIG-TCP", 31 | https://netdevconf.info/0x15/session.html?BIG-TCP", 31 | |||
| August 2021. | August 2021. | |||
| [I-D.ietf-6man-mtu-option] | [I-D.ietf-6man-mtu-option] | |||
| Hinden, R. M. and G. Fairhurst, "IPv6 Minimum Path MTU | Hinden, R. M. and G. Fairhurst, "IPv6 Minimum Path MTU | |||
| Hop-by-Hop Option", Work in Progress, Internet-Draft, | Hop-by-Hop Option", Work in Progress, Internet-Draft, | |||
| draft-ietf-6man-mtu-option-12, 27 January 2022, | draft-ietf-6man-mtu-option-13, 28 February 2022, | |||
| <https://www.ietf.org/archive/id/draft-ietf-6man-mtu- | <https://www.ietf.org/archive/id/draft-ietf-6man-mtu- | |||
| option-12.txt>. | option-13.txt>. | |||
| [I-D.ietf-tcpm-rfc793bis] | [I-D.ietf-tcpm-rfc793bis] | |||
| Eddy, W. M., "Transmission Control Protocol (TCP) | Eddy, W. M., "Transmission Control Protocol (TCP) | |||
| Specification", Work in Progress, Internet-Draft, draft- | Specification", Work in Progress, Internet-Draft, draft- | |||
| ietf-tcpm-rfc793bis-26, 8 February 2022, | ietf-tcpm-rfc793bis-28, 7 March 2022, | |||
| <https://www.ietf.org/archive/id/draft-ietf-tcpm- | <https://www.ietf.org/archive/id/draft-ietf-tcpm- | |||
| rfc793bis-26.txt>. | rfc793bis-28.txt>. | |||
| [I-D.templin-6man-fragrep] | [I-D.templin-6man-fragrep] | |||
| Templin, F. L., "IPv6 Fragment Retransmission and Path MTU | Templin, F. L., "IPv6 Fragment Retransmission and Path MTU | |||
| Discovery Soft Errors", Work in Progress, Internet-Draft, | Discovery Soft Errors", Work in Progress, Internet-Draft, | |||
| draft-templin-6man-fragrep-05, 22 December 2021, | draft-templin-6man-fragrep-06, 9 February 2022, | |||
| <https://www.ietf.org/archive/id/draft-templin-6man- | <https://www.ietf.org/archive/id/draft-templin-6man- | |||
| fragrep-05.txt>. | fragrep-06.txt>. | |||
| [I-D.templin-6man-omni] | [I-D.templin-6man-omni] | |||
| Templin, F. L. and T. Whyman, "Transmission of IP Packets | Templin, F. L., "Transmission of IP Packets over Overlay | |||
| over Overlay Multilink Network (OMNI) Interfaces", Work in | Multilink Network (OMNI) Interfaces", Work in Progress, | |||
| Progress, Internet-Draft, draft-templin-6man-omni-52, 31 | Internet-Draft, draft-templin-6man-omni-55, 7 March 2022, | |||
| December 2021, <https://www.ietf.org/archive/id/draft- | <https://www.ietf.org/archive/id/draft-templin-6man-omni- | |||
| templin-6man-omni-52.txt>. | 55.txt>. | |||
| [I-D.templin-dtn-ltpfrag] | [I-D.templin-dtn-ltpfrag] | |||
| Templin, F. L., "LTP Fragmentation", Work in Progress, | Templin, F. L., "LTP Fragmentation", Work in Progress, | |||
| Internet-Draft, draft-templin-dtn-ltpfrag-08, 1 February | Internet-Draft, draft-templin-dtn-ltpfrag-08, 1 February | |||
| 2022, <https://www.ietf.org/archive/id/draft-templin-dtn- | 2022, <https://www.ietf.org/archive/id/draft-templin-dtn- | |||
| ltpfrag-08.txt>. | ltpfrag-08.txt>. | |||
| [QUIC] Ghedini, A., "Accelerating UDP packet transmission for | [QUIC] Ghedini, A., "Accelerating UDP packet transmission for | |||
| QUIC, https://blog.cloudflare.com/accelerating-udp-packet- | QUIC, https://blog.cloudflare.com/accelerating-udp-packet- | |||
| transmission-for-quic/", 8 January 2020. | transmission-for-quic/", 8 January 2020. | |||
| End of changes. 66 change blocks. | ||||
| 272 lines changed or deleted | 314 lines changed or added | |||
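The Parcel Reply handling described in Section 6 (text common to both versions) can be sketched roughly as follows. This is an illustrative sketch only, not code from either draft: the names `ParcelProbe`, `ParcelReply`, `PathState` and `process_parcel_reply` are hypothetical, and the on-wire encoding of the Nonce, Check and PMTU fields is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParcelProbe:
    """Values the original source placed in its Parcel Probe (hypothetical type)."""
    nonce1: int
    nonce2: int


@dataclass(frozen=True)
class ParcelReply:
    """Fields of a received Parcel Reply that the text refers to (hypothetical type)."""
    nonce1: int
    nonce2: int
    check: int
    pmtu: int


@dataclass
class PathState:
    """Per-destination state kept by the original source (hypothetical type)."""
    parcels_supported: bool = False
    max_parcel_size: Optional[int] = None


def process_parcel_reply(probe: ParcelProbe, reply: ParcelReply,
                         is_ipv6: bool, state: PathState) -> bool:
    """Return False if the reply is discarded, True if it was processed."""
    # Nonce-2 must always match the value sent in the probe.
    if reply.nonce2 != probe.nonce2:
        return False
    # Nonce-1 is matched for IPv6 only.
    if is_ipv6 and reply.nonce1 != probe.nonce1:
        return False
    if reply.check == 1:
        # "parcels supported": record PMTU as the maximum parcel size
        # for the forward path to this destination.
        state.parcels_supported = True
        state.max_parcel_size = reply.pmtu
    else:
        # Any other Check value marks the path "parcels not supported".
        state.parcels_supported = False
        state.max_parcel_size = None
    return True
```

A caller would keep one `PathState` per final destination and pass in the probe it most recently sent toward that destination. Periodic re-qualification, as the text recommends, would simply repeat the probe and call the function again on each reply.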