| < draft-ietf-ipsecme-iptfs-06.txt | draft-ietf-ipsecme-iptfs-07.txt > | |||
|---|---|---|---|---|
| Network Working Group C. Hopps | Network Working Group C. Hopps | |||
| Internet-Draft LabN Consulting, L.L.C. | Internet-Draft LabN Consulting, L.L.C. | |||
| Intended status: Standards Track January 19, 2021 | Intended status: Standards Track February 22, 2021 | |||
| Expires: July 23, 2021 | Expires: August 26, 2021 | |||
| IP-TFS: IP Traffic Flow Security Using Aggregation and Fragmentation | IP-TFS: IP Traffic Flow Security Using Aggregation and Fragmentation | |||
| draft-ietf-ipsecme-iptfs-06 | draft-ietf-ipsecme-iptfs-07 | |||
| Abstract | Abstract | |||
| This document describes a mechanism to enhance IPsec traffic flow | This document describes a mechanism to enhance IPsec traffic flow | |||
| security by adding traffic flow confidentiality to encrypted IP | security (IP-TFS) by adding Traffic Flow Confidentiality (TFC) to | |||
| encapsulated traffic. Traffic flow confidentiality is provided by | encrypted IP encapsulated traffic. TFC is provided by obscuring the | |||
| obscuring the size and frequency of IP traffic using a fixed-sized, | size and frequency of IP traffic using a fixed-sized, constant-send- | |||
| constant-send-rate IPsec tunnel. The solution allows for congestion | rate IPsec tunnel. The solution allows for congestion control as | |||
| control as well as non-constant send-rate usage. | well as non-constant send-rate usage. The mechanisms defined in this | |||
| document are generic with the intent of allowing for non-TFS uses, | ||||
| but such uses are outside the scope of this document. | ||||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on July 23, 2021. | This Internet-Draft will expire on August 26, 2021. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2021 IETF Trust and the persons identified as the | Copyright (c) 2021 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.1. Terminology & Concepts . . . . . . . . . . . . . . . . . 3 | 1.1. Terminology & Concepts . . . . . . . . . . . . . . . . . 4 | |||
| 2. The IP-TFS Tunnel . . . . . . . . . . . . . . . . . . . . . . 4 | 2. The IP-TFS Tunnel . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 2.1. Tunnel Content . . . . . . . . . . . . . . . . . . . . . 4 | 2.1. Tunnel Content . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 2.2. Payload Content . . . . . . . . . . . . . . . . . . . . . 5 | 2.2. Payload Content . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 2.2.1. Data Blocks . . . . . . . . . . . . . . . . . . . . . 6 | 2.2.1. Data Blocks . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 2.2.2. No Implicit End Padding Required . . . . . . . . . . 6 | 2.2.2. End Padding . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 2.2.3. Fragmentation, Sequence Numbers and All-Pad Payloads 6 | 2.2.3. Fragmentation, Sequence Numbers and All-Pad Payloads 6 | |||
| 2.2.4. Empty Payload . . . . . . . . . . . . . . . . . . . . 8 | 2.2.4. Empty Payload . . . . . . . . . . . . . . . . . . . . 8 | |||
| 2.2.5. IP Header Value Mapping . . . . . . . . . . . . . . . 8 | 2.2.5. IP Header Value Mapping . . . . . . . . . . . . . . . 8 | |||
| 2.2.6. IP Time-To-Live (TTL) and Tunnel errors . . . . . . . 9 | 2.2.6. IP Time-To-Live (TTL) and Tunnel errors . . . . . . . 9 | |||
| 2.2.7. Effective MTU of the Tunnel . . . . . . . . . . . . . 9 | 2.2.7. Effective MTU of the Tunnel . . . . . . . . . . . . . 9 | |||
| 2.3. Exclusive SA Use . . . . . . . . . . . . . . . . . . . . 9 | 2.3. Exclusive SA Use . . . . . . . . . . . . . . . . . . . . 9 | |||
| 2.4. Modes of Operation . . . . . . . . . . . . . . . . . . . 9 | 2.4. Modes of Operation . . . . . . . . . . . . . . . . . . . 10 | |||
| 2.4.1. Non-Congestion Controlled Mode . . . . . . . . . . . 9 | 2.4.1. Non-Congestion Controlled Mode . . . . . . . . . . . 10 | |||
| 2.4.2. Congestion Controlled Mode . . . . . . . . . . . . . 10 | 2.4.2. Congestion Controlled Mode . . . . . . . . . . . . . 10 | |||
| 3. Congestion Information . . . . . . . . . . . . . . . . . . . 11 | 2.5. Summary of Receiver Processing . . . . . . . . . . . . . 12 | |||
| 3.1. ECN Support . . . . . . . . . . . . . . . . . . . . . . . 12 | 3. Congestion Information . . . . . . . . . . . . . . . . . . . 12 | |||
| 4. Configuration . . . . . . . . . . . . . . . . . . . . . . . . 13 | 3.1. ECN Support . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 4.1. Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . 13 | 4. Configuration . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 4.2. Fixed Packet Size . . . . . . . . . . . . . . . . . . . . 13 | 4.1. Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 4.3. Congestion Control . . . . . . . . . . . . . . . . . . . 13 | 4.2. Fixed Packet Size . . . . . . . . . . . . . . . . . . . . 14 | |||
| 5. IKEv2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 | 4.3. Congestion Control . . . . . . . . . . . . . . . . . . . 14 | |||
| 5.1. USE_AGGFRAG Notification Message . . . . . . . . . . . . 13 | 5. IKEv2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 6. Packet and Data Formats . . . . . . . . . . . . . . . . . . . 14 | 5.1. USE_AGGFRAG Notification Message . . . . . . . . . . . . 14 | |||
| 6.1. AGGFRAG_PAYLOAD Payload . . . . . . . . . . . . . . . . . 14 | 6. Packet and Data Formats . . . . . . . . . . . . . . . . . . . 15 | |||
| 6.1.1. Non-Congestion Control AGGFRAG_PAYLOAD Payload Format 15 | 6.1. AGGFRAG_PAYLOAD Payload . . . . . . . . . . . . . . . . . 15 | |||
| 6.1.2. Congestion Control AGGFRAG_PAYLOAD Payload Format . . 15 | 6.1.1. Non-Congestion Control AGGFRAG_PAYLOAD Payload Format 16 | |||
| 6.1.3. Data Blocks . . . . . . . . . . . . . . . . . . . . . 17 | 6.1.2. Congestion Control AGGFRAG_PAYLOAD Payload Format . . 16 | |||
| 6.1.4. IKEv2 USE_AGGFRAG Notification Message . . . . . . . 19 | 6.1.3. Data Blocks . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 | 6.1.4. IKEv2 USE_AGGFRAG Notification Message . . . . . . . 20 | |||
| 7.1. AGGFRAG_PAYLOAD Sub-Type Registry . . . . . . . . . . . . 20 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 7.2. USE_AGGFRAG Notify Message Status Type . . . . . . . . . 20 | 7.1. AGGFRAG_PAYLOAD Sub-Type Registry . . . . . . . . . . . . 21 | |||
| 8. Security Considerations . . . . . . . . . . . . . . . . . . . 20 | 7.2. USE_AGGFRAG Notify Message Status Type . . . . . . . . . 21 | |||
| 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 21 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 21 | |||
| 9.1. Normative References . . . . . . . . . . . . . . . . . . 21 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
| 9.2. Informative References . . . . . . . . . . . . . . . . . 21 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 22 | |||
| Appendix A. Example Of An Encapsulated IP Packet Flow . . . . . 23 | 9.2. Informative References . . . . . . . . . . . . . . . . . 22 | |||
| Appendix B. A Send and Loss Event Rate Calculation . . . . . . . 24 | Appendix A. Example Of An Encapsulated IP Packet Flow . . . . . 24 | |||
| Appendix C. Comparisons of IP-TFS . . . . . . . . . . . . . . . 24 | Appendix B. A Send and Loss Event Rate Calculation . . . . . . . 25 | |||
| C.1. Comparing Overhead . . . . . . . . . . . . . . . . . . . 24 | Appendix C. Comparisons of IP-TFS . . . . . . . . . . . . . . . 25 | |||
| C.1.1. IP-TFS Overhead . . . . . . . . . . . . . . . . . . . 24 | C.1. Comparing Overhead . . . . . . . . . . . . . . . . . . . 25 | |||
| C.1.2. ESP with Padding Overhead . . . . . . . . . . . . . . 25 | C.1.1. IP-TFS Overhead . . . . . . . . . . . . . . . . . . . 26 | |||
| C.1.2. ESP with Padding Overhead . . . . . . . . . . . . . . 26 | ||||
| C.2. Overhead Comparison . . . . . . . . . . . . . . . . . . . 26 | C.2. Overhead Comparison . . . . . . . . . . . . . . . . . . . 27 | |||
| C.3. Comparing Available Bandwidth . . . . . . . . . . . . . . 26 | C.3. Comparing Available Bandwidth . . . . . . . . . . . . . . 28 | |||
| C.3.1. Ethernet . . . . . . . . . . . . . . . . . . . . . . 27 | C.3.1. Ethernet . . . . . . . . . . . . . . . . . . . . . . 28 | |||
| Appendix D. Acknowledgements . . . . . . . . . . . . . . . . . . 29 | Appendix D. Acknowledgements . . . . . . . . . . . . . . . . . . 30 | |||
| Appendix E. Contributors . . . . . . . . . . . . . . . . . . . . 29 | Appendix E. Contributors . . . . . . . . . . . . . . . . . . . . 30 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 29 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 30 | |||
| 1. Introduction | 1. Introduction | |||
| Traffic Analysis ([RFC4301], [AppCrypt]) is the act of extracting | Traffic Analysis ([RFC4301], [AppCrypt]) is the act of extracting | |||
| information about data being sent through a network. While one may | information about data being sent through a network. While directly | |||
| directly obscure the data through the use of encryption [RFC4303], | obscuring the data with encryption [RFC4303], the traffic pattern | |||
| the traffic pattern itself exposes information due to variations in | itself exposes information due to variations in its shape and timing | |||
| it's shape and timing ([I-D.iab-wire-image], [AppCrypt]). Hiding the | ([RFC8546], [AppCrypt]). Hiding the size and frequency of traffic is | |||
| size and frequency of traffic is referred to as Traffic Flow | referred to as Traffic Flow Confidentiality (TFC) per [RFC4303]. | |||
| Confidentiality (TFC) per [RFC4303]. | ||||
| [RFC4303] provides for TFC by allowing padding to be added to | [RFC4303] provides for TFC by allowing padding to be added to | |||
| encrypted IP packets and allowing for transmission of all-pad packets | encrypted IP packets and allowing for transmission of all-pad packets | |||
| (indicated using protocol 59). This method has the major limitation | (indicated using protocol 59). This method has the major limitation | |||
| that it can significantly under-utilize the available bandwidth. | that it can significantly under-utilize the available bandwidth. | |||
| The IP-TFS solution provides for full TFC without the aforementioned | The IP-TFS (IP Traffic Flow Security) solution provides for full TFC | |||
| bandwidth limitation. This is accomplished by using a constant-send- | without the aforementioned bandwidth limitation. This is | |||
| rate IPsec [RFC4303] tunnel with fixed-sized encapsulating packets; | accomplished by using a constant-send-rate IPsec [RFC4303] tunnel | |||
| however, these fixed-sized packets can contain partial, whole or | with fixed-sized encapsulating packets; however, these fixed-sized | |||
| multiple IP packets to maximize the bandwidth of the tunnel. A non- | packets can contain partial, whole or multiple IP packets to maximize | |||
| constant send-rate is allowed, but the confidentiality properties of | the bandwidth of the tunnel. A non-constant send-rate is allowed, | |||
| its use are outside the scope of this document. | but the confidentiality properties of its use are outside the scope | |||
| of this document. | ||||
| For a comparison of the overhead of IP-TFS with the RFC4303 | For a comparison of the overhead of IP-TFS with the RFC4303 | |||
| prescribed TFC solution see Appendix C. | prescribed TFC solution see Appendix C. | |||
| Additionally, IP-TFS provides for dealing with network congestion | Additionally, IP-TFS provides for operating fairly within congested | |||
| [RFC2914]. This is important for when the IP-TFS user is not in full | networks [RFC2914]. This is important for when the IP-TFS user is | |||
| control of the domain through which the IP-TFS tunnel path flows. | not in full control of the domain through which the IP-TFS tunnel | |||
| path flows. | ||||
| The mechanisms defined in this document are generic with the intent | ||||
| of allowing for non-TFS uses, but such uses are outside the scope of | ||||
| this document. | ||||
| 1.1. Terminology & Concepts | 1.1. Terminology & Concepts | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
| "OPTIONAL" in this document are to be interpreted as described in | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
| [RFC2119] [RFC8174] when, and only when, they appear in all capitals, | 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
| as shown here. | capitals, as shown here. | |||
| This document assumes familiarity with IP security concepts described | This document assumes familiarity with IP security concepts including | |||
| in [RFC4301]. | TFC as described in [RFC4301]. | |||
| 2. The IP-TFS Tunnel | 2. The IP-TFS Tunnel | |||
| As mentioned in Section 1 IP-TFS utilizes an IPsec [RFC4303] tunnel | As mentioned in Section 1 IP-TFS utilizes an IPsec [RFC4303] tunnel | |||
| (SA) as it's transport. To provide for full TFC, fixed-sized | as its transport. To provide for full TFC, fixed-sized encapsulating | |||
| encapsulating packets are sent at a constant rate on the tunnel. | packets are sent at a constant rate on the tunnel. | |||
| The primary input to the tunnel algorithm is the requested bandwidth | The primary input to the tunnel algorithm is the requested bandwidth | |||
| used by the tunnel. Two values are then required to provide for this | to be used by the tunnel. Two values are then required to provide | |||
| bandwidth, the fixed size of the encapsulating packets, and the rate at | for this bandwidth use, the fixed size of the encapsulating packets, | |||
| which to send them. | and the rate at which to send them. | |||
| The fixed packet size MAY either be specified manually or could be | The fixed packet size MAY either be specified manually or be | |||
| determined through the other methods such as the Packetization Layer | determined through other methods such as the Packetization Layer MTU | |||
| MTU Discovery (PLMTUD) ([RFC4821], [RFC8899]) or Path MTU discovery | Discovery (PLMTUD) ([RFC4821], [RFC8899]) or Path MTU discovery | |||
| (PMTUD) ([RFC1191], [RFC8201]). PMTUD is known to have issues so | (PMTUD) ([RFC1191], [RFC8201]). PMTUD is known to have issues so | |||
| PLMTUD is considered the more robust option. | PLMTUD is considered the more robust option. For PLMTUD, congestion | |||
| control payloads can be used as in-band probes (see Section 6.1.2 and | ||||
| [RFC8899]). | ||||
| Given the encapsulating packet size and the requested tunnel used | Given the encapsulating packet size and the requested bandwidth to be | |||
| bandwidth, the corresponding packet send rate can be calculated. The | used, the corresponding packet send rate can be calculated. The | |||
| packet send rate is the requested bandwidth divided by the size of | packet send rate is the requested bandwidth to be used divided by the | |||
| the encapsulating packet. | size of the encapsulating packet. | |||
| The egress of the IP-TFS tunnel MUST allow for and expect the ingress | The egress (receiving) side of the IP-TFS tunnel MUST allow for and | |||
| (sending) side of the IP-TFS tunnel to vary the size and rate of sent | expect the ingress (sending) side of the IP-TFS tunnel to vary the | |||
| encapsulating packets, unless constrained by other policy. | size and rate of sent encapsulating packets, unless constrained by | |||
| other policy. | ||||
| 2.1. Tunnel Content | 2.1. Tunnel Content | |||
| As previously mentioned, one issue with the TFC padding solution in | As previously mentioned, one issue with the TFC padding solution in | |||
| [RFC4303] is the large amount of wasted bandwidth as only one IP | [RFC4303] is the large amount of wasted bandwidth as only one IP | |||
| packet can be sent per encapsulating packet. In order to maximize | packet can be sent per encapsulating packet. In order to maximize | |||
| bandwidth IP-TFS breaks this one-to-one association. | bandwidth, IP-TFS breaks this one-to-one association. | |||
| IP-TFS aggregates as well as fragments the inner IP traffic flow into | IP-TFS aggregates as well as fragments the inner IP traffic flow into | |||
| fixed-sized encapsulating IPsec tunnel packets. Padding is only | fixed-sized encapsulating IPsec tunnel packets. Padding is only | |||
| added to the tunnel packets if there is no data available to be | added to the tunnel packets if there is no data available to be | |||
| sent at the time of tunnel packet transmission, or if fragmentation | sent at the time of tunnel packet transmission, or if fragmentation | |||
| has been disabled by the receiver. | has been disabled by the receiver. | |||
| This is accomplished using a new Encapsulating Security Payload (ESP, | This is accomplished using a new Encapsulating Security Payload (ESP, | |||
| [RFC4303]) type which is identified by the number AGGFRAG_PAYLOAD | [RFC4303]) Next Header field value AGGFRAG_PAYLOAD (Section 6.1). | |||
| (Section 6.1). | ||||
| Other non-IP-TFS uses of this aggregation and fragmentation | Other non-IP-TFS uses of this aggregation and fragmentation | |||
| encapsulation have been identified, such as increased performance | encapsulation have been identified, such as increased performance | |||
| through packet aggregation, as well as handling MTU issues using | through packet aggregation, as well as handling MTU issues using | |||
| fragmentation. These uses are not defined here, but are also not | fragmentation. These uses are not defined here, but are also not | |||
| restricted by this document. | restricted by this document. | |||
| 2.2. Payload Content | 2.2. Payload Content | |||
| The AGGFRAG_PAYLOAD payload content defined in this document is | The AGGFRAG_PAYLOAD payload content defined in this document is | |||
| comprised of a 4 or 24 octet header followed by either a partial, a | comprised of a 4 or 24 octet header followed by either a partial | |||
| full or multiple partial or full data blocks. The following diagram | datablock, a full datablock, or multiple partial or full datablocks. | |||
| illustrates this payload within the ESP packet. See Section 6.1 for | The following diagram illustrates this payload within the ESP packet. | |||
| the exact formats of the AGGFRAG_PAYLOAD payload. | See Section 6.1 for the exact formats of the AGGFRAG_PAYLOAD payload. | |||
| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . | . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . | |||
| . Outer Encapsulating Header ... . | . Outer Encapsulating Header ... . | |||
| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . | . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . | |||
| . ESP Header... . | . ESP Header... . | |||
| +---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
| \| [AGGFRAG subtype/flags] : BlockOffset \| | \| [AGGFRAG subtype/flags] : BlockOffset \| | |||
| +---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
| : [Optional Congestion Info] : | : [Optional Congestion Info] : | |||
| +---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
| skipping to change at page 5, line 39 | skipping to change at page 6, line 7 | |||
| Figure 1: Layout of an IP-TFS IPsec Packet | Figure 1: Layout of an IP-TFS IPsec Packet | |||
| The "BlockOffset" value is either zero or some offset into or past | The "BlockOffset" value is either zero or some offset into or past | |||
| the end of the "DataBlocks" data. | the end of the "DataBlocks" data. | |||
| If the "BlockOffset" value is zero it means that the "DataBlocks" | If the "BlockOffset" value is zero it means that the "DataBlocks" | |||
| data begins with a new data block. | data begins with a new data block. | |||
| Conversely, if the "BlockOffset" value is non-zero it points to the | Conversely, if the "BlockOffset" value is non-zero it points to the | |||
| start of the new data block, and the initial "DataBlocks" data | start of the new data block, and the initial "DataBlocks" data | |||
| belongs to a previous data block that is still being re-assembled. | belongs to the data block that is still being re-assembled. | |||
| The "BlockOffset" can point past the end of the "DataBlocks" data | If the "BlockOffset" points past the end of the "DataBlocks" data | |||
| which indicates that the next data block occurs in a subsequent | then the next data block occurs in a subsequent encapsulating packet. | |||
| encapsulating packet. | ||||
| Having the "BlockOffset" always point at the next available data | Having the "BlockOffset" always point at the next available data | |||
| block allows for recovering the next inner packet in the presence of | block allows for recovering the next inner packet in the presence of | |||
| outer encapsulating packet loss. | outer encapsulating packet loss. | |||
| An example IP-TFS packet flow can be found in Appendix A. | An example IP-TFS packet flow can be found in Appendix A. | |||
| 2.2.1. Data Blocks | 2.2.1. Data Blocks | |||
| +---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
| \| Type \| rest of IPv4, IPv6 or pad. | \| Type \| rest of IPv4, IPv6 or pad. | |||
| +-------- | +-------- | |||
| Figure 2: Layout of IP-TFS data block | Figure 2: Layout of a DataBlock | |||
| A data block is defined by a 4-bit type code followed by the data | A data block is defined by a 4-bit type code followed by the data | |||
| block data. The type values have been carefully chosen to coincide | block data. The type values have been carefully chosen to coincide | |||
| with the IPv4/IPv6 version field values so that no per-data block | with the IPv4/IPv6 version field values so that no per-data block | |||
| type overhead is required to encapsulate an IP packet. Likewise, the | type overhead is required to encapsulate an IP packet. Likewise, the | |||
| length of the data block is extracted from the encapsulated IPv4 or | length of the data block is extracted from the encapsulated IPv4's | |||
| IPv6 packet's length field. | "Total Length" or IPv6's "Payload Length" fields. | |||
| 2.2.2. No Implicit End Padding Required | 2.2.2. End Padding | |||
| It's worth noting that since a data block type is identified by its | Since a data block's type is identified in its first 4 bits, the only | |||
| first octet there is never a need for an implicit pad at the end of | time padding is required is when there is no data to encapsulate. | |||
| an encapsulating packet. Even when the start of a data block occurs | For this end padding a "Pad Data Block" is used. | |||
| near the end of a encapsulating packet such that there is no room for | ||||
| the length field of the encapsulated header to be included in the | ||||
| current encapsulating packet, the fact that the length comes at a | ||||
| known location and is guaranteed to be present is enough to fetch the | ||||
| length field from the subsequent encapsulating packet payload. Only | ||||
| when there is no data to encapsulated is end padding required, and | ||||
| then an explicit "Pad Data Block" would be used to identify the | ||||
| padding. | ||||
| 2.2.3. Fragmentation, Sequence Numbers and All-Pad Payloads | 2.2.3. Fragmentation, Sequence Numbers and All-Pad Payloads | |||
| In order for a receiver to be able to reassemble fragmented inner- | In order for a receiver to reassemble fragmented inner-packets, the | |||
| packets, the sender MUST send the inner-packet fragments back-to-back | sender MUST send the inner-packet fragments back-to-back in the | |||
| in the logical outer packet stream (i.e., using consecutive ESP | logical outer packet stream (i.e., using consecutive ESP sequence | |||
| sequence numbers). However, the sender is allowed to insert "all- | numbers). However, the sender is allowed to insert "all-pad" | |||
| pad" payloads (i.e., payloads with a "BlockOffset" of zero and a | payloads (i.e., payloads with a "BlockOffset" of zero and a single | |||
| single pad "DataBlock") in between the packets carrying the inner- | pad "DataBlock") in between the packets carrying the inner-packet | |||
| packet fragment payloads. This possible interleaving of all-pad | fragment payloads. This interleaving of all-pad payloads allows the | |||
| payloads allows the sender to always be able to send a tunnel packet, | sender to always send a tunnel packet, regardless of the | |||
| regardless of the encapsulation computational requirements. | encapsulation computational requirements. | |||
| When a receiver is reassembling an inner-packet, and it receives an | When a receiver is reassembling an inner-packet, and it receives an | |||
| "all-pad" payload, it increments the expected sequence number that | "all-pad" payload, it increments the expected sequence number that | |||
| the next inner-packet fragment is expected to arrive in. | the next inner-packet fragment is expected to arrive in. | |||
| Given the above, the receiver will need to handle out-of-order | Given the above, the receiver will need to handle out-of-order | |||
| arrival of outer ESP packets prior to reassembly processing. ESP | arrival of outer ESP packets prior to reassembly processing. ESP | |||
| already provides for optionally detecting replay attacks. Detecting | already provides for optionally detecting replay attacks. Detecting | |||
| replay attacks normally utilizes a window method. A similar sequence | replay attacks normally utilizes a window method. A similar sequence | |||
| number based sliding window can be used to correct re-ordering of the | number based sliding window can be used to correct re-ordering of the | |||
| outer packet stream. Receiving a larger (newer) sequence number | outer packet stream. Receiving a larger (newer) sequence number | |||
| packet advances the window, and received older ESP packets whose | packet advances the window, and received older ESP packets whose | |||
| sequence numbers the window has passed by are dropped. A good choice | sequence numbers the window has passed by are dropped. A good choice | |||
| for the size of this window depends on the amount of re-ordering the | for the size of this window depends on the amount of re-ordering the | |||
| user may normally experience. | user may normally experience. | |||
| As the amount of reordering that may be present is hard to predict | As the amount of reordering that may be present is hard to predict, | |||
| the window size SHOULD be configurable by the user. Implementations | the window size SHOULD be configurable by the user. Implementations | |||
| MAY also dynamically adjust the reordering window based on actual | MAY also dynamically adjust the reordering window based on actual | |||
| reordering seen in arriving packets. Finally, we note that as IP-TFS | reordering seen in arriving packets. Finally, note that as IP-TFS is | |||
| is sending a continuous stream of packets there is no requirement for | sending a continuous stream of packets there is no requirement for | |||
| timers (although there's no prohibition either) as newly arrived | timers (although there's no prohibition either) as newly arrived | |||
| packets will cause the window to advance and older packets will then | packets will cause the window to advance and older packets will then | |||
| be processed as they leave the window. Implementations that are | be processed as they leave the window. Implementations that are | |||
| concerned about memory use when packets are delayed (e.g., when an SA | concerned about memory use when packets are delayed (e.g., when an SA | |||
| deletion is delayed) can of course use timers to drop packets as | deletion is delayed) can of course use timers to drop packets as | |||
| well. | well. | |||
| While ESP guarantees an increasing sequence number with subsequently | While ESP guarantees an increasing sequence number with subsequently | |||
| sent packets, it does not actually require the sequence numbers to be | sent packets, it does not actually require the sequence numbers to be | |||
| generated with no gaps (e.g., sending only even numbered sequence | generated with no gaps (e.g., sending only even numbered sequence | |||
| numbers would be allowed as long as they are always increasing). | numbers would be allowed as long as they are always increasing). | |||
| Gaps in the sequence numbers will not work for this specification so | Gaps in the sequence numbers will not work for this document so the | |||
| the sequence number stream is further restricted to not contain gaps | sequence number stream MUST increase monotonically by 1 for each | |||
| (i.e., each subsequent outer packet must be sent with the sequence | subsequent packet. | |||
| number incremented by 1). | ||||
| When using the AGGFRAG_PAYLOAD in conjunction with replay detection, | When using the AGGFRAG_PAYLOAD in conjunction with replay detection, | |||
| the window size for both MAY be reduced to share the smaller of the | the window size for both MAY be reduced to share the smaller of the | |||
| two window sizes. This is b/c packets outside of the smaller window | two window sizes. This is because packets outside of the smaller | |||
| but inside the larger would still be dropped by the mechanism with | window but inside the larger would still be dropped by the mechanism | |||
| the smaller window size. | with the smaller window size. | |||
| Finally, as sequence numbers are reset when switching SAs (e.g., when | Finally, as sequence numbers are reset when switching SAs (e.g., when | |||
| re-keying a child SA), an implementation SHOULD NOT send initial | re-keying a child SA), senders MUST NOT send initial fragments of an | |||
| fragments of an inner packet using one SA and subsequent fragments in | inner packet using one SA and subsequent fragments in a different SA. | |||
| a different SA. | ||||
| 2.2.3.1. Optional Extra Padding | 2.2.3.1. Optional Extra Padding | |||
| When the tunnel bandwidth is not being fully utilized, an | When the tunnel bandwidth is not being fully utilized, a sender MAY | |||
| implementation MAY pad-out the current encapsulating packet in order | pad-out the current encapsulating packet in order to deliver an inner | |||
| to deliver an inner packet un-fragmented in the following outer | packet un-fragmented in the following outer packet. The benefit | |||
| packet. The benefit would be to avoid inner-packet fragmentation in | would be to avoid inner-packet fragmentation in the presence of a | |||
| the presence of a bursty offered load (non-bursty traffic will | bursty offered load (non-bursty traffic will naturally not fragment). | |||
| naturally not fragment). An implementation MAY also choose to allow | Senders MAY also choose to allow for a minimum fragment size to be | |||
| for a minimum fragment size to be configured (e.g., as a percentage | configured (e.g., as a percentage of the AGGFRAG_PAYLOAD payload | |||
| of the AGGFRAG_PAYLOAD payload size) to avoid fragmentation at the | size) to avoid fragmentation at the cost of tunnel bandwidth. The | |||
| cost of tunnel bandwidth. The cost with these methods is complexity | cost with these methods is complexity and added delay of inner | |||
| and added delay of inner traffic. The main advantage to avoiding | traffic. The main advantage to avoiding fragmentation is to minimize | |||
| fragmentation is to minimize inner packet loss in the presence of | inner packet loss in the presence of outer packet loss. When this is | |||
| outer packet loss. When this is worthwhile (e.g., how much loss and | worthwhile (e.g., how much loss and what type of loss is required, | |||
| what type of loss is required, given different inner traffic shapes | given different inner traffic shapes and utilization, for this to | |||
| and utilization, for this to make sense), and what values to use for | make sense), and what values to use for the allowable/added delay may | |||
| the allowable/added delay may be worth researching, but is outside | be worth researching, but is outside the scope of this document. | |||
| the scope of this document. | ||||
| While use of padding to avoid fragmentation does not impact | While use of padding to avoid fragmentation does not impact | |||
| interoperability, used inappropriately it can reduce the effective | interoperability, used inappropriately it can reduce the effective | |||
| throughput of a tunnel. Implementations implementing either of the | throughput of a tunnel. Senders implementing either of the above | |||
| above approaches will need to take care to not reduce the effective | approaches will need to take care to not reduce the effective | |||
| capacity, and overall utility, of the tunnel through the overuse of | capacity, and overall utility, of the tunnel through the overuse of | |||
| padding. | padding. | |||
| 2.2.4. Empty Payload | 2.2.4. Empty Payload | |||
| In order to support reporting of congestion control information | To support reporting of congestion control information (described | |||
| (described later) on a non-AGGFRAG_PAYLOAD enabled SA, IP-TFS allows | later) on a non-AGGFRAG_PAYLOAD enabled SA, IP-TFS allows for the | |||
| for the sending of an AGGFRAG_PAYLOAD payload with no data blocks | sending of an AGGFRAG_PAYLOAD payload with no data blocks (i.e., the | |||
| (i.e., the ESP payload length is equal to the AGGFRAG_PAYLOAD header | ESP payload length is equal to the AGGFRAG_PAYLOAD header length). | |||
| length). This special payload is called an empty payload. | This special payload is called an empty payload. | |||
| Currently this situation is only applicable in non-IKEv2 use cases. | ||||
| 2.2.5. IP Header Value Mapping | 2.2.5. IP Header Value Mapping | |||
| [RFC4301] provides some direction on when and how to map various | [RFC4301] provides some direction on when and how to map various | |||
| values from an inner IP header to the outer encapsulating header, | values from an inner IP header to the outer encapsulating header, | |||
| namely the Don't-Fragment (DF) bit ([RFC0791] and [RFC8200]), the | namely the Don't-Fragment (DF) bit ([RFC0791] and [RFC8200]), the | |||
| Differentiated Services (DS) field [RFC2474] and the Explicit | Differentiated Services (DS) field [RFC2474] and the Explicit | |||
| Congestion Notification (ECN) field [RFC3168]. Unlike [RFC4301], IP- | Congestion Notification (ECN) field [RFC3168]. Unlike [RFC4301], IP- | |||
| TFS may and often will be encapsulating more than one IP packet per | TFS may and often will be encapsulating more than one IP packet per | |||
| ESP packet. To deal with this, these mappings are restricted | ESP packet. To deal with this, these mappings are restricted | |||
| further. In particular IP-TFS never maps the inner DF bit as it is | further. | |||
| unrelated to the IP-TFS tunnel functionality; IP-TFS never IP | ||||
| fragments the inner packets and the inner packets will not affect the | ||||
| fragmentation of the outer encapsulation packets. Likewise, the ECN | ||||
| value need not be mapped as any congestion related to the constant- | ||||
| send-rate IP-TFS tunnel is unrelated (by design!) to the inner | ||||
| traffic flow. Finally, by default the DS field SHOULD NOT be copied | ||||
| although an implementation MAY choose to allow for configuration to | ||||
| override this behavior. An implementation SHOULD also allow the DS | ||||
| value to be set by configuration. | ||||
| It is worth noting that an implementation MAY still set the ECN value | 2.2.5.1. DF bit | |||
| of inner packets based on the normal ECN specification ([RFC3168]). | ||||
| IP-TFS never maps the inner DF bit as it is unrelated to the IP-TFS | ||||
| tunnel functionality; IP-TFS never needs to IP fragment the inner | ||||
| packets and the inner packets will not affect the fragmentation of | ||||
| the outer encapsulation packets. | ||||
| 2.2.5.2. ECN value | ||||
| The ECN value need not be mapped as any congestion related to the | ||||
| constant-send-rate IP-TFS tunnel is unrelated (by design) to the | ||||
| inner traffic flow. The sender MAY still set the ECN value of inner | ||||
| packets based on the normal ECN specification [RFC3168]. | ||||
| 2.2.5.3. DS field | ||||
| By default the DS field SHOULD NOT be copied, although a sender MAY | ||||
| choose to allow for configuration to override this behavior. A | ||||
| sender SHOULD also allow the DS value to be set by configuration. | ||||
| 2.2.6. IP Time-To-Live (TTL) and Tunnel errors | 2.2.6. IP Time-To-Live (TTL) and Tunnel errors | |||
| [RFC4301] specifies how to modify the inner packet TTL ([RFC0791]). | [RFC4301] specifies how to modify the inner packet TTL [RFC0791]. | |||
| Any errors (e.g., ICMP errors arriving back at the tunnel ingress due | Any errors (e.g., ICMP errors arriving back at the tunnel ingress due | |||
| to tunnel traffic) should be handled the same as with non IP-TFS | to tunnel traffic) are handled the same as with non IP-TFS IPsec | |||
| IPsec tunnels. | tunnels. | |||
| 2.2.7. Effective MTU of the Tunnel | 2.2.7. Effective MTU of the Tunnel | |||
| Unlike [RFC4301], there is normally no effective MTU (EMTU) on an IP- | Unlike [RFC4301], there is normally no effective MTU (EMTU) on an IP- | |||
| TFS tunnel as all IP packet sizes are properly transmitted without | TFS tunnel as all IP packet sizes are properly transmitted without | |||
| requiring IP fragmentation prior to tunnel ingress. That said, an | requiring IP fragmentation prior to tunnel ingress. That said, a | |||
| implementation MAY allow for explicitly configuring an MTU for the | sender MAY allow for explicitly configuring an MTU for the tunnel. | |||
| tunnel. | ||||
| If IP-TFS fragmentation has been disabled, then the tunnel's EMTU and | If IP-TFS fragmentation has been disabled, then the tunnel's EMTU and | |||
| behaviors are the same as normal IPsec tunnels ([RFC4301]). | behaviors are the same as normal IPsec tunnels [RFC4301]. | |||
| 2.3. Exclusive SA Use | 2.3. Exclusive SA Use | |||
| It is not the intention of this specification to allow for mixed use | This document does not specify mixed use of an AGGFRAG_PAYLOAD | |||
| of an AGGFRAG_PAYLOAD enabled SA. In other words, an SA that has | enabled SA. A sender MUST only send AGGFRAG_PAYLOAD payloads over an | |||
| AGGFRAG_PAYLOAD enabled MUST NOT have non-AGGFRAG_PAYLOAD payloads | SA configured for AGGFRAG_PAYLOAD use. | |||
| such as IP (IP protocol 4), TCP transport (IP protocol 6), or ESP pad | ||||
| packets (protocol 59) intermixed with non-empty AGGFRAG_PAYLOAD | ||||
| payloads. Empty AGGFRAG_PAYLOAD payloads (Section 2.2.4) are used to | ||||
| transmit congestion control information on non-IP-TFS enabled SAs, so | ||||
| intermixing is allowed in this specific case. While it's possible to | ||||
| envision making the algorithm work in the presence of sequence number | ||||
| skips in the AGGFRAG_PAYLOAD payload stream, the added complexity is | ||||
| not deemed worthwhile. Other IPsec uses can configure and use their | ||||
| own SAs. | ||||
| 2.4. Modes of Operation | 2.4. Modes of Operation | |||
| Just as with normal IPsec/ESP tunnels, IP-TFS tunnels are | Just as with normal IPsec/ESP tunnels, IP-TFS tunnels are | |||
| unidirectional. Bidirectional IP-TFS functionality is achieved by | unidirectional. Bidirectional IP-TFS functionality is achieved by | |||
| setting up 2 IP-TFS tunnels, one in either direction. | setting up 2 IP-TFS tunnels, one in either direction. | |||
| An IP-TFS tunnel can operate in 2 modes, a non-congestion controlled | An IP-TFS tunnel can operate in 2 modes, a non-congestion controlled | |||
| mode and congestion controlled mode. | mode and congestion controlled mode. | |||
| 2.4.1. Non-Congestion Controlled Mode | 2.4.1. Non-Congestion Controlled Mode | |||
| In the non-congestion controlled mode IP-TFS sends fixed-sized | In the non-congestion controlled mode, IP-TFS sends fixed-sized | |||
| packets at a constant rate. The packet send rate is constant and is | packets at a constant rate. The packet send rate is constant and is | |||
| not automatically adjusted regardless of any network congestion | not automatically adjusted regardless of any network congestion | |||
| (e.g., packet loss). | (e.g., packet loss). | |||
| For similar reasons as given in [RFC7510] the non-congestion | For similar reasons as given in [RFC7510] the non-congestion | |||
| controlled mode should only be used where the user has full | controlled mode should only be used where the user has full | |||
| administrative control over the path the tunnel will take. This is | administrative control over the path the tunnel will take. This is | |||
| required so the user can guarantee the bandwidth and also be sure as | required so the user can guarantee the bandwidth and also be sure as | |||
| to not be negatively affecting network congestion [RFC2914]. In this | to not be negatively affecting network congestion [RFC2914]. In this | |||
| case packet loss should be reported to the administrator (e.g., via | case packet loss should be reported to the administrator (e.g., via | |||
| syslog, YANG notification, SNMP traps, etc.) so that any failures due | syslog, YANG notification, SNMP traps, etc.) so that any failures due | |||
| to a lack of bandwidth can be corrected. | to a lack of bandwidth can be corrected. | |||
| Non-congestion control mode is also appropriate if ESP over TCP is in | ||||
| use [RFC8229]. | ||||
| 2.4.2. Congestion Controlled Mode | 2.4.2. Congestion Controlled Mode | |||
| With the congestion controlled mode, IP-TFS adapts to network | With the congestion controlled mode, IP-TFS adapts to network | |||
| congestion by lowering the packet send rate to accommodate the | congestion by lowering the packet send rate to accommodate the | |||
| congestion, as well as raising the rate when congestion subsides. | congestion, as well as raising the rate when congestion subsides. | |||
| Since overhead is per packet, by allowing for maximal fixed-size | Since overhead is per packet, by allowing for maximal fixed-size | |||
| packets and varying the send rate, transport overhead is minimized. | packets and varying the send rate, transport overhead is minimized. | |||
| The output of the congestion control algorithm will adjust the rate | The output of the congestion control algorithm will adjust the rate | |||
| at which the ingress sends packets. While this document does not | at which the ingress sends packets. While this document does not | |||
| require a specific congestion control algorithm, best current | require a specific congestion control algorithm, best current | |||
| practice RECOMMENDS that the algorithm conform to [RFC5348]. | practice RECOMMENDS that the algorithm conform to [RFC5348]. | |||
| Congestion control principles are documented in [RFC2914] as well. | Congestion control principles are documented in [RFC2914] as well. | |||
| An example of an implementation of the [RFC5348] algorithm which | [RFC4342] provides an example of the [RFC5348] algorithm which | |||
| matches the requirements of IP-TFS (i.e., designed for fixed-size | matches the requirements of IP-TFS (i.e., designed for fixed-size | |||
| packet and send rate varied based on congestion) is documented in | packet and send rate varied based on congestion). | |||
| [RFC4342]. | ||||
| The required inputs for the TCP friendly rate control algorithm | The required inputs for the TCP friendly rate control algorithm | |||
| described in [RFC5348] are the receiver's loss event rate and the | described in [RFC5348] are the receiver's loss event rate and the | |||
| sender's estimated round-trip time (RTT). These values are provided | sender's estimated round-trip time (RTT). These values are provided | |||
| by IP-TFS using the congestion information header fields described in | by IP-TFS using the congestion information header fields described in | |||
| Section 3. In particular these values are sufficient to implement | Section 3. In particular, these values are sufficient to implement | |||
| the algorithm described in [RFC5348]. | the algorithm described in [RFC5348]. | |||
| At a minimum, the congestion information must be sent, from the | At a minimum, the congestion information MUST be sent, from the | |||
| receiver and from the sender, at least once per RTT. Prior to | receiver and from the sender, at least once per RTT. Prior to | |||
| establishing an RTT the information SHOULD be sent constantly from | establishing an RTT the information SHOULD be sent constantly from | |||
| the sender and the receiver so that an RTT estimate can be | the sender and the receiver so that an RTT estimate can be | |||
| established. The lack of receiving this information over multiple | established. Not receiving this information over multiple | |||
| consecutive RTT intervals should be considered a congestion event | consecutive RTT intervals should be considered a congestion event | |||
| that causes the sender to adjust it's sending rate lower. For | that causes the sender to adjust its sending rate lower. For | |||
| example, [RFC4342] calls this the "no feedback timeout" and it is | example, [RFC4342] calls this the "no feedback timeout" and it is | |||
| equal to 4 RTT intervals. When a "no feedback timeout" has occurred | equal to 4 RTT intervals. When a "no feedback timeout" has occurred | |||
| [RFC4342] halves the sending rate. | [RFC4342] halves the sending rate. | |||
| An implementation MAY choose to always include the congestion | An implementation MAY choose to always include the congestion | |||
| information in it's IP-TFS payload header if sending on an IP-TFS | information in its IP-TFS payload header if sending on an IP-TFS | |||
| enabled SA. Since IP-TFS normally will operate with a large packet | enabled SA. Since IP-TFS normally will operate with a large packet | |||
| size, the congestion information should represent a small portion of | size, the congestion information should represent a small portion of | |||
| the available tunnel bandwidth. An implementation choosing to always | the available tunnel bandwidth. An implementation choosing to always | |||
| send the data MAY also choose to only update the "LossEventRate" and | send the data MAY also choose to only update the "LossEventRate" and | |||
| "RTT" header field values it sends every "RTT" though. | "RTT" header field values it sends every "RTT" though. | |||
| When an implementation is choosing a congestion control algorithm (or | When choosing a congestion control algorithm (or a selection of | |||
| a selection of algorithms) one should remember that IP-TFS is not | algorithms) note that IP-TFS is not providing for reliable delivery | |||
| providing for reliable delivery of IP traffic, and so per packet ACKs | of IP traffic, and so per packet ACKs are not required and are not | |||
| are not required and are not provided. | provided. | |||
| It's worth noting that the variable send-rate of a congestion | It is worth noting that the variable send-rate of a congestion | |||
| controlled IP-TFS tunnel is not private; however, this send-rate is | controlled IP-TFS tunnel is not private; however, this send-rate is | |||
| being driven by network congestion, and as long as the encapsulated | being driven by network congestion, and as long as the encapsulated | |||
| (inner) traffic flow shape and timing are not directly affecting the | (inner) traffic flow shape and timing are not directly affecting the | |||
| (outer) network congestion, the variations in the tunnel rate will | (outer) network congestion, the variations in the tunnel rate will | |||
| not weaken the provided inner traffic flow confidentiality. | not weaken the provided inner traffic flow confidentiality. | |||
| 2.4.2.1. Circuit Breakers | 2.4.2.1. Circuit Breakers | |||
| In addition to congestion control, implementations MAY choose to | In addition to congestion control, implementations MAY choose to | |||
| define and implement circuit breakers [RFC8084] as a recovery method | define and implement circuit breakers [RFC8084] as a recovery method | |||
| of last resort. Enabling circuit breakers is also a reason a user | of last resort. Enabling circuit breakers is also a reason a user | |||
| may wish to enable congestion information reports even when using the | may wish to enable congestion information reports even when using the | |||
| non-congestion controlled mode of operation. The definition of | non-congestion controlled mode of operation. The definition of | |||
| circuit breakers is outside the scope of this document. | circuit breakers is outside the scope of this document. | |||
| 2.5. Summary of Receiver Processing | ||||
| An IP-TFS receiver has a few tasks to perform. | ||||
| The receiver first reorders possibly out-of-order ESP packets | |||
| received on an SA into in-sequence-order AGGFRAG_PAYLOAD payloads | ||||
| (Section 2.2.3). If congestion control is enabled, the receiver | ||||
| considers a packet lost when its sequence number is abandoned (e.g., | |||
| pushed out of the re-ordering window, or timed-out) by the reordering | ||||
| algorithm. | ||||
| Additionally, if congestion control is enabled, the receiver sends | ||||
| congestion control data (Section 6.1.2) back to the sender as | ||||
| described in Section 2.4.2 and Section 3. | ||||
| Finally, the receiver processes the now in-order AGGFRAG_PAYLOAD | ||||
| payload stream to extract the inner-packets (Section 2.2.3, | ||||
| Section 6.1). | ||||
| 3. Congestion Information | 3. Congestion Information | |||
| In order to support the congestion control mode, the sender needs to | In order to support the congestion control mode, the sender needs to | |||
| know the loss event rate and also be able to approximate the RTT | know the loss event rate and to approximate the RTT [RFC5348]. In | |||
| ([RFC5348]). In order to obtain these values the receiver sends | order to obtain these values, the receiver sends congestion control | |||
| congestion control information on its SA back to the sender. Thus, | information on its SA back to the sender. Thus, to support | |||
| in order to support congestion control the receiver must have a | congestion control the receiver must have a paired SA back to the | |||
| paired SA back to the sender (this is always the case when the tunnel | sender (this is always the case when the tunnel was created using | |||
| was created using IKEv2). If the SA back to the sender is a non- | IKEv2). If the SA back to the sender is a non-AGGFRAG_PAYLOAD | |||
| AGGFRAG_PAYLOAD enabled SA then an AGGFRAG_PAYLOAD empty payload | enabled SA then an AGGFRAG_PAYLOAD empty payload (i.e., header only) | |||
| (i.e., header only) is used to convey the information. | is used to convey the information. | |||
| In order to calculate a loss event rate compatible with [RFC5348], | In order to calculate a loss event rate compatible with [RFC5348], | |||
| the receiver needs to have a round-trip time estimate. Thus the | the receiver needs to have a round-trip time estimate. Thus the | |||
| sender communicates this estimate in the "RTT" header field. On | sender communicates this estimate in the "RTT" header field. On | |||
| startup this value will be zero as no RTT estimate is yet known. | startup this value will be zero as no RTT estimate is yet known. | |||
| In order for the sender to estimate it's "RTT" value, the sender | In order for the sender to estimate its "RTT" value, the sender | |||
| places a timestamp value in the "TVal" header field. On first | places a timestamp value in the "TVal" header field. On first | |||
| receipt of this "TVal", the receiver records the new "TVal" value | receipt of this "TVal", the receiver records the new "TVal" value | |||
| along with the time it arrived locally; subsequent receipt of the | along with the time it arrived locally; subsequent receipt of the | |||
| same "TVal" MUST not update the recorded time. When the receiver | same "TVal" MUST NOT update the recorded time. | |||
| sends it's CC header it places this latest recorded value in the | ||||
| "TEcho" header field, along with 2 delay values, "Echo Delay" and | When the receiver sends its CC header it places this latest recorded | |||
| "Transmit Delay". The "Echo Delay" value is the time delta from the | "TVal" in the "TEcho" header field, along with 2 delay values, "Echo | |||
| recorded arrival time of "TVal" and the current clock in | Delay" and "Transmit Delay". The "Echo Delay" value is the time | |||
| microseconds. The second value, "Transmit Delay", is the receiver's | delta from the recorded arrival time of "TVal" and the current clock | |||
| current transmission delay on the tunnel (i.e., the average time | in microseconds. The second value, "Transmit Delay", is the | |||
| between sending packets on it's half of the IP-TFS tunnel). When the | receiver's current transmission delay on the tunnel (i.e., the | |||
| sender receives back it's "TVal" in the "TEcho" header field it | average time between sending packets on its half of the IP-TFS | |||
| calculates 2 RTT estimates. The first is the actual delay found by | tunnel). | |||
| subtracting the "TEcho" value from it's current clock and then | ||||
| When the sender receives back its "TVal" in the "TEcho" header field | ||||
| it calculates 2 RTT estimates. The first is the actual delay found | ||||
| by subtracting the "TEcho" value from its current clock and then | ||||
| subtracting "Echo Delay" as well. The second RTT estimate is found | subtracting "Echo Delay" as well. The second RTT estimate is found | |||
| by adding the received "Transmit Delay" header value to the sender's | by adding the received "Transmit Delay" header value to the sender's | |||
| own transmission delay (i.e., the average time between sending | own transmission delay (i.e., the average time between sending | |||
| packets on it's half of the IP-TFS tunnel). The larger of these 2 | packets on its half of the IP-TFS tunnel). The larger of these 2 RTT | |||
| RTT estimates SHOULD be used as the "RTT" value. The two estimates | estimates SHOULD be used as the "RTT" value. | |||
| are required to handle different combinations of faster or slower | ||||
| tunnel packet paths with faster or slower fixed tunnel rates. | The two RTT estimates are required to handle different combinations | |||
| Choosing the larger of the two values guarantees that the "RTT" is | of faster or slower tunnel packet paths with faster or slower fixed | |||
| never considered faster than the aggregate transmission delay based | tunnel rates. Choosing the larger of the two values guarantees that | |||
| on the IP-TFS tunnel rate (the second estimate), as well as never | the "RTT" is never considered faster than the aggregate transmission | |||
| being considered faster than the actual RTT along the tunnel packet | delay based on the IP-TFS tunnel rate (the second estimate), as well | |||
| path (the first estimate). | as never being considered faster than the actual RTT along the tunnel | |||
| packet path (the first estimate). | ||||
| The receiver also calculates, and communicates in the "LossEventRate" | The receiver also calculates, and communicates in the "LossEventRate" | |||
| header field, the loss event rate for use by the sender. This is | header field, the loss event rate for use by the sender. This is | |||
| slightly different from [RFC4342] which periodically sends all the | slightly different from [RFC4342] which periodically sends all the | |||
| loss interval data back to the sender so that it can do the | loss interval data back to the sender so that it can do the | |||
| calculation. See Appendix B for a suggested way to calculate the | calculation. See Appendix B for a suggested way to calculate the | |||
| loss event rate value. Initially this value will be zero (indicating | loss event rate value. Initially this value will be zero (indicating | |||
| no loss) until enough data has been collected by the receiver to | no loss) until enough data has been collected by the receiver to | |||
| update it. | update it. | |||
| 3.1. ECN Support | 3.1. ECN Support | |||
| In addition to normal packet loss information, IP-TFS supports use | In addition to normal packet loss information, IP-TFS supports use | |||
| of the ECN bits in the encapsulating IP header [RFC3168] for | of the ECN bits in the encapsulating IP header [RFC3168] for | |||
| identifying congestion. If ECN use is enabled and a packet arrives | identifying congestion. If ECN use is enabled and a packet arrives | |||
| at the egress endpoint with the Congestion Experienced (CE) value | at the egress (receiving) side with the Congestion Experienced (CE) | |||
| set, then the receiver considers that packet as being dropped, | value set, then the receiver considers that packet as being dropped, | |||
| although it does not drop it. The receiver MUST set the E bit in any | although it does not drop it. The receiver MUST set the E bit in any | |||
| AGGFRAG_PAYLOAD payload header containing a "LossEventRate" value | AGGFRAG_PAYLOAD payload header containing a "LossEventRate" value | |||
| derived from a CE value being considered. | derived from a CE value being considered. | |||
| As noted in [RFC3168] the ECN bits are not protected by IPsec and | As noted in [RFC3168] the ECN bits are not protected by IPsec and | |||
| thus may constitute a covert channel. For this reason ECN use SHOULD | thus may constitute a covert channel. For this reason, ECN use | |||
| NOT be enabled by default. | SHOULD NOT be enabled by default. | |||
| 4. Configuration | 4. Configuration | |||
| IP-TFS is meant to be deployable with a minimal amount of | IP-TFS is meant to be deployable with a minimal amount of | |||
| configuration. All IP-TFS specific configuration should be able to | configuration. All IP-TFS specific configuration should be specified | |||
| be specified at the unidirectional tunnel ingress (sending) side. It | at the unidirectional tunnel ingress (sending) side. It is intended | |||
| is intended that non-IKEv2 operation is supported, at least, with | that non-IKEv2 operation is supported, at least, with local static | |||
| local static configuration. | configuration. | |||
| 4.1. Bandwidth | 4.1. Bandwidth | |||
| Bandwidth is a local configuration option. For non-congestion | Bandwidth is a local configuration option. For non-congestion | |||
| controlled mode the bandwidth SHOULD be configured. For congestion | controlled mode, the bandwidth SHOULD be configured. For congestion | |||
| controlled mode one can configure the bandwidth or have no | controlled mode, the bandwidth can be configured or the congestion | |||
| configuration and let congestion control discover the maximum | control algorithm discovers and uses the maximum bandwidth available. | |||
| bandwidth available. No standardized configuration method is | No standardized configuration method is required. | |||
| required. | ||||
| 4.2. Fixed Packet Size | 4.2. Fixed Packet Size | |||
| The fixed packet size to be used for the tunnel encapsulation packets | The fixed packet size to be used for the tunnel encapsulation packets | |||
| MAY be configured manually or can be automatically determined using | MAY be configured manually or can be automatically determined using | |||
| other methods such as PLMTUD ([RFC4821], [RFC8899]) or PMTUD | other methods such as PLMTUD ([RFC4821], [RFC8899]) or PMTUD | |||
| ([RFC1191], [RFC8201]). As PMTUD is known to have issues, PLMTUD is | ([RFC1191], [RFC8201]). As PMTUD is known to have issues, PLMTUD is | |||
| considered the more robust option. No standardized configuration | considered the more robust option. No standardized configuration | |||
| method is required. | method is required. | |||
| skipping to change at page 13, line 47 ¶ | skipping to change at page 14, line 42 ¶ | |||
| Congestion control is a local configuration option. No standardized | Congestion control is a local configuration option. No standardized | |||
| configuration method is required. | configuration method is required. | |||
| 5. IKEv2 | 5. IKEv2 | |||
| 5.1. USE_AGGFRAG Notification Message | 5.1. USE_AGGFRAG Notification Message | |||
| As mentioned previously IP-TFS tunnels utilize ESP payloads of type | As mentioned previously IP-TFS tunnels utilize ESP payloads of type | |||
| AGGFRAG_PAYLOAD. | AGGFRAG_PAYLOAD. | |||
| When using IKEv2, a new "USE_AGGFRAG" Notification Message is used to | When using IKEv2, a new "USE_AGGFRAG" Notification Message enables | |||
| enable use of the AGGFRAG_PAYLOAD payload on a child SA pair. The | the AGGFRAG_PAYLOAD payload on a child SA pair. The method used is | |||
| method used is similar to how USE_TRANSPORT_MODE is negotiated, as | similar to how USE_TRANSPORT_MODE is negotiated, as described in | |||
| described in [RFC7296]. | [RFC7296]. | |||
| To request using the AGGFRAG_PAYLOAD payload on the Child SA pair, | To request use of the AGGFRAG_PAYLOAD payload on the Child SA pair, | |||
| the initiator includes the USE_AGGFRAG notification in an SA payload | the initiator includes the USE_AGGFRAG notification in an SA payload | |||
| requesting a new Child SA (either during the initial IKE_AUTH or | requesting a new Child SA (either during the initial IKE_AUTH or | |||
| during non-rekeying CREATE_CHILD_SA exchanges). If the request is | during CREATE_CHILD_SA exchanges). If the request is accepted then | |||
| accepted then response MUST also include a notification of type | the response MUST also include a notification of type USE_AGGFRAG. | |||
| USE_AGGFRAG. If the responder declines the request the child SA will | If the responder declines the request the child SA will be | |||
| be established without AGGFRAG_PAYLOAD payload use enabled. If this | established without AGGFRAG_PAYLOAD payload use enabled. If this is | |||
| is unacceptable to the initiator, the initiator MUST delete the child | unacceptable to the initiator, the initiator MUST delete the child | |||
| SA. | SA. | |||
| The USE_AGGFRAG notification MUST NOT be sent, and MUST be ignored, | As the use of the AGGFRAG_PAYLOAD payload is currently only defined | |||
| during a CREATE_CHILD_SA rekeying exchange as it is not allowed to | for non-transport mode tunnels, the USE_AGGFRAG notification MUST NOT | |||
| change use of the AGGFRAG_PAYLOAD payload type during rekeying. A | be combined with the USE_TRANSPORT_MODE notification. | |||
| new child SA due to re-keying inherits the use of AGGFRAG_PAYLOAD | ||||
| from the re-keyed child SA. | ||||
| The USE_AGGFRAG notification contains a 1 octet payload of flags that | The USE_AGGFRAG notification contains a 1 octet payload of flags that | |||
| specify any requirements from the sender of the message. If any | specify requirements from the sender of the notification. If any | |||
| requirement flags are not understood or cannot be supported by the | requirement flags are not understood or cannot be supported by the | |||
| receiver then the receiver should not enable use of AGGFRAG_PAYLOAD | receiver then the receiver SHOULD NOT enable use of AGGFRAG_PAYLOAD | |||
| payload type (either by not responding with the USE_AGGFRAG | (either by not responding with the USE_AGGFRAG notification, or in | |||
| notification, or in the case of the initiator, by deleting the child | the case of the initiator, by deleting the child SA if the now | |||
| SA if the now established non-AGGFRAG_PAYLOAD using SA is | established non-AGGFRAG_PAYLOAD using SA is unacceptable). | |||
| unacceptable). | ||||
| The notification type and payload flag values are defined in | The notification type and payload flag values are defined in | |||
| Section 6.1.4. | Section 6.1.4. | |||
| 6. Packet and Data Formats | 6. Packet and Data Formats | |||
| The packet and data formats defined below are generic with the intent | ||||
| of allowing for non-IP-TFS uses, but such uses are outside the scope | ||||
| of this document. | ||||
| 6.1. AGGFRAG_PAYLOAD Payload | 6.1. AGGFRAG_PAYLOAD Payload | |||
| ESP Payload Type: 0x5 | ESP Next Header value: 0x5 | |||
| An IP-TFS payload is identified by the ESP payload type | An IP-TFS payload is identified by the ESP Next Header value | |||
| AGGFRAG_PAYLOAD which has the value 0x5. The first octet of this | AGGFRAG_PAYLOAD which has the value 0x5. The value 5 was chosen to | |||
| payload indicates the format of the remaining payload data. | not conflict with other used values. The first octet of this payload | |||
| indicates the format of the remaining payload data. | ||||
| 0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7 | |||
| +-+-+-+-+-+-+-+-+-+-+- | +-+-+-+-+-+-+-+-+-+-+- | |||
| | Sub-type | ... | | Sub-type | ... | |||
| +-+-+-+-+-+-+-+-+-+-+- | +-+-+-+-+-+-+-+-+-+-+- | |||
| Sub-type: | Sub-type: | |||
| An 8 bit value indicating the payload format. | An 8-bit value indicating the payload format. | |||
| This specification defines 2 payload sub-types. These payload | This document defines 2 payload sub-types. These payload formats are | |||
| formats are defined in the following sections. | defined in the following sections. | |||
| 6.1.1. Non-Congestion Control AGGFRAG_PAYLOAD Payload Format | 6.1.1. Non-Congestion Control AGGFRAG_PAYLOAD Payload Format | |||
| The non-congestion control AGGFRAG_PAYLOAD payload is comprised of a | The non-congestion control AGGFRAG_PAYLOAD payload is comprised of a | |||
| 4 octet header followed by a variable amount of "DataBlocks" data as | 4 octet header followed by a variable amount of "DataBlocks" data as | |||
| shown below. | shown below. | |||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| skipping to change at page 15, line 27 ¶ | skipping to change at page 16, line 27 ¶ | |||
| +-+-+-+-+-+-+-+-+-+-+- | +-+-+-+-+-+-+-+-+-+-+- | |||
| Sub-type: | Sub-type: | |||
| An octet indicating the payload format. For this non-congestion | An octet indicating the payload format. For this non-congestion | |||
| control format, the value is 0. | control format, the value is 0. | |||
| Reserved: | Reserved: | |||
| An octet set to 0 on generation, and ignored on receipt. | An octet set to 0 on generation, and ignored on receipt. | |||
| BlockOffset: | BlockOffset: | |||
| A 16 bit unsigned integer counting the number of octets of | A 16-bit unsigned integer counting the number of octets of | |||
| "DataBlocks" data before the start of a new data block. | "DataBlocks" data before the start of a new data block. If the | |||
| "BlockOffset" can count past the end of the "DataBlocks" data in | start of a new data block occurs in a subsequent payload the | |||
| which case all the "DataBlocks" data belongs to the previous data | "BlockOffset" will point past the end of the "DataBlocks" data. | |||
| block being re-assembled. If the "BlockOffset" extends into | In this case all the "DataBlocks" data belongs to the current data | |||
| subsequent packets it continues to only count subsequent | block being assembled. When the "BlockOffset" extends into | |||
| "DataBlocks" data (i.e., it does not count subsequent packets | subsequent payloads it continues to only count "DataBlocks" data | |||
| non-"DataBlocks" octets). | (i.e., it does not count subsequent packets non-"DataBlocks" data | |||
| such as header octets). | ||||
| DataBlocks: | DataBlocks: | |||
| Variable number of octets that begins with the start of a data | Variable number of octets that begins with the start of a data | |||
| block, or the continuation of a previous data block, followed by | block, or the continuation of a previous data block, followed by | |||
| zero or more additional data blocks. | zero or more additional data blocks. | |||
| 6.1.2. Congestion Control AGGFRAG_PAYLOAD Payload Format | 6.1.2. Congestion Control AGGFRAG_PAYLOAD Payload Format | |||
| The congestion control AGGFRAG_PAYLOAD payload is comprised of a 24 | The congestion control AGGFRAG_PAYLOAD payload is comprised of a 24 | |||
| octet header followed by a variable amount of "DataBlocks" data as | octet header followed by a variable amount of "DataBlocks" data as | |||
| shown below. | shown below. | |||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Sub-type (1) | Reserved |E| BlockOffset | | | Sub-type (1) | Reserved |P|E| BlockOffset | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | LossEventRate | | | LossEventRate | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | RTT | Echo Delay ... | | RTT | Echo Delay ... | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ... Echo Delay | Transmit Delay | | ... Echo Delay | Transmit Delay | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | TVal | | | TVal | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | TEcho | | | TEcho | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | DataBlocks ... | | DataBlocks ... | |||
| +-+-+-+-+-+-+-+-+-+-+- | +-+-+-+-+-+-+-+-+-+-+- | |||
| Sub-type: | Sub-type: | |||
| An octet indicating the payload format. For this congestion | An octet indicating the payload format. For this congestion | |||
| control format, the value is 1. | control format, the value is 1. | |||
| Reserved: | Reserved: | |||
| A 7 bit field set to 0 on generation, and ignored on receipt. | A 6-bit field set to 0 on generation, and ignored on receipt. | |||
| P: | ||||
| A 1-bit value if set indicates that PLMTUD probing is in progress. | ||||
| This information can be used to avoid treating missing packets as | ||||
| loss events by the CC algorithm when running the PLMTUD probe | ||||
| algorithm. | ||||
| E: | E: | |||
| A 1 bit value if set indicates that Congestion Experienced (CE) | A 1-bit value if set indicates that Congestion Experienced (CE) | |||
| ECN bits were received and used in deriving the reported | ECN bits were received and used in deriving the reported | |||
| "LossEventRate". | "LossEventRate". | |||
| BlockOffset: | BlockOffset: | |||
| The same value as the non-congestion controlled payload format | The same value as the non-congestion controlled payload format | |||
| value. | value. | |||
| LossEventRate: | LossEventRate: | |||
| A 32 bit value specifying the inverse of the current loss event | A 32-bit value specifying the inverse of the current loss event | |||
| rate as calculated by the receiver. A value of zero indicates no | rate as calculated by the receiver. A value of zero indicates no | |||
| loss. Otherwise the loss event rate is "1/LossEventRate". | loss. Otherwise the loss event rate is "1/LossEventRate". | |||
| RTT: | RTT: | |||
| A 22 bit value specifying the sender's current round-trip time | A 22-bit value specifying the sender's current round-trip time | |||
| estimate in microseconds. The value MAY be zero prior to the | estimate in microseconds. The value MAY be zero prior to the | |||
| sender having calculated a round-trip time estimate. The value | sender having calculated a round-trip time estimate. The value | |||
| SHOULD be set to zero on non-AGGFRAG_PAYLOAD enabled SAs. If the | SHOULD be set to zero on non-AGGFRAG_PAYLOAD enabled SAs. If the | |||
| value is equal to or larger than "0x3FFFFF" it MUST be set to | value is equal to or larger than "0x3FFFFF" it MUST be set to | |||
| "0x3FFFFF". | "0x3FFFFF". | |||
| Echo Delay: | Echo Delay: | |||
| A 21 bit value specifying the delay in microseconds incurred | A 21-bit value specifying the delay in microseconds incurred | |||
| between the receiver first receiving the "TVal" value which it is | between the receiver first receiving the "TVal" value which it is | |||
| sending back in "TEcho". If the value is equal to or larger than | sending back in "TEcho". If the value is equal to or larger than | |||
| "0x1FFFFF" it MUST be set to "0x1FFFFF". | "0x1FFFFF" it MUST be set to "0x1FFFFF". | |||
| Transmit Delay: | Transmit Delay: | |||
| A 21 bit value specifying the transmission delay in microseconds. | A 21-bit value specifying the transmission delay in microseconds. | |||
| This is the fixed (or average) delay on the receiver between it | This is the fixed (or average) delay on the receiver between it | |||
| sending packets on the IP-TFS tunnel. If the value is equal to or | sending packets on the IP-TFS tunnel. If the value is equal to or | |||
| larger than "0x1FFFFF" it MUST be set to "0x1FFFFF". | larger than "0x1FFFFF" it MUST be set to "0x1FFFFF". | |||
| TVal: | TVal: | |||
| An opaque 32 bit value that will be echoed back by the receiver in | An opaque 32-bit value that will be echoed back by the receiver in | |||
| later packets in the "TEcho" field, along with an "Echo Delay" | later packets in the "TEcho" field, along with an "Echo Delay" | |||
| value of how long that echo took. | value of how long that echo took. | |||
| TEcho: | TEcho: | |||
| The opaque 32 bit value from a received packet's "TVal" field. | The opaque 32-bit value from a received packet's "TVal" field. | |||
| The received "TVal" is placed in "TEcho" along with an "Echo | The received "TVal" is placed in "TEcho" along with an "Echo | |||
| Delay" value indicating how long it has been since receiving the | Delay" value indicating how long it has been since receiving the | |||
| "TVal" value. | "TVal" value. | |||
| DataBlocks: | DataBlocks: | |||
| Variable number of octets that begins with the start of a data | Variable number of octets that begins with the start of a data | |||
| block, or the continuation of a previous data block, followed by | block, or the continuation of a previous data block, followed by | |||
| zero or more additional data blocks. For the special case of | zero or more additional data blocks. For the special case of | |||
| sending congestion control information on a non-IP-TFS enabled SA | sending congestion control information on a non-IP-TFS enabled SA | |||
| this value MUST be empty (i.e., be zero octets long). | this value MUST be empty (i.e., be zero octets long). | |||
| 6.1.3. Data Blocks | 6.1.3. Data Blocks | |||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Type | IPv4, IPv6 or pad... | | Type | IPv4, IPv6 or pad... | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- | |||
| Type: | Type: | |||
| A 4 bit field where 0x0 identifies a pad data block, 0x4 indicates | A 4-bit field where 0x0 identifies a pad data block, 0x4 indicates | |||
| an IPv4 data block, and 0x6 indicates an IPv6 data block. | an IPv4 data block, and 0x6 indicates an IPv6 data block. | |||
| 6.1.3.1. IPv4 Data Block | 6.1.3.1. IPv4 Data Block | |||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | 0x4 | IHL | TypeOfService | TotalLength | | | 0x4 | IHL | TypeOfService | TotalLength | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Rest of the inner packet ... | | Rest of the inner packet ... | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- | |||
| These values are the actual values within the encapsulated IPv4 | These values are the actual values within the encapsulated IPv4 | |||
| header. In other words, the start of this data block is the start of | header. In other words, the start of this data block is the start of | |||
| the encapsulated IP packet. | the encapsulated IP packet. | |||
| Type: | Type: | |||
| A 4 bit value of 0x4 indicating IPv4 (i.e., first nibble of the | A 4-bit value of 0x4 indicating IPv4 (i.e., first nibble of the | |||
| IPv4 packet). | IPv4 packet). | |||
| TotalLength: | TotalLength: | |||
| The 16 bit unsigned integer "Total Length" field of the IPv4 inner | The 16-bit unsigned integer "Total Length" field of the IPv4 inner | |||
| packet. | packet. | |||
| 6.1.3.2. IPv6 Data Block | 6.1.3.2. IPv6 Data Block | |||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | 0x6 | TrafficClass | FlowLabel | | | 0x6 | TrafficClass | FlowLabel | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | PayloadLength | Rest of the inner packet ... | | PayloadLength | Rest of the inner packet ... | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- | |||
| These values are the actual values within the encapsulated IPv6 | These values are the actual values within the encapsulated IPv6 | |||
| header. In other words, the start of this data block is the start of | header. In other words, the start of this data block is the start of | |||
| the encapsulated IP packet. | the encapsulated IP packet. | |||
| Type: | Type: | |||
| A 4 bit value of 0x6 indicating IPv6 (i.e., first nibble of the | A 4-bit value of 0x6 indicating IPv6 (i.e., first nibble of the | |||
| IPv6 packet). | IPv6 packet). | |||
| PayloadLength: | PayloadLength: | |||
| The 16 bit unsigned integer "Payload Length" field of the inner | The 16-bit unsigned integer "Payload Length" field of the inner | |||
| IPv6 packet. | IPv6 packet. | |||
| 6.1.3.3. Pad Data Block | 6.1.3.3. Pad Data Block | |||
| 1 2 3 | 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | 0x0 | Padding ... | | 0x0 | Padding ... | |||
| +-+-+-+-+-+-+-+-+-+-+- | +-+-+-+-+-+-+-+-+-+-+- | |||
| Type: | Type: | |||
| A 4 bit value of 0x0 indicating a padding data block. | A 4-bit value of 0x0 indicating a padding data block. | |||
| Padding: | Padding: | |||
| extends to end of the encapsulating packet. | Extends to end of the encapsulating packet. | |||
| 6.1.4. IKEv2 USE_AGGFRAG Notification Message | 6.1.4. IKEv2 USE_AGGFRAG Notification Message | |||
| As discussed in Section 5.1 a notification message USE_AGGFRAG is | As discussed in Section 5.1, a notification message USE_AGGFRAG is | |||
| used to negotiate use of the ESP AGGFRAG_PAYLOAD payload type. | used to negotiate use of the ESP AGGFRAG_PAYLOAD Next Header value. | |||
| The USE_AGGFRAG Notification Message Status Type is (TBD2). | The USE_AGGFRAG Notification Message Status Type is (TBD2). | |||
| The notification payload contains 1 octet of requirement flags. | The notification payload contains 1 octet of requirement flags. | |||
| There are currently 2 requirement flags defined. This may be revised | There are currently 2 requirement flags defined. This may be revised | |||
| by later specifications. | by later specifications. | |||
| +-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
| |0|0|0|0|0|0|C|D| | |0|0|0|0|0|0|C|D| | |||
| +-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
| skipping to change at page 19, line 41 ¶ | skipping to change at page 20, line 44 ¶ | |||
| 0: | 0: | |||
| 6 bits - reserved, MUST be zero on send, unless defined by later | 6 bits - reserved, MUST be zero on send, unless defined by later | |||
| specifications. | specifications. | |||
| C: | C: | |||
| Congestion Control bit. If set, then the sender is requiring that | Congestion Control bit. If set, then the sender is requiring that | |||
| congestion control information MUST be returned to it periodically | congestion control information MUST be returned to it periodically | |||
| as defined in Section 3. | as defined in Section 3. | |||
| D: | D: | |||
| Don't Fragment bit, if set indicates the sender of the notify | Don't Fragment bit. If set, indicates the sender of the notify | |||
| message does not support receiving packet fragments (i.e., inner | message does not support receiving packet fragments (i.e., inner | |||
| packets MUST be sent using a single "Data Block"). This value | packets MUST be sent using a single "Data Block"). This value | |||
| only applies to what the sender is capable of receiving; the | only applies to what the sender is capable of receiving; the | |||
| sender MAY still send packet fragments unless similarly restricted | sender MAY still send packet fragments unless similarly restricted | |||
| by the receiver in its USE_AGGFRAG notification. | by the receiver in its USE_AGGFRAG notification. | |||
| 7. IANA Considerations | 7. IANA Considerations | |||
| 7.1. AGGFRAG_PAYLOAD Sub-Type Registry | 7.1. AGGFRAG_PAYLOAD Sub-Type Registry | |||
| This document requests IANA create a registry called "AGGFRAG_PAYLOAD | This document requests IANA create a registry called "AGGFRAG_PAYLOAD | |||
| Sub-Type Registry" under a new category named "ESP AGGFRAG_PAYLOAD | Sub-Type Registry" under a new category named "ESP AGGFRAG_PAYLOAD | |||
| Parameters". The registration policy for this registry is "Standards | Parameters". The registration policy for this registry is "Expert | |||
| Action" ([RFC8126] and [RFC7120]). | Review" ([RFC8126] and [RFC7120]). | |||
| Name: | Name: | |||
| AGGFRAG_PAYLOAD Sub-Type Registry | AGGFRAG_PAYLOAD Sub-Type Registry | |||
| Description: | Description: | |||
| AGGFRAG_PAYLOAD Payload Formats. | AGGFRAG_PAYLOAD Payload Formats. | |||
| Reference: | Reference: | |||
| This document | This document | |||
| skipping to change at page 20, line 47 ¶ | skipping to change at page 21, line 47 ¶ | |||
| TBD2 | TBD2 | |||
| Name: | Name: | |||
| USE_AGGFRAG | USE_AGGFRAG | |||
| Reference: | Reference: | |||
| This document | This document | |||
| 8. Security Considerations | 8. Security Considerations | |||
| This document describes a mechanism to add Traffic Flow | This document describes a mechanism to add TFC to IP traffic. Use of | |||
| Confidentiality to IP traffic. Use of this mechanism is expected to | this mechanism is expected to increase the security of the traffic | |||
| increase the security of the traffic being transported. Other than | being transported. Other than the additional security afforded by | |||
| the additional security afforded by using this mechanism, IP-TFS | using this mechanism, IP-TFS utilizes the security protocols | |||
| utilizes the security protocols [RFC4303] and [RFC7296] and so their | [RFC4303] and [RFC7296] and so their security considerations apply to | |||
| security considerations apply to IP-TFS as well. | IP-TFS as well. | |||
| As noted in (Section 3.1) the ECN bits are not protected by IPsec and | ||||
| thus may constitute a covert channel. For this reason, ECN use | ||||
| SHOULD NOT be enabled by default. | ||||
| As noted previously in Section 2.4.2, for TFC to be fully maintained | As noted previously in Section 2.4.2, for TFC to be fully maintained | |||
| the encapsulated traffic flow should not be affecting network | the encapsulated traffic flow should not be affecting network | |||
| congestion in a predictable way, and if it would be then non- | congestion in a predictable way, and if it would be then non- | |||
| congestion controlled mode use should be considered instead. | congestion controlled mode use should be considered instead. | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| skipping to change at page 21, line 38 ¶ | skipping to change at page 22, line 42 ¶ | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| 9.2. Informative References | 9.2. Informative References | |||
| [AppCrypt] | [AppCrypt] | |||
| Schneier, B., "Applied Cryptography: Protocols, | Schneier, B., "Applied Cryptography: Protocols, | |||
| Algorithms, and Source Code in C", 11 2017. | Algorithms, and Source Code in C", 11 2017. | |||
| [I-D.iab-wire-image] | ||||
| Trammell, B. and M. Kuehlewind, "The Wire Image of a | ||||
| Network Protocol", draft-iab-wire-image-01 (work in | ||||
| progress), November 2018. | ||||
| [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, | [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, | |||
| DOI 10.17487/RFC0791, September 1981, | DOI 10.17487/RFC0791, September 1981, | |||
| <https://www.rfc-editor.org/info/rfc791>. | <https://www.rfc-editor.org/info/rfc791>. | |||
| [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, | [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, | |||
| DOI 10.17487/RFC1191, November 1990, | DOI 10.17487/RFC1191, November 1990, | |||
| <https://www.rfc-editor.org/info/rfc1191>. | <https://www.rfc-editor.org/info/rfc1191>. | |||
| [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, | [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, | |||
| "Definition of the Differentiated Services Field (DS | "Definition of the Differentiated Services Field (DS | |||
| skipping to change at page 23, line 20 ¶ | skipping to change at page 24, line 20 ¶ | |||
| [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | |||
| (IPv6) Specification", STD 86, RFC 8200, | (IPv6) Specification", STD 86, RFC 8200, | |||
| DOI 10.17487/RFC8200, July 2017, | DOI 10.17487/RFC8200, July 2017, | |||
| <https://www.rfc-editor.org/info/rfc8200>. | <https://www.rfc-editor.org/info/rfc8200>. | |||
| [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | |||
| "Path MTU Discovery for IP version 6", STD 87, RFC 8201, | "Path MTU Discovery for IP version 6", STD 87, RFC 8201, | |||
| DOI 10.17487/RFC8201, July 2017, | DOI 10.17487/RFC8201, July 2017, | |||
| <https://www.rfc-editor.org/info/rfc8201>. | <https://www.rfc-editor.org/info/rfc8201>. | |||
| [RFC8229] Pauly, T., Touati, S., and R. Mantha, "TCP Encapsulation | ||||
| of IKE and IPsec Packets", RFC 8229, DOI 10.17487/RFC8229, | ||||
| August 2017, <https://www.rfc-editor.org/info/rfc8229>. | ||||
| [RFC8546] Trammell, B. and M. Kuehlewind, "The Wire Image of a | ||||
| Network Protocol", RFC 8546, DOI 10.17487/RFC8546, April | ||||
| 2019, <https://www.rfc-editor.org/info/rfc8546>. | ||||
| [RFC8899] Fairhurst, G., Jones, T., Tuexen, M., Ruengeler, I., and | [RFC8899] Fairhurst, G., Jones, T., Tuexen, M., Ruengeler, I., and | |||
| T. Voelker, "Packetization Layer Path MTU Discovery for | T. Voelker, "Packetization Layer Path MTU Discovery for | |||
| Datagram Transports", RFC 8899, DOI 10.17487/RFC8899, | Datagram Transports", RFC 8899, DOI 10.17487/RFC8899, | |||
| September 2020, <https://www.rfc-editor.org/info/rfc8899>. | September 2020, <https://www.rfc-editor.org/info/rfc8899>. | |||
| Appendix A. Example Of An Encapsulated IP Packet Flow | Appendix A. Example Of An Encapsulated IP Packet Flow | |||
| Below an example inner IP packet flow within the encapsulating tunnel | Below an example inner IP packet flow within the encapsulating tunnel | |||
| packet stream is shown. Notice how encapsulated IP packets can start | packet stream is shown. Notice how encapsulated IP packets can start | |||
| and end anywhere, and more than one or less than one may occur in a | and end anywhere, and more than one or less than one may occur in a | |||
| skipping to change at page 24, line 35 ¶ | skipping to change at page 25, line 42 ¶ | |||
| The IP-TFS receiver, having the RTT estimate from the sender can use | The IP-TFS receiver, having the RTT estimate from the sender can use | |||
| the same method as described in [RFC5348] and [RFC4342] to collect | the same method as described in [RFC5348] and [RFC4342] to collect | |||
| the loss intervals and calculate the loss event rate value using the | the loss intervals and calculate the loss event rate value using the | |||
| weighted average as indicated. The receiver communicates the inverse | weighted average as indicated. The receiver communicates the inverse | |||
| of this value back to the sender in the AGGFRAG_PAYLOAD payload | of this value back to the sender in the AGGFRAG_PAYLOAD payload | |||
| header field "LossEventRate". | header field "LossEventRate". | |||
| The IP-TFS sender now has both the "R" and "p" values and can | The IP-TFS sender now has both the "R" and "p" values and can | |||
| calculate the correct sending rate. If following [RFC5348] the | calculate the correct sending rate. If following [RFC5348] the | |||
| sender SHOULD also use the slow start mechanism described therein | sender should also use the slow start mechanism described therein | |||
| when the IP-TFS SA is first established. | when the IP-TFS SA is first established. | |||
| Appendix C. Comparisons of IP-TFS | Appendix C. Comparisons of IP-TFS | |||
| C.1. Comparing Overhead | C.1. Comparing Overhead | |||
| For comparing overhead the overhead of ESP for both normal and IP-TFS | ||||
| tunnel packets must be calculated, and so an algorithm for encryption | ||||
| and authentication must be chosen. For the data below AES-GCM-256 | ||||
| was selected. This leads to an IP+ESP overhead of 54. | ||||
| 54 = 20 (IP) + 8 (ESPH) + 2 (ESPF) + 8 (IV) + 16 (ICV) | ||||
| Additionally, for IP-TFS, non-congestion control AGGFRAG_PAYLOAD | ||||
| headers were chosen which adds 4 octets for a total overhead of 58. | ||||
| C.1.1. IP-TFS Overhead | C.1.1. IP-TFS Overhead | |||
| The overhead of IP-TFS is 40 bytes per outer packet. Therefore the | For comparison the overhead of IP-TFS is 58 octets per outer packet. | |||
| octet overhead per inner packet is 40 multiplied by the number of outer | Therefore the octet overhead per inner packet is 58 multiplied by the | |||
| packets required (fractional allowed). The overhead as a percentage | number of outer packets required (fractional allowed). The overhead | |||
| of inner packet size is a constant based on the Outer MTU size. | as a percentage of inner packet size is a constant based on the Outer | |||
| MTU size. | ||||
| OH = 40 / (Outer Payload Size / Inner Packet Size) | OH = 58 / (Outer Payload Size / Inner Packet Size) | |||
| OH % of Inner Packet Size = 100 * OH / Inner Packet Size | OH % of Inner Packet Size = 100 * OH / Inner Packet Size | |||
| OH % of Inner Packet Size = 4000 / Outer Payload Size | OH % of Inner Packet Size = 5800 / Outer Payload Size | |||
| Type IP-TFS IP-TFS IP-TFS | Type IP-TFS IP-TFS IP-TFS | |||
| MTU 576 1500 9000 | MTU 576 1500 9000 | |||
| PSize 536 1460 8960 | PSize 518 1442 8942 | |||
| ------------------------------- | ------------------------------- | |||
| 40 7.46% 2.74% 0.45% | 40 11.20% 4.02% 0.65% | |||
| 576 7.46% 2.74% 0.45% | 576 11.20% 4.02% 0.65% | |||
| 1500 7.46% 2.74% 0.45% | 1500 11.20% 4.02% 0.65% | |||
| 9000 7.46% 2.74% 0.45% | 9000 11.20% 4.02% 0.65% | |||
| Figure 4: IP-TFS Overhead as Percentage of Inner Packet Size | Figure 4: IP-TFS Overhead as Percentage of Inner Packet Size | |||
| C.1.2. ESP with Padding Overhead | C.1.2. ESP with Padding Overhead | |||
| The overhead per inner packet for constant-send-rate padded ESP | The overhead per inner packet for constant-send-rate padded ESP | |||
| (i.e., traditional IPsec TFC) is 36 octets plus any padding, unless | (i.e., traditional IPsec TFC) is 36 octets plus any padding, unless | |||
| fragmentation is required. | fragmentation is required. | |||
| When fragmentation of the inner packet is required to fit in the | When fragmentation of the inner packet is required to fit in the | |||
| outer IPsec packet, overhead is the number of outer packets required | outer IPsec packet, overhead is the number of outer packets required | |||
| to carry the fragmented inner packet times both the inner IP overhead | to carry the fragmented inner packet times both the inner IP overhead | |||
| (20) and the outer packet overhead (36) minus the initial inner IP | (20) and the outer packet overhead (54) minus the initial inner IP | |||
| overhead plus any required tail padding in the last encapsulation | overhead plus any required tail padding in the last encapsulation | |||
| packet. The required tail padding is the number of required packets | packet. The required tail padding is the number of required packets | |||
| times the difference of the Outer Payload Size and the IP Overhead | times the difference of the Outer Payload Size and the IP Overhead | |||
| minus the Inner Payload Size. So: | minus the Inner Payload Size. So: | |||
| Inner Payload Size = IP Packet Size - IP Overhead | Inner Payload Size = IP Packet Size - IP Overhead | |||
| Outer Payload Size = MTU - IPsec Overhead | Outer Payload Size = MTU - IPsec Overhead | |||
| Inner Payload Size | Inner Payload Size | |||
| NF0 = ---------------------------------- | NF0 = ---------------------------------- | |||
| skipping to change at page 26, line 12 ¶ | skipping to change at page 27, line 32 ¶ | |||
| OH = NF * (IPsec Overhead + Outer Payload Size) | OH = NF * (IPsec Overhead + Outer Payload Size) | |||
| - Inner Packet Size | - Inner Packet Size | |||
| C.2. Overhead Comparison | C.2. Overhead Comparison | |||
| The following tables collect the overhead values for some common L3 | The following tables collect the overhead values for some common L3 | |||
| MTU sizes in order to compare them. The first table is the number of | MTU sizes in order to compare them. The first table is the number of | |||
| octets of overhead for a given L3 MTU sized packet. The second table | octets of overhead for a given L3 MTU sized packet. The second table | |||
| is the percentage of overhead in the same MTU sized packet. | is the percentage of overhead in the same MTU sized packet. | |||
| XXX rerun these. | ||||
| Type ESP+Pad ESP+Pad ESP+Pad IP-TFS IP-TFS IP-TFS | Type ESP+Pad ESP+Pad ESP+Pad IP-TFS IP-TFS IP-TFS | |||
| L3 MTU 576 1500 9000 576 1500 9000 | L3 MTU 576 1500 9000 576 1500 9000 | |||
| PSize 540 1464 8964 536 1460 8960 | PSize 522 1446 8946 518 1442 8942 | |||
| ----------------------------------------------------------- | ----------------------------------------------------------- | |||
| 40 500 1424 8924 3.0 1.1 0.2 | 40 482 1406 8906 4.5 1.6 0.3 | |||
| 128 412 1336 8836 9.6 3.5 0.6 | 128 394 1318 8818 14.3 5.1 0.8 | |||
| 256 284 1208 8708 19.1 7.0 1.1 | 256 266 1190 8690 28.7 10.3 1.7 | |||
| 536 4 928 8428 40.0 14.7 2.4 | 518 4 928 8428 58.0 20.8 3.4 | |||
| 576 576 888 8388 43.0 15.8 2.6 | 576 576 870 8370 64.5 23.2 3.7 | |||
| 1460 268 4 7504 109.0 40.0 6.5 | 1442 286 4 7504 161.5 58.0 9.4 | |||
| 1500 228 1500 7464 111.9 41.1 6.7 | 1500 228 1500 7446 168.0 60.3 9.7 | |||
| 8960 1408 1540 4 668.7 245.5 40.0 | 8942 1426 1558 4 1001.2 359.7 58.0 | |||
| 9000 1368 1500 9000 671.6 246.6 40.2 | 9000 1368 1500 9000 1007.7 362.0 58.4 | |||
| Figure 5: Overhead comparison in octets | Figure 5: Overhead comparison in octets | |||
| Type ESP+Pad ESP+Pad ESP+Pad IP-TFS IP-TFS IP-TFS | Type ESP+Pad ESP+Pad ESP+Pad IP-TFS IP-TFS IP-TFS | |||
| MTU 576 1500 9000 576 1500 9000 | MTU 576 1500 9000 576 1500 9000 | |||
| PSize 540 1464 8964 536 1460 8960 | PSize 522 1446 8946 518 1442 8942 | |||
| ----------------------------------------------------------- | ----------------------------------------------------------- | |||
| 40 1250.0% 3560.0% 22310.0% 7.46% 2.74% 0.45% | 40 1205.0% 3515.0% 22265.0% 11.20% 4.02% 0.65% | |||
| 128 321.9% 1043.8% 6903.1% 7.46% 2.74% 0.45% | 128 307.8% 1029.7% 6889.1% 11.20% 4.02% 0.65% | |||
| 256 110.9% 471.9% 3401.6% 7.46% 2.74% 0.45% | 256 103.9% 464.8% 3394.5% 11.20% 4.02% 0.65% | |||
| 536 0.7% 173.1% 1572.4% 7.46% 2.74% 0.45% | 518 0.8% 179.2% 1627.0% 11.20% 4.02% 0.65% | |||
| 576 100.0% 154.2% 1456.2% 7.46% 2.74% 0.45% | 576 100.0% 151.0% 1453.1% 11.20% 4.02% 0.65% | |||
| 1460 18.4% 0.3% 514.0% 7.46% 2.74% 0.45% | 1442 19.8% 0.3% 520.4% 11.20% 4.02% 0.65% | |||
| 1500 15.2% 100.0% 497.6% 7.46% 2.74% 0.45% | 1500 15.2% 100.0% 496.4% 11.20% 4.02% 0.65% | |||
| 8960 15.7% 17.2% 0.0% 7.46% 2.74% 0.45% | 8942 15.9% 17.4% 0.0% 11.20% 4.02% 0.65% | |||
| 9000 15.2% 16.7% 100.0% 7.46% 2.74% 0.45% | 9000 15.2% 16.7% 100.0% 11.20% 4.02% 0.65% | |||
| Figure 6: Overhead as Percentage of Inner Packet Size | Figure 6: Overhead as Percentage of Inner Packet Size | |||
| C.3. Comparing Available Bandwidth | C.3. Comparing Available Bandwidth | |||
| Another way to compare the two solutions is to look at the amount of | Another way to compare the two solutions is to look at the amount of | |||
| available bandwidth each solution provides. The following sections | available bandwidth each solution provides. The following sections | |||
| consider and compare the percentage of available bandwidth. For the | consider and compare the percentage of available bandwidth. For the | |||
| sake of providing a well understood baseline normal (unencrypted) | sake of providing a well understood baseline normal (unencrypted) | |||
| Ethernet as well as normal ESP values are included. | Ethernet as well as normal ESP values are included. | |||
| skipping to change at page 27, line 15 ¶ | skipping to change at page 28, line 39 ¶ | |||
| C.3.1. Ethernet | C.3.1. Ethernet | |||
| In order to calculate the available bandwidth the per packet overhead | In order to calculate the available bandwidth the per packet overhead | |||
| is calculated first. The total overhead of Ethernet is 14+4 octets | is calculated first. The total overhead of Ethernet is 14+4 octets | |||
| of header and CRC plus an additional 20 octets of framing (preamble, | of header and CRC plus an additional 20 octets of framing (preamble, | |||
| start, and inter-packet gap) for a total of 38 octets. Additionally | start, and inter-packet gap) for a total of 38 octets. Additionally | |||
| the minimum payload is 46 octets. | the minimum payload is 46 octets. | |||
| Size E + P E + P E + P IPTFS IPTFS IPTFS Enet ESP | Size E + P E + P E + P IPTFS IPTFS IPTFS Enet ESP | |||
| MTU 590 1514 9014 590 1514 9014 any any | MTU 590 1514 9014 590 1514 9014 any any | |||
| OH 74 74 74 78 78 78 38 74 | OH 92 92 92 96 96 96 38 74 | |||
| ------------------------------------------------------------ | ------------------------------------------------------------ | |||
| 40 614 1538 9038 45 42 40 84 114 | 40 614 1538 9038 47 42 40 84 114 | |||
| 128 614 1538 9038 146 134 129 166 202 | 128 614 1538 9038 151 136 129 166 202 | |||
| 256 614 1538 9038 293 269 258 294 330 | 256 614 1538 9038 303 273 258 294 330 | |||
| 536 614 1538 9038 614 564 540 574 610 | 518 614 1538 9038 614 552 523 574 610 | |||
| 576 1228 1538 9038 659 606 581 614 650 | 576 1228 1538 9038 682 614 582 614 650 | |||
| 1460 1842 1538 9038 1672 1538 1472 1498 1534 | 1442 1842 1538 9038 1709 1538 1457 1498 1534 | |||
| 1500 1842 3076 9038 1718 1580 1513 1538 1574 | 1500 1842 3076 9038 1777 1599 1516 1538 1574 | |||
| 8960 11052 10766 9038 10263 9438 9038 8998 9034 | 8942 11052 10766 9038 10599 9537 9038 8998 9034 | |||
| 9000 11052 10766 18076 10309 9480 9078 9038 9074 | 9000 11052 10766 18076 10667 9599 9096 9038 9074 | |||
| Figure 7: L2 Octets Per Packet | Figure 7: L2 Octets Per Packet | |||
| Size E + P E + P E + P IPTFS IPTFS IPTFS Enet ESP | Size E + P E + P E + P IPTFS IPTFS IPTFS Enet ESP | |||
| MTU 590 1514 9014 590 1514 9014 any any | MTU 590 1514 9014 590 1514 9014 any any | |||
| OH 74 74 74 78 78 78 38 74 | OH 92 92 92 96 96 96 38 74 | |||
| -------------------------------------------------------------- | -------------------------------------------------------------- | |||
| 40 2.0M 0.8M 0.1M 27.3M 29.7M 31.0M 14.9M 11.0M | 40 2.0M 0.8M 0.1M 26.4M 29.3M 30.9M 14.9M 11.0M | |||
| 128 2.0M 0.8M 0.1M 8.5M 9.3M 9.7M 7.5M 6.2M | 128 2.0M 0.8M 0.1M 8.2M 9.2M 9.7M 7.5M 6.2M | |||
| 256 2.0M 0.8M 0.1M 4.3M 4.6M 4.8M 4.3M 3.8M | 256 2.0M 0.8M 0.1M 4.1M 4.6M 4.8M 4.3M 3.8M | |||
| 536 2.0M 0.8M 0.1M 2.0M 2.2M 2.3M 2.2M 2.0M | 518 2.0M 0.8M 0.1M 2.0M 2.3M 2.4M 2.2M 2.1M | |||
| 576 1.0M 0.8M 0.1M 1.9M 2.1M 2.2M 2.0M 1.9M | 576 1.0M 0.8M 0.1M 1.8M 2.0M 2.1M 2.0M 1.9M | |||
| 1460 678K 812K 138K 747K 812K 848K 834K 814K | 1442 678K 812K 138K 731K 812K 857K 844K 824K | |||
| 1500 678K 406K 138K 727K 791K 826K 812K 794K | 1500 678K 406K 138K 703K 781K 824K 812K 794K | |||
| 8960 113K 116K 138K 121K 132K 138K 138K 138K | 8942 113K 116K 138K 117K 131K 138K 139K 138K | |||
| 9000 113K 116K 69K 121K 131K 137K 138K 137K | 9000 113K 116K 69K 117K 130K 137K 138K 137K | |||
| Figure 8: Packets Per Second on 10G Ethernet | Figure 8: Packets Per Second on 10G Ethernet | |||
| Size E + P E + P E + P IPTFS IPTFS IPTFS Enet ESP | Size E + P E + P E + P IPTFS IPTFS IPTFS Enet ESP | |||
| MTU 590 1514 9014 590 1514 9014 any any | MTU 590 1514 9014 590 1514 9014 any any | |||
| OH 74 74 74 78 78 78 38 74 | OH 92 92 92 96 96 96 38 74 | |||
| ---------------------------------------------------------------------- | ---------------------------------------------------------------------- | |||
| 40 6.51% 2.60% 0.44% 87.30% 94.93% 99.14% 47.62% 35.09% | 40 6.51% 2.60% 0.44% 84.36% 93.76% 98.94% 47.62% 35.09% | |||
| 128 20.85% 8.32% 1.42% 87.30% 94.93% 99.14% 77.11% 63.37% | 128 20.85% 8.32% 1.42% 84.36% 93.76% 98.94% 77.11% 63.37% | |||
| 256 41.69% 16.64% 2.83% 87.30% 94.93% 99.14% 87.07% 77.58% | 256 41.69% 16.64% 2.83% 84.36% 93.76% 98.94% 87.07% 77.58% | |||
| 536 87.30% 34.85% 5.93% 87.30% 94.93% 99.14% 93.38% 87.87% | 518 84.36% 33.68% 5.73% 84.36% 93.76% 98.94% 93.17% 87.50% | |||
| 576 46.91% 37.45% 6.37% 87.30% 94.93% 99.14% 93.81% 88.62% | 576 46.91% 37.45% 6.37% 84.36% 93.76% 98.94% 93.81% 88.62% | |||
| 1460 79.26% 94.93% 16.15% 87.30% 94.93% 99.14% 97.46% 95.18% | 1442 78.28% 93.76% 15.95% 84.36% 93.76% 98.94% 97.43% 95.12% | |||
| 1500 81.43% 48.76% 16.60% 87.30% 94.93% 99.14% 97.53% 95.30% | 1500 81.43% 48.76% 16.60% 84.36% 93.76% 98.94% 97.53% 95.30% | |||
| 8960 81.07% 83.22% 99.14% 87.30% 94.93% 99.14% 99.58% 99.18% | 8942 80.91% 83.06% 98.94% 84.36% 93.76% 98.94% 99.58% 99.18% | |||
| 9000 81.43% 83.60% 49.79% 87.30% 94.93% 99.14% 99.58% 99.18% | 9000 81.43% 83.60% 49.79% 84.36% 93.76% 98.94% 99.58% 99.18% | |||
| Figure 9: Percentage of Bandwidth on 10G Ethernet | Figure 9: Percentage of Bandwidth on 10G Ethernet | |||
| A sometimes unexpected result of using IP-TFS (or any packet | A sometimes unexpected result of using IP-TFS (or any packet | |||
| aggregating tunnel) is that, for small to medium sized packets, the | aggregating tunnel) is that, for small to medium sized packets, the | |||
| available bandwidth is actually greater than native Ethernet. This | available bandwidth is actually greater than native Ethernet. This | |||
| is due to the reduction in Ethernet framing overhead. This increased | is due to the reduction in Ethernet framing overhead. This increased | |||
| bandwidth is paid for with an increase in latency. This latency is | bandwidth is paid for with an increase in latency. This latency is | |||
| the time to send the unrelated octets in the outer tunnel frame. The | the time to send the unrelated octets in the outer tunnel frame. The | |||
| following table illustrates the latency for some common values on a | following table illustrates the latency for some common values on a | |||
| 10G Ethernet link. The table also includes latency introduced by | 10G Ethernet link. The table also includes latency introduced by | |||
| padding if using ESP with padding. | padding if using ESP with padding. | |||
| ESP+Pad ESP+Pad IP-TFS IP-TFS | ESP+Pad ESP+Pad IP-TFS IP-TFS | |||
| 1500 9000 1500 9000 | 1500 9000 1500 9000 | |||
| ------------------------------------------ | ------------------------------------------ | |||
| 40 1.14 us 7.14 us 1.17 us 7.17 us | 40 1.12 us 7.12 us 1.17 us 7.17 us | |||
| 128 1.07 us 7.07 us 1.10 us 7.10 us | 128 1.05 us 7.05 us 1.10 us 7.10 us | |||
| 256 0.97 us 6.97 us 1.00 us 7.00 us | 256 0.95 us 6.95 us 1.00 us 7.00 us | |||
| 536 0.74 us 6.74 us 0.77 us 6.77 us | 518 0.74 us 6.74 us 0.79 us 6.79 us | |||
| 576 0.71 us 6.71 us 0.74 us 6.74 us | 576 0.70 us 6.70 us 0.74 us 6.74 us | |||
| 1460 0.00 us 6.00 us 0.04 us 6.04 us | 1442 0.00 us 6.00 us 0.05 us 6.05 us | |||
| 1500 1.20 us 5.97 us 0.00 us 6.00 us | 1500 1.20 us 5.96 us 0.00 us 6.00 us | |||
| Figure 10: Added Latency | Figure 10: Added Latency | |||
| Notice that the latency values are very similar between the two | Notice that the latency values are very similar between the two | |||
| solutions; however, whereas IP-TFS provides for constant high | solutions; however, whereas IP-TFS provides for constant high | |||
| bandwidth, in some cases even exceeding native Ethernet, ESP with | bandwidth, in some cases even exceeding native Ethernet, ESP with | |||
| padding often greatly reduces available bandwidth. | padding often greatly reduces available bandwidth. | |||
| Appendix D. Acknowledgements | Appendix D. Acknowledgements | |||
| We would like to thank Don Fedyk for help in reviewing and editing | We would like to thank Don Fedyk for help in reviewing and editing | |||
| this work. We would also like to thank Valery Smyslov for reviews | this work. We would also like to thank Sean Turner and Valery | |||
| and suggestions for improvements as well as Joseph Touch for the | Smyslov for reviews and many suggestions for improvements, as well as | |||
| transport area review and suggested improvements. | Joseph Touch for the transport area review and suggested | |||
| improvements. | ||||
| Appendix E. Contributors | Appendix E. Contributors | |||
| The following people made significant contributions to this document. | The following people made significant contributions to this document. | |||
| Lou Berger | Lou Berger | |||
| LabN Consulting, L.L.C. | LabN Consulting, L.L.C. | |||
| Email: lberger@labn.net | Email: lberger@labn.net | |||
| End of changes. 121 change blocks. | ||||
| 384 lines changed or deleted | 437 lines changed or added | |||
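
The overhead arithmetic that changed between -06 and -07 in Appendix C.1 can be checked with a short sketch. The header sizes (20, 8, 2, 8, 16, plus 4 for the non-congestion-control AGGFRAG_PAYLOAD header) are taken from the -07 text. The function names and the `overhead` parameter are hypothetical, introduced only for illustration. The per-inner-packet octet formula is inferred from the Figure 5 values and is not a definitive reading of the draft.

```python
# Per-packet header sizes in octets, as listed in Appendix C.1 of -07.
IP, ESP_HEADER, ESP_FOOTER, IV, ICV = 20, 8, 2, 8, 16
AGGFRAG_HEADER = 4  # non-congestion-control AGGFRAG_PAYLOAD header

ESP_OVERHEAD = IP + ESP_HEADER + ESP_FOOTER + IV + ICV  # 54 octets
IPTFS_OVERHEAD = ESP_OVERHEAD + AGGFRAG_HEADER          # 58 octets


def iptfs_overhead_percent(mtu: int, overhead: int = IPTFS_OVERHEAD) -> float:
    """Overhead as a percentage of inner packet size.

    Follows "OH % = 100 * overhead / Outer Payload Size", where the
    outer payload size is the MTU minus the per-packet overhead.
    """
    outer_payload = mtu - overhead
    return 100.0 * overhead / outer_payload


def iptfs_overhead_octets(inner_size: int, mtu: int,
                          overhead: int = IPTFS_OVERHEAD) -> float:
    """Octets of overhead attributed to one inner packet (fractional).

    Assumption: the inner packet is charged its proportional share of
    each outer packet's overhead, which matches the Figure 5 values.
    """
    return overhead * inner_size / (mtu - overhead)


if __name__ == "__main__":
    for mtu in (576, 1500, 9000):
        print(mtu, f"{iptfs_overhead_percent(mtu):.2f}%")
```

Passing `overhead=40` gives the -06 column of Figure 4 (for example, 7.46% at an MTU of 576), which makes the effect of the revised 58-octet figure easy to compare.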
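
The bandwidth figures in Appendix C.3.1 (Figures 7 through 9) can be approximated in the same way. The Ethernet constants (38 octets of overhead, 46-octet minimum payload) and the 58-octet IP-TFS overhead come from the -07 text. The formulas below are reconstructed from the table values rather than quoted from the draft, and all function names are hypothetical.

```python
ETHERNET_OVERHEAD = 38     # 14 header + 4 CRC + 20 framing octets
ETHERNET_MIN_PAYLOAD = 46  # minimum Ethernet payload in octets
LINK_BPS = 10_000_000_000  # 10G Ethernet
IPTFS_OVERHEAD = 58        # IP + ESP + AGGFRAG overhead from C.1 (-07)


def ethernet_l2_octets(size: int) -> int:
    """L2 octets used to send one packet natively over Ethernet."""
    return max(size, ETHERNET_MIN_PAYLOAD) + ETHERNET_OVERHEAD


def iptfs_l2_octets(size: int, l3_mtu: int,
                    overhead: int = IPTFS_OVERHEAD) -> float:
    """L2 octets attributed to one inner packet inside an IP-TFS tunnel.

    Assumption: outer packets are always full, so each inner octet
    costs (l3_mtu + Ethernet overhead) / (l3_mtu - tunnel overhead).
    """
    frame = l3_mtu + ETHERNET_OVERHEAD
    payload = l3_mtu - overhead
    return size * frame / payload


def packets_per_second(l2_octets: float) -> float:
    """Inner packets per second on the 10G link."""
    return LINK_BPS / (8 * l2_octets)


def bandwidth_percent(size: int, l2_octets: float) -> float:
    """Share of link bandwidth that carries inner packet octets."""
    return 100.0 * size / l2_octets


if __name__ == "__main__":
    for size in (40, 576, 1500):
        tfs = iptfs_l2_octets(size, 1500)
        enet = ethernet_l2_octets(size)
        print(size, f"{bandwidth_percent(size, tfs):.2f}%",
              f"{bandwidth_percent(size, enet):.2f}%")
```

Under these assumptions the IP-TFS percentage does not depend on the inner packet size, which is why each IP-TFS column in Figure 9 is constant. The octet counts in Figure 7 appear to be truncated rather than rounded.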