Network Working Group                                           C. Hopps
Internet-Draft                                   LabN Consulting, L.L.C.
Intended status: Standards Track                      September 30, 2020
Expires: April 3, 2021

                        IP Traffic Flow Security
                       draft-ietf-ipsecme-iptfs-02

Abstract

   This document describes a mechanism to enhance IPsec traffic flow
   security by adding traffic flow confidentiality to encrypted IP
   encapsulated traffic. Traffic flow confidentiality is provided by
   obscuring the size and frequency of IP traffic using a fixed-sized,
   constant-send-rate IPsec tunnel. The solution allows for congestion
   control as well.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 3, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Terminology & Concepts
   2. The IP-TFS Tunnel
      2.1. Tunnel Content
      2.2. IPTFS_PROTOCOL Payload Content
         2.2.1. Data Blocks
         2.2.2. No Implicit End Padding Required
         2.2.3. Fragmentation, Sequence Numbers and All-Pad Payloads
         2.2.4. Empty Payload
         2.2.5. IP Header Value Mapping
      2.3. Exclusive SA Use
      2.4. Zero-Conf Receive-Side Operation On The SA.
      2.5. Modes of Operation
         2.5.1. Non-Congestion Controlled Mode
         2.5.2. Congestion Controlled Mode
   3. Congestion Information
      3.1. ECN Support
   4. Configuration
      4.1. Bandwidth
      4.2. Fixed Packet Size
      4.3. Congestion Control
   5. IKEv2
      5.1. USE_IPTFS Notification Message
   6. Packet and Data Formats
      6.1. IP-TFS Payload
         6.1.1. Non-Congestion Control IPTFS_PROTOCOL Payload Format
         6.1.2. Congestion Control IPTFS_PROTOCOL Payload Format
         6.1.3. Data Blocks
         6.1.4. IKEv2 USE_IPTFS Notification Message
   7. IANA Considerations
      7.1. IPTFS_PROTOCOL Type
      7.2. IPTFS_PROTOCOL Sub-Type Registry
      7.3. USE_IPTFS Notify Message Status Type
   8. Security Considerations
   9. References
      9.1. Normative References
      9.2. Informative References
   Appendix A. Example Of An Encapsulated IP Packet Flow
   Appendix B. A Send and Loss Event Rate Calculation
   Appendix C. Comparisons of IP-TFS
      C.1. Comparing Overhead
         C.1.1. IP-TFS Overhead
         C.1.2. ESP with Padding Overhead
      C.2. Overhead Comparison
      C.3. Comparing Available Bandwidth
         C.3.1. Ethernet
   Appendix D. Acknowledgements
   Appendix E. Contributors
   Author's Address

1. Introduction

   Traffic Analysis ([RFC4301], [AppCrypt]) is the act of extracting
   information about data being sent through a network. While one may
   directly obscure the data through the use of encryption [RFC4303],
   the traffic pattern itself exposes information due to variations in
   its shape and timing ([I-D.iab-wire-image], [AppCrypt]). Hiding the
   size and frequency of traffic is referred to as Traffic Flow
   Confidentiality (TFC) per [RFC4303].

   [RFC4303] provides for TFC by allowing padding to be added to
   encrypted IP packets and allowing for transmission of all-pad packets
   (indicated using protocol 59).
   This method has the major limitation
   that it can significantly under-utilize the available bandwidth.

   The IP-TFS solution provides for full TFC without the aforementioned
   bandwidth limitation. This is accomplished by using a
   constant-send-rate IPsec [RFC4303] tunnel with fixed-sized
   encapsulating packets; however, these fixed-sized packets can contain
   partial, whole or multiple IP packets to maximize the bandwidth of
   the tunnel.

   For a comparison of the overhead of IP-TFS with the TFC solution
   prescribed in [RFC4303], see Appendix C.

   Additionally, IP-TFS provides for dealing with network congestion
   [RFC2914]. This is important when the IP-TFS user is not in full
   control of the domain through which the IP-TFS tunnel path flows.

1.1. Terminology & Concepts

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119] [RFC8174] when, and only when, they appear in all capitals,
   as shown here.

   This document assumes familiarity with IP security concepts described
   in [RFC4301].

2. The IP-TFS Tunnel

   As mentioned in Section 1, IP-TFS utilizes an IPsec [RFC4303] tunnel
   (SA) as its transport. To provide for full TFC, fixed-sized
   encapsulating packets are sent at a constant rate on the tunnel.

   The primary input to the tunnel algorithm is the requested bandwidth
   of the tunnel. Two values are then required to provide for this
   bandwidth: the fixed size of the encapsulating packets, and the rate
   at which to send them.

   The fixed packet size may either be specified manually or can be
   determined through the use of Path MTU discovery [RFC1191] and
   [RFC8201].

   Given the encapsulating packet size and the requested tunnel
   bandwidth, the corresponding packet send rate can be calculated.
   The packet send rate is the requested bandwidth divided by the
   payload size of the encapsulating packet, with both quantities
   expressed in consistent units (for example, bits per second and
   bits), which yields packets per second.

   The egress of the IP-TFS tunnel MUST allow for and expect the ingress
   (sending) side of the IP-TFS tunnel to vary the size and rate of sent
   encapsulating packets, unless constrained by other policy.

2.1. Tunnel Content

   As previously mentioned, one issue with the TFC padding solution in
   [RFC4303] is the large amount of wasted bandwidth as only one IP
   packet can be sent per encapsulating packet. In order to maximize
   bandwidth, IP-TFS breaks this one-to-one association.

   IP-TFS aggregates as well as fragments the inner IP traffic flow into
   fixed-sized encapsulating IPsec tunnel packets. Padding is only
   added to the tunnel packets if there is no data available to be
   sent at the time of tunnel packet transmission, or if fragmentation
   has been disabled by the receiver.

   This is accomplished using a new Encapsulating Security Payload (ESP,
   [RFC4303]) type which is identified by the IP protocol number
   IPTFS_PROTOCOL (TBD1).

2.2. IPTFS_PROTOCOL Payload Content

   The IPTFS_PROTOCOL payload content defined in this document consists
   of a 4 or 16 octet header followed by a partial data block, a full
   data block, or multiple partial or full data blocks. The following
   diagram illustrates this IPTFS_PROTOCOL payload within the ESP
   packet. See Section 6.1 for the exact formats of the IPTFS_PROTOCOL
   payload.

    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
    . Outer Encapsulating Header ...                                .
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
    . ESP Header...                                                 .
    +---------------------------------------------------------------+
    |   ...            :           BlockOffset                      |
    +---------------------------------------------------------------+
    :                  [Optional Congestion Info]                   :
    +---------------------------------------------------------------+
    |       DataBlocks ...                                          ~
    ~                                                               ~
    ~                                                               |
    +---------------------------------------------------------------|
    . ESP Trailer...                                                .
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

               Figure 1: Layout of an IP-TFS IPsec Packet

   The "BlockOffset" value is either zero or some offset into or past
   the end of the "DataBlocks" data.

   If the "BlockOffset" value is zero it means that the "DataBlocks"
   data begins with a new data block.

   Conversely, if the "BlockOffset" value is non-zero it points to the
   start of the new data block, and the initial "DataBlocks" data
   belongs to a previous data block that is still being re-assembled.

   The "BlockOffset" can point past the end of the "DataBlocks" data
   which indicates that the next data block occurs in a subsequent
   encapsulating packet.

   Having the "BlockOffset" always point at the next available data
   block allows for recovering the next full inner packet in the
   presence of outer encapsulating packet loss.

   An example IP-TFS packet flow can be found in Appendix A.

2.2.1. Data Blocks

   +---------------------------------------------------------------+
   | Type  | rest of IPv4, IPv6 or pad.
   +--------

                Figure 2: Layout of IP-TFS data block

   A data block is defined by a 4-bit type code followed by the data
   block data. The type values have been carefully chosen to coincide
   with the IPv4/IPv6 version field values so that no per-data block
   type overhead is required to encapsulate an IP packet. Likewise, the
   length of the data block is extracted from the encapsulated IPv4 or
   IPv6 packet's length field.

2.2.2. No Implicit End Padding Required

   It's worth noting that since a data block type is identified by its
   first octet there is never a need for an implicit pad at the end of
   an encapsulating packet.
   Even when the start of a data block occurs
   near the end of an encapsulating packet such that there is no room
   for the length field of the encapsulated header to be included in the
   current encapsulating packet, the fact that the length comes at a
   known location and is guaranteed to be present is enough to fetch the
   length field from the subsequent encapsulating packet payload. Only
   when there is no data to encapsulate is end padding required, and
   then an explicit "Pad Data Block" would be used to identify the
   padding.

2.2.3. Fragmentation, Sequence Numbers and All-Pad Payloads

   In order for a receiver to be able to reassemble fragmented
   inner-packets, the sender MUST send the inner-packet fragments
   back-to-back in the logical IP-TFS packet stream (i.e., using
   consecutive ESP sequence numbers). However, the sender is allowed to
   insert "all-pad" IP-TFS packets (i.e., packets having payloads with a
   "BlockOffset" of zero and a single pad "DataBlock") in between the
   IP-TFS packets carrying the inner-packet fragment payloads. This
   possible interleaving of all-pad packets allows the sender to always
   be able to send an IP-TFS tunnel packet, regardless of the
   encapsulation computational requirements.

   When a receiver is reassembling an inner-packet, and it receives an
   "all-pad" IP-TFS tunnel packet, it increments the sequence number in
   which it expects the next inner-packet fragment to arrive.

2.2.4. Empty Payload

   In order to support reporting of congestion control information
   (described later) on a non-IP-TFS enabled SA, IP-TFS allows for the
   sending of an IP-TFS payload with no data blocks (i.e., the ESP
   payload length is equal to the IP-TFS header length). This special
   payload is called an empty payload.

2.2.5. IP Header Value Mapping

   [RFC4301] provides some direction on when and how to map various
   values from an inner IP header to the outer encapsulating header,
   namely the Don't-Fragment (DF) bit ([RFC0791] and [RFC8200]), the
   Differentiated Services (DS) field [RFC2474] and the Explicit
   Congestion Notification (ECN) field [RFC3168]. Unlike [RFC4301],
   IP-TFS may and often will be encapsulating more than one IP packet
   per ESP packet. To deal with this, these mappings are restricted
   further. In particular, IP-TFS never maps the inner DF bit as it is
   unrelated to the IP-TFS tunnel functionality; IP-TFS never IP
   fragments the inner packets and the inner packets will not affect the
   fragmentation of the outer encapsulation packets. Likewise, the ECN
   value need not be mapped as any congestion related to the
   constant-send-rate IP-TFS tunnel is unrelated (by design!) to the
   inner traffic flow. Finally, by default the DS field SHOULD NOT be
   copied, although an implementation MAY choose to allow for
   configuration to override this behavior. An implementation SHOULD
   also allow the DS value to be set by configuration.

2.3. Exclusive SA Use

   It is not the intention of this specification to allow for mixed use
   of an IP-TFS enabled SA. In other words, an SA that has IP-TFS
   enabled is exclusively for IP-TFS use and MUST NOT have non-IP-TFS
   payloads such as IP (IP protocol 4), TCP transport (IP protocol 6),
   or ESP pad packets (protocol 59) intermixed with non-empty IP-TFS (IP
   protocol TBD1) payloads. While it's possible to envision making the
   algorithm work in the presence of sequence number skips in the IP-TFS
   payload stream, the added complexity is not deemed worthwhile. Other
   IPsec uses can configure and use their own SAs.

2.4. Zero-Conf Receive-Side Operation On The SA.

   Receive-side operation of IP-TFS does not require any per-SA
   configuration on the receiver; as such, an IP-TFS implementation
   SHOULD support the option of switching to IP-TFS receive-side
   operation on receipt of the first IP-TFS payload.

2.5. Modes of Operation

   Just as with normal IPsec/ESP tunnels, IP-TFS tunnels are
   unidirectional. Bidirectional IP-TFS functionality is achieved by
   setting up 2 IP-TFS tunnels, one in each direction.

   An IP-TFS tunnel can operate in 2 modes, a non-congestion controlled
   mode and a congestion controlled mode.

2.5.1. Non-Congestion Controlled Mode

   In the non-congestion controlled mode IP-TFS sends fixed-sized
   packets at a constant rate. The packet send rate is constant and is
   not automatically adjusted regardless of any network congestion
   (e.g., packet loss).

   For similar reasons as given in [RFC7510] the non-congestion
   controlled mode should only be used where the user has full
   administrative control over the path the tunnel will take. This is
   required so the user can guarantee the bandwidth and also be sure
   not to worsen network congestion [RFC2914]. In this case packet loss
   should be reported to the administrator (e.g., via syslog, YANG
   notification, SNMP traps, etc.) so that any failures due to a lack of
   bandwidth can be corrected.

2.5.2. Congestion Controlled Mode

   With the congestion controlled mode, IP-TFS adapts to network
   congestion by lowering the packet send rate to accommodate the
   congestion, as well as raising the rate when congestion subsides.
   Since overhead is per packet, by allowing for maximal fixed-size
   packets and varying the send rate, transport overhead is minimized.

   The output of the congestion control algorithm will adjust the rate
   at which the ingress sends packets.
   While this document does not
   require a specific congestion control algorithm, best current
   practice RECOMMENDS that the algorithm conform to [RFC5348].
   Congestion control principles are documented in [RFC2914] as well.
   An example of an implementation of the [RFC5348] algorithm which
   matches the requirements of IP-TFS (i.e., designed for fixed-size
   packet and send rate varied based on congestion) is documented in
   [RFC4342].

   The required inputs for the TCP friendly rate control algorithm
   described in [RFC5348] are the receiver's loss event rate and the
   sender's estimated round-trip time (RTT). These values are provided
   by IP-TFS using the congestion information header fields described in
   Section 3. In particular these values are sufficient to implement
   the algorithm described in [RFC5348].

   The congestion information must be sent, by both the receiver and
   the sender, at least once per RTT. Prior to establishing an RTT the
   information SHOULD be sent constantly from the sender and the
   receiver so that an RTT estimate can be established. The lack of
   receiving this information over multiple consecutive RTT intervals
   should be considered a congestion event that causes the sender to
   adjust its sending rate lower. For example, [RFC4342] calls this the
   "no feedback timeout" and it is equal to 4 RTT intervals. When a
   "no feedback timeout" has occurred [RFC4342] halves the sending rate.

   An implementation could choose to always include the congestion
   information in its IP-TFS payload header if sending on an IP-TFS
   enabled SA. Since IP-TFS normally will operate with a large packet
   size, the congestion information should represent a small portion of
   the available tunnel bandwidth.
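
   The "no feedback timeout" behavior described above can be sketched
   as follows. This is a minimal illustration, not a definitive
   implementation: the "SenderState" type, its field names, the
   "min_rate_pps" floor, and the choice to restart the interval after a
   timeout are all assumptions made for the example. Only the 4 RTT
   interval and the halving of the rate come from the text.

```python
from dataclasses import dataclass

NO_FEEDBACK_RTTS = 4  # [RFC4342]: the "no feedback timeout" equals 4 RTT intervals


@dataclass
class SenderState:
    """Hypothetical sender-side state; all names are illustrative only."""
    send_rate_pps: float       # current packet send rate (packets per second)
    rtt_s: float               # current round-trip time estimate (seconds)
    last_feedback_s: float     # time the last congestion information arrived
    min_rate_pps: float = 1.0  # assumed floor so the tunnel never stops sending


def check_no_feedback(state: SenderState, now_s: float) -> bool:
    """Halve the send rate if no congestion information arrived for 4 RTTs.

    Returns True when a "no feedback timeout" fired.
    """
    if state.rtt_s <= 0:
        # No RTT estimate yet, so the timeout cannot be evaluated.
        return False
    if now_s - state.last_feedback_s < NO_FEEDBACK_RTTS * state.rtt_s:
        return False
    state.send_rate_pps = max(state.min_rate_pps, state.send_rate_pps / 2)
    # Assumption: restart the interval so the rate is halved once per timeout.
    state.last_feedback_s = now_s
    return True
```

   A sender would call "check_no_feedback" periodically, for example
   each time it is about to transmit an encapsulating packet, and would
   update "last_feedback_s" whenever congestion information is received.
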

   When choosing a congestion control algorithm (or a selection of
   algorithms), an implementation should remember that IP-TFS does not
   provide reliable delivery of IP traffic, and so per packet ACKs
   are not required and are not provided.

   It's worth noting that the variable send-rate of a congestion
   controlled IP-TFS tunnel is not private; however, this send-rate is
   being driven by network congestion, and as long as the encapsulated
   (inner) traffic flow shape and timing are not directly affecting the
   (outer) network congestion, the variations in the tunnel rate will
   not weaken the provided inner traffic flow confidentiality.

2.5.2.1. Circuit Breakers

   In addition to congestion control, implementations MAY choose to
   define and implement circuit breakers [RFC8084] as a recovery method
   of last resort. Enabling circuit breakers is also a reason a user
   may wish to enable congestion information reports even when using the
   non-congestion controlled mode of operation. The definition of
   circuit breakers is outside the scope of this document.

3. Congestion Information

   In order to support the congestion control mode, the sender needs to
   know the loss event rate and also be able to approximate the RTT
   ([RFC5348]). In order to obtain these values the receiver sends
   congestion control information on its SA back to the sender. Thus,
   in order to support congestion control the receiver must have a
   paired SA back to the sender (this is always the case when the tunnel
   was created using IKEv2). If the SA back to the sender is a
   non-IP-TFS enabled SA then an IPTFS_PROTOCOL empty payload (i.e.,
   header only) is used to convey the information.

   In order to calculate a loss event rate compatible with [RFC5348],
   the receiver needs to have a round-trip time estimate. Thus the
   sender communicates this estimate in the "RTT" header field.
   On startup this value will be zero as no RTT estimate is yet known.

   In order to allow the sender to calculate the "RTT" value, the
   receiver communicates the last sequence number it has seen to the
   sender in the "LastSeqNum" header field. In addition to the
   "LastSeqNum" value, the receiver sends an estimate of the amount of
   time between receiving the "LastSeqNum" packet and transmitting the
   "LastSeqNum" value back to the sender in the congestion information.
   It places this time estimate in the "Delay" header field along with
   the "LastSeqNum".

   The receiver also calculates, and communicates in the "LossEventRate"
   header field, the loss event rate for use by the sender. This is
   slightly different from [RFC4342] which periodically sends all the
   loss interval data back to the sender so that it can do the
   calculation. See Appendix B for a suggested way to calculate the
   loss event rate value. Initially this value will be zero (indicating
   no loss) until enough data has been collected by the receiver to
   update it.

3.1. ECN Support

   In addition to normal packet loss information, IP-TFS supports use
   of the ECN bits in the encapsulating IP header [RFC3168] for
   identifying congestion. If ECN use is enabled and a packet arrives
   at the egress endpoint with the Congestion Experienced (CE) value
   set, then the receiver considers that packet as being dropped,
   although it does not drop it. The receiver MUST set the E bit in any
   IPTFS_PROTOCOL payload header containing a "LossEventRate" value
   whose derivation took a CE value into account.

   As noted in [RFC3168] the ECN bits are not protected by IPsec and
   thus may constitute a covert channel. For this reason ECN use SHOULD
   NOT be enabled by default.

4. Configuration

   IP-TFS is meant to be deployable with a minimal amount of
   configuration.
   It should be possible to specify all IP-TFS specific configuration
   at the unidirectional tunnel ingress (sending) side. Non-IKEv2
   operation is intended to be supported, at least with local static
   configuration.

4.1. Bandwidth

   Bandwidth is a local configuration option. For non-congestion
   controlled mode the bandwidth SHOULD be configured. For congestion
   controlled mode one can configure the bandwidth or have no
   configuration and let congestion control discover the maximum
   bandwidth available. No standardized configuration method is
   required.

4.2. Fixed Packet Size

   The fixed packet size to be used for the tunnel encapsulation packets
   can be configured manually or can be automatically determined using
   Path MTU discovery (see [RFC1191] and [RFC8201]). No standardized
   configuration method is required.

4.3. Congestion Control

   Congestion control is a local configuration option. No standardized
   configuration method is required.

5. IKEv2

5.1. USE_IPTFS Notification Message

   When using IKEv2, a new "USE_IPTFS" Notification Message is used to
   enable operation of IP-TFS on a Child SA pair. The method used is
   similar to how USE_TRANSPORT_MODE is negotiated, as described in
   [RFC7296].

   To request IP-TFS operation on the Child SA pair, the initiator
   includes the USE_IPTFS notification in an SA payload requesting a new
   Child SA (either during the initial IKE_AUTH or during non-rekeying
   CREATE_CHILD_SA exchanges). If the request is accepted then the
   response MUST also include a notification of type USE_IPTFS. If the
   responder declines the request the Child SA will be established
   without IP-TFS enabled. If this is unacceptable to the initiator,
   the initiator MUST delete the Child SA.
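
   The negotiation outcome described above can be summarized in a short
   sketch. The function names, the "Outcome" type, and the
   "iptfs_required" policy flag are hypothetical and exist only for
   illustration; treating an unsolicited USE_IPTFS in a response as
   "not enabled" is likewise an assumption rather than something this
   document states. No real IKEv2 library API is implied.

```python
from enum import Enum


class Outcome(Enum):
    """Possible results for the initiator (names are illustrative)."""
    IPTFS_ENABLED = "Child SA established with IP-TFS"
    PLAIN_CHILD_SA = "Child SA established without IP-TFS"
    DELETE_CHILD_SA = "initiator deletes the Child SA"


def responder_includes_use_iptfs(request_has_use_iptfs: bool,
                                 responder_accepts: bool) -> bool:
    """The response carries USE_IPTFS only when the request is accepted."""
    return request_has_use_iptfs and responder_accepts


def initiator_outcome(requested_iptfs: bool,
                      response_has_use_iptfs: bool,
                      iptfs_required: bool) -> Outcome:
    """Decide what the initiator does once the Child SA exchange completes."""
    if requested_iptfs and response_has_use_iptfs:
        return Outcome.IPTFS_ENABLED
    if requested_iptfs and iptfs_required:
        # The responder declined and non-IP-TFS operation is unacceptable.
        return Outcome.DELETE_CHILD_SA
    # Either IP-TFS was never requested (assumption: an unsolicited
    # USE_IPTFS is not honored) or plain operation is acceptable.
    return Outcome.PLAIN_CHILD_SA
```

   Keeping the policy decision ("iptfs_required") separate from the
   protocol exchange mirrors the text: the responder's refusal still
   yields a working Child SA, and only local policy on the initiator
   turns that into a deletion.
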

   The USE_IPTFS notification MUST NOT be sent, and MUST be ignored,
   during a CREATE_CHILD_SA rekeying exchange because IP-TFS operation
   cannot be changed during rekeying.

   The USE_IPTFS notification contains a 1 octet payload of flags that
   specify any requirements from the sender of the message. If any
   requirement flags are not understood or cannot be supported by the
   receiver then the receiver should not enable IP-TFS mode (either by
   not responding with the USE_IPTFS notification, or in the case of the
   initiator, by deleting the Child SA if the now established non-IP-TFS
   operation is unacceptable).

   The notification type and payload flag values are defined in
   Section 6.1.4.

6. Packet and Data Formats

6.1. IP-TFS Payload

   An IP-TFS payload is identified by the IP protocol number
   IPTFS_PROTOCOL (TBD1). The first octet of this payload indicates the
   format of the remaining payload data.

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-
   |   Sub-type    | ...
   +-+-+-+-+-+-+-+-+-+-+-

   Sub-type:
      An 8 bit value indicating the payload format.

   This specification defines 2 payload sub-types. These payload
   formats are defined in the following sections.

6.1.1. Non-Congestion Control IPTFS_PROTOCOL Payload Format

   The non-congestion control IPTFS_PROTOCOL payload consists of a 4
   octet header followed by a variable amount of "DataBlocks" data as
   shown below.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Sub-Type (0) |   Reserved    |          BlockOffset          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       DataBlocks ...
   +-+-+-+-+-+-+-+-+-+-+-

   Sub-type:
      An octet indicating the payload format. For this non-congestion
      control format, the value is 0.

   Reserved:
      An octet set to 0 on generation, and ignored on receipt.

   BlockOffset:
      A 16 bit unsigned integer counting the number of octets of
      "DataBlocks" data before the start of a new data block.
      "BlockOffset" can count past the end of the "DataBlocks" data in
      which case all the "DataBlocks" data belongs to the previous data
      block being re-assembled. If the "BlockOffset" extends into
      subsequent packets it continues to only count subsequent
      "DataBlocks" data (i.e., it does not count the subsequent packets'
      non-"DataBlocks" octets).

   DataBlocks:
      Variable number of octets that begins with the start of a data
      block, or the continuation of a previous data block, followed by
      zero or more additional data blocks.

6.1.2. Congestion Control IPTFS_PROTOCOL Payload Format

   The congestion control IPTFS_PROTOCOL payload consists of a 16
   octet header followed by a variable amount of "DataBlocks" data as
   shown below.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Sub-type (1) |   Reserved  |E|          BlockOffset          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              RTT              |             Delay             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         LossEventRate                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          LastSeqNum                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       DataBlocks ...
   +-+-+-+-+-+-+-+-+-+-+-

   Sub-type:
      An octet indicating the payload format. For this congestion
      control format, the value is 1.

   Reserved:
      A 7 bit field set to 0 on generation, and ignored on receipt.

   E:
      A 1 bit value that, if set, indicates that Congestion Experienced
      (CE) ECN bits were received and used in deriving the reported
      "LossEventRate".

   BlockOffset:
      The same value as the non-congestion controlled payload format
      value.
614      RTT:
615         A 16 bit value specifying the sender's current round-trip time
616         estimate in milliseconds. The value MAY be zero prior to the
617         sender having calculated a round-trip time estimate. The value
618         SHOULD be set to zero on non-IP-TFS enabled SAs.

620      Delay:
621         A 16 bit value specifying the delay in milliseconds incurred
622         between the receiver receiving the "LastSeqNum" packet and the
623         sending of this acknowledgement of it.

625      LossEventRate:
626         A 32 bit value specifying the inverse of the current loss event
627         rate as calculated by the receiver. A value of zero indicates no
628         loss. Otherwise the loss event rate is "1/LossEventRate".

630      LastSeqNum:
631         A 32 bit value containing the lower 32 bits of the largest
632         sequence number last received. This is the latest in the sequence,
633         not necessarily the most recent (in the case of re-ordering of
634         packets it may be less recent). When determining the largest and 64
635         bit extended sequence numbers are in use, the upper 32 bits should
636         be used during the comparison.

638      DataBlocks:
639         Variable number of octets that begins with the start of a data
640         block, or the continuation of a previous data block, followed by
641         zero or more additional data blocks. For the special case of
642         sending congestion control information on a non-IP-TFS enabled SA
643         this value MUST be empty (i.e., be zero octets long).

645   6.1.3.  Data Blocks

647                           1                   2                   3
648       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
649      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
650      | Type  | IPv4, IPv6 or pad...
651      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

653      Type:
654         A 4 bit field where 0x0 identifies a pad data block, 0x4 indicates
655         an IPv4 data block, and 0x6 indicates an IPv6 data block.

657   6.1.3.1.  IPv4 Data Block

658                           1                   2                   3
659       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
660      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
661      |  0x4  |  IHL  | TypeOfService |          TotalLength          |
662      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
663      | Rest of the inner packet ...
664      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

666      These values are the actual values within the encapsulated IPv4
667      header. In other words, the start of this data block is the start of
668      the encapsulated IP packet.

670      Type:
671         A 4 bit value of 0x4 indicating IPv4 (i.e., first nibble of the
672         IPv4 packet).

674      TotalLength:
675         The 16 bit unsigned integer "Total Length" field of the IPv4 inner
676         packet.

678   6.1.3.2.  IPv6 Data Block

680                           1                   2                   3
681       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
682      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
683      |  0x6  | TrafficClass  |               FlowLabel               |
684      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
685      |         PayloadLength         | Rest of the inner packet ...
686      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

688      These values are the actual values within the encapsulated IPv6
689      header. In other words, the start of this data block is the start of
690      the encapsulated IP packet.

692      Type:
693         A 4 bit value of 0x6 indicating IPv6 (i.e., first nibble of the
694         IPv6 packet).

696      PayloadLength:
697         The 16 bit unsigned integer "Payload Length" field of the inner
698         IPv6 packet.

700   6.1.3.3.  Pad Data Block

701                           1                   2                   3
702       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
703      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
704      |  0x0  | Padding ...
705      +-+-+-+-+-+-+-+-+-+-+-

707      Type:
708         A 4 bit value of 0x0 indicating a padding data block.

710      Padding:
711         Extends to the end of the encapsulating packet.

713   6.1.4.  IKEv2 USE_IPTFS Notification Message

715      As discussed in Section 5.1, a notification message USE_IPTFS is used
716      to negotiate IP-TFS operation in IKEv2.

718      The USE_IPTFS Notification Message State Type is (TBD2).

720      The notification payload contains 1 octet of requirement flags.
721      There are currently 2 requirement flags defined. This may be revised
722      by later specifications.

724      +-+-+-+-+-+-+-+-+
725      |0|0|0|0|0|0|C|D|
726      +-+-+-+-+-+-+-+-+

728      0:
729         6 bits - reserved, MUST be zero on send, unless defined by later
730         specifications.

732      C:
733         Congestion Control bit. If set, then the sender is requiring that
734         congestion control information MUST be returned to it periodically
735         as defined in Section 3.

737      D:
738         Don't Fragment bit. If set, indicates that the sender of the notify
739         message does not support receiving packet fragments (i.e., inner
740         packets MUST be sent using a single "Data Block"). This value
741         only applies to what the sender is capable of receiving; the
742         sender MAY still send packet fragments unless similarly restricted
743         by the receiver in its USE_IPTFS notification.

745   7.  IANA Considerations

747   7.1.  IPTFS_PROTOCOL Type

749      This document requests a protocol number IPTFS_PROTOCOL be allocated
750      by IANA from the "Assigned Internet Protocol Numbers" registry for
751      identifying the IP-TFS payload.

753      Type:
754         TBD1

756      Description:
757         An IP-TFS payload.

759      Reference:
760         This document

762   7.2.  IPTFS_PROTOCOL Sub-Type Registry

764      This document requests that IANA create a registry called
765      "IPTFS_PROTOCOL Sub-Type Registry" under the "IPTFS_PROTOCOL
766      Parameters" IANA registries. The registration policy for this
767      registry is "Standards Action" ([RFC8126] and [RFC7120]).

769      Name:
770         IPTFS_PROTOCOL Sub-Type Registry

772      Description:
773         IPTFS_PROTOCOL Payload Formats.
775      Reference:
776         This document

778      The initial content for this registry is as follows:

780         Sub-Type  Name                           Reference
781         --------------------------------------------------------
782         0         Non-Congestion Control Format  This document
783         1         Congestion Control Format      This document
784         2-255     Reserved

786   7.3.  USE_IPTFS Notify Message Status Type

788      This document requests a status type USE_IPTFS be allocated from the
789      "IKEv2 Notify Message Types - Status Types" registry.

791      Value:
792         TBD2

794      Name:
795         USE_IPTFS

797      Reference:
798         This document

800   8.  Security Considerations

802      This document describes a mechanism to add Traffic Flow
803      Confidentiality to IP traffic. Use of this mechanism is expected to
804      increase the security of the traffic being transported. Other than
805      the additional security afforded by using this mechanism, IP-TFS
806      utilizes the security protocols [RFC4303] and [RFC7296] and so their
807      security considerations apply to IP-TFS as well.

809      As noted previously in Section 2.5.2, for TFC to be fully maintained
810      the encapsulated traffic flow should not affect network congestion
811      in a predictable way; if it would, use of the non-congestion
812      controlled mode should be considered instead.

814   9.  References

816   9.1.  Normative References

818      [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
819                 Requirement Levels", BCP 14, RFC 2119,
820                 DOI 10.17487/RFC2119, March 1997.

823      [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
824                 RFC 4303, DOI 10.17487/RFC4303, December 2005.

827      [RFC7296]  Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T.
828                 Kivinen, "Internet Key Exchange Protocol Version 2
829                 (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October
830                 2014.

832      [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
833                 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
834                 May 2017.

836   9.2.  Informative References

838      [AppCrypt]
839                 Schneier, B., "Applied Cryptography: Protocols,
840                 Algorithms, and Source Code in C", 11 2017.

842      [I-D.iab-wire-image]
843                 Trammell, B. and M. Kuehlewind, "The Wire Image of a
844                 Network Protocol", draft-iab-wire-image-01 (work in
845                 progress), November 2018.

847      [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
848                 DOI 10.17487/RFC0791, September 1981.

851      [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
852                 DOI 10.17487/RFC1191, November 1990.

855      [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
856                 "Definition of the Differentiated Services Field (DS
857                 Field) in the IPv4 and IPv6 Headers", RFC 2474,
858                 DOI 10.17487/RFC2474, December 1998.

861      [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
862                 RFC 2914, DOI 10.17487/RFC2914, September 2000.

865      [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
866                 of Explicit Congestion Notification (ECN) to IP",
867                 RFC 3168, DOI 10.17487/RFC3168, September 2001.

870      [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
871                 Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
872                 December 2005.

874      [RFC4342]  Floyd, S., Kohler, E., and J. Padhye, "Profile for
875                 Datagram Congestion Control Protocol (DCCP) Congestion
876                 Control ID 3: TCP-Friendly Rate Control (TFRC)", RFC 4342,
877                 DOI 10.17487/RFC4342, March 2006.

880      [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
881                 Friendly Rate Control (TFRC): Protocol Specification",
882                 RFC 5348, DOI 10.17487/RFC5348, September 2008.

885      [RFC7120]  Cotton, M., "Early IANA Allocation of Standards Track Code
886                 Points", BCP 100, RFC 7120, DOI 10.17487/RFC7120, January
887                 2014.

889      [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
890                 "Encapsulating MPLS in UDP", RFC 7510,
891                 DOI 10.17487/RFC7510, April 2015.
894      [RFC8084]  Fairhurst, G., "Network Transport Circuit Breakers",
895                 BCP 208, RFC 8084, DOI 10.17487/RFC8084, March 2017.

898      [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
899                 Writing an IANA Considerations Section in RFCs", BCP 26,
900                 RFC 8126, DOI 10.17487/RFC8126, June 2017.

903      [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
904                 (IPv6) Specification", STD 86, RFC 8200,
905                 DOI 10.17487/RFC8200, July 2017.

908      [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
909                 "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
910                 DOI 10.17487/RFC8201, July 2017.

913   Appendix A.  Example Of An Encapsulated IP Packet Flow

915      Below, an example inner IP packet flow within the encapsulating tunnel
916      packet stream is shown. Notice how encapsulated IP packets can start
917      and end anywhere, and more than one or less than one may occur in a
918      single encapsulating packet.

920      Offset: 0       Offset: 100     Offset: 2900    Offset: 1400
921      [ ESP1  (1500) ][ ESP2  (1500) ][ ESP3  (1500) ][ ESP4  (1500) ]
922      [--800--][--800--][60][-240-][--4000----------------------][pad]

924                    Figure 3: Inner and Outer Packet Flow

926      The encapsulated IP packet flow (lengths include IP header and
927      payload) is as follows: an 800 octet packet, an 800 octet packet, a
928      60 octet packet, a 240 octet packet, a 4000 octet packet.

930      The "BlockOffset" values in the 4 IP-TFS payload headers for this
931      packet flow would thus be: 0, 100, 2900, 1400 respectively. The
932      first encapsulating packet ESP1 has a zero "BlockOffset" which points
933      at the IP data block immediately following the IP-TFS header. The
934      following packet ESP2's "BlockOffset" points inward 100 octets to the
935      start of the 60 octet data block. The third encapsulating packet
936      ESP3 contains the middle portion of the 4000 octet data block so the
937      offset points past its end and into the fourth encapsulating packet.
938      The fourth packet ESP4's offset is 1400, pointing at the padding which
939      follows the completion of the continued 4000 octet packet.

941   Appendix B.  A Send and Loss Event Rate Calculation

943      The current best practice indicates that congestion control should be
944      done in a TCP friendly way. A TCP friendly congestion control
945      algorithm is described in [RFC5348]. For this IP-TFS use case (as
946      with [RFC4342]) the (fixed) packet size is used as the segment size
947      for the algorithm. The formula for the send rate is then as follows:

949                                    1
950         X_Pps = -----------------------------------------------
951                 R * (sqrt(2*p/3) + 12*sqrt(3*p/8)*p*(1+32*p^2))

953      Where "X_Pps" is the send rate in packets per second, "R" is the
954      round trip time estimate and "p" is the loss event rate (the inverse
955      of which is provided by the receiver).

957      The IP-TFS receiver, having the RTT estimate from the sender, MAY use
958      the same method as described in [RFC4342] to collect the loss
959      intervals and calculate the loss event rate value using the weighted
960      average as indicated. The receiver communicates the inverse of this
961      value back to the sender in the IPTFS_PROTOCOL payload header field
962      "LossEventRate".

964      The IP-TFS sender now has both the "R" and "p" values and can
965      calculate the correct sending rate ("X_Pps"). If following [RFC5348]
966      the sender SHOULD also use the slow start mechanism described therein
967      when the IP-TFS SA is first established.

969   Appendix C.  Comparisons of IP-TFS

971   C.1.  Comparing Overhead

973   C.1.1.  IP-TFS Overhead

975      The overhead of IP-TFS is 40 bytes per outer packet. Therefore the
976      octet overhead per inner packet is 40 multiplied by the number of
977      outer packets required (fractional allowed). The overhead as a
978      percentage of inner packet size is a constant based on the Outer MTU
         size.
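The per-inner-packet overhead described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 40 octet figure is taken from this section, the outer payload size is supplied by the caller, and the names "iptfs_overhead_octets" and "iptfs_overhead_percent" are hypothetical helpers that are not defined by this document.

```python
# Illustrative sketch only; function names are hypothetical.
IPTFS_OVERHEAD = 40  # octets of IP-TFS overhead per outer packet (from C.1.1)


def iptfs_overhead_octets(inner_size: int, outer_payload_size: int) -> float:
    """Octets of overhead attributed to one inner packet.

    An inner packet occupies a (possibly fractional) number of outer
    packets, and each outer packet carries a fixed 40 octets of overhead.
    """
    outer_packets_needed = inner_size / outer_payload_size
    return IPTFS_OVERHEAD * outer_packets_needed


def iptfs_overhead_percent(inner_size: int, outer_payload_size: int) -> float:
    """Overhead as a percentage of the inner packet size.

    The inner size cancels out, so the result depends only on the
    outer payload size.
    """
    return 100.0 * iptfs_overhead_octets(inner_size, outer_payload_size) / inner_size


if __name__ == "__main__":
    # Outer payload sizes for the 576, 1500 and 9000 octet MTU cases.
    for psize in (536, 1460, 8960):
        for inner in (40, 576, 1500, 9000):
            pct = iptfs_overhead_percent(inner, psize)
            print(f"PSize {psize:5d}  inner {inner:5d}  overhead {pct:.2f}%")
```

Because the inner packet size cancels out of the percentage, every inner size prints the same value for a given outer payload size, which is why the percentage is described as a constant.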
980      OH = 40 / (Outer Payload Size / Inner Packet Size)
981      OH % of Inner Packet Size = 100 * OH / Inner Packet Size
982      OH % of Inner Packet Size = 4000 / Outer Payload Size

983                  Type   IP-TFS  IP-TFS  IP-TFS
984                  MTU    576     1500    9000
985                  PSize  536     1460    8960
986                  -------------------------------
987                  40     7.46%   2.74%   0.45%
988                  576    7.46%   2.74%   0.45%
989                  1500   7.46%   2.74%   0.45%
990                  9000   7.46%   2.74%   0.45%

992         Figure 4: IP-TFS Overhead as Percentage of Inner Packet Size

994   C.1.2.  ESP with Padding Overhead

996      The overhead per inner packet for constant-send-rate padded ESP
997      (i.e., traditional IPsec TFC) is 36 octets plus any padding, unless
998      fragmentation is required.

1000     When fragmentation of the inner packet is required to fit in the
1001     outer IPsec packet, overhead is the number of outer packets required
1002     to carry the fragmented inner packet times both the inner IP overhead
1003     (20) and the outer packet overhead (36) minus the initial inner IP
1004     overhead plus any required tail padding in the last encapsulation
1005     packet. The required tail padding is the number of required packets
1006     times the difference of the Outer Payload Size and the IP Overhead
1007     minus the Inner Payload Size. So:

1009       Inner Payload Size = IP Packet Size - IP Overhead
1010       Outer Payload Size = MTU - IPsec Overhead

1012                 Inner Payload Size
1013       NF0 = ----------------------------------
1014              Outer Payload Size - IP Overhead

1016       NF = CEILING(NF0)

1018       OH = NF * (IP Overhead + IPsec Overhead)
1019            - IP Overhead
1020            + NF * (Outer Payload Size - IP Overhead)
1021            - Inner Payload Size

1023       OH = NF * (IPsec Overhead + Outer Payload Size)
1024            - (IP Overhead + Inner Payload Size)

1026       OH = NF * (IPsec Overhead + Outer Payload Size)
1027            - Inner Packet Size

1029  C.2.  Overhead Comparison

1031     The following tables collect the overhead values for some common L3
1032     MTU sizes in order to compare them. The first table is the number of
1033     octets of overhead for a given L3 MTU sized packet. The second table
1034     is the percentage of overhead in the same MTU sized packet.

1036       Type    ESP+Pad  ESP+Pad  ESP+Pad  IP-TFS  IP-TFS  IP-TFS
1037       L3 MTU  576      1500     9000     576     1500    9000
1038       PSize   540      1464     8964     536     1460    8960
1039       -----------------------------------------------------------
1040       40      500      1424     8924     3.0     1.1     0.2
1041       128     412      1336     8836     9.6     3.5     0.6
1042       256     284      1208     8708     19.1    7.0     1.1
1043       536     4        928      8428     40.0    14.7    2.4
1044       576     576      888      8388     43.0    15.8    2.6
1045       1460    268      4        7504     109.0   40.0    6.5
1046       1500    228      1500     7464     111.9   41.1    6.7
1047       8960    1408     1540     4        668.7   245.5   40.0
1048       9000    1368     1500     9000     671.6   246.6   40.2

1050                 Figure 5: Overhead comparison in octets

1052       Type    ESP+Pad   ESP+Pad   ESP+Pad   IP-TFS  IP-TFS  IP-TFS
1053       MTU     576       1500      9000      576     1500    9000
1054       PSize   540       1464      8964      536     1460    8960
1055       -----------------------------------------------------------
1056       40      1250.0%   3560.0%   22310.0%  7.46%   2.74%   0.45%
1057       128     321.9%    1043.8%   6903.1%   7.46%   2.74%   0.45%
1058       256     110.9%    471.9%    3401.6%   7.46%   2.74%   0.45%
1059       536     0.7%      173.1%    1572.4%   7.46%   2.74%   0.45%
1060       576     100.0%    154.2%    1456.2%   7.46%   2.74%   0.45%
1061       1460    18.4%     0.3%      514.0%    7.46%   2.74%   0.45%
1062       1500    15.2%     100.0%    497.6%    7.46%   2.74%   0.45%
1063       8960    15.7%     17.2%     0.0%      7.46%   2.74%   0.45%
1064       9000    15.2%     16.7%     100.0%    7.46%   2.74%   0.45%

1066          Figure 6: Overhead as Percentage of Inner Packet Size

1068  C.3.  Comparing Available Bandwidth

1070     Another way to compare the two solutions is to look at the amount of
1071     available bandwidth each solution provides. The following sections
1072     consider and compare the percentage of available bandwidth. For the
1073     sake of providing a well understood baseline, normal (unencrypted)
1074     Ethernet as well as normal ESP values are included.

1076  C.3.1.  Ethernet

1078     In order to calculate the available bandwidth the per packet overhead
1079     is calculated first. The total overhead of Ethernet is 14+4 octets
1080     of header and CRC plus an additional 20 octets of framing (preamble,
1081     start, and inter-packet gap) for a total of 38 octets. Additionally
1082     the minimum payload is 46 octets.

1084     Size  E + P  E + P  E + P  IPTFS  IPTFS  IPTFS  Enet   ESP
1085     MTU   590    1514   9014   590    1514   9014   any    any
1086     OH    74     74     74     78     78     78     38     74
1087     ------------------------------------------------------------
1088     40    614    1538   9038   45     42     40     84     114
1089     128   614    1538   9038   146    134    129    166    202
1090     256   614    1538   9038   293    269    258    294    330
1091     536   614    1538   9038   614    564    540    574    610
1092     576   1228   1538   9038   659    606    581    614    650
1093     1460  1842   1538   9038   1672   1538   1472   1498   1534
1094     1500  1842   3076   9038   1718   1580   1513   1538   1574
1095     8960  11052  10766  9038   10263  9438   9038   8998   9034
1096     9000  11052  10766  18076  10309  9480   9078   9038   9074

1098                    Figure 7: L2 Octets Per Packet

1100     Size  E + P  E + P  E + P  IPTFS  IPTFS  IPTFS  Enet   ESP
1101     MTU   590    1514   9014   590    1514   9014   any    any
1102     OH    74     74     74     78     78     78     38     74
1103     --------------------------------------------------------------
1104     40    2.0M   0.8M   0.1M   27.3M  29.7M  31.0M  14.9M  11.0M
1105     128   2.0M   0.8M   0.1M   8.5M   9.3M   9.7M   7.5M   6.2M
1106     256   2.0M   0.8M   0.1M   4.3M   4.6M   4.8M   4.3M   3.8M
1107     536   2.0M   0.8M   0.1M   2.0M   2.2M   2.3M   2.2M   2.0M
1108     576   1.0M   0.8M   0.1M   1.9M   2.1M   2.2M   2.0M   1.9M
1109     1460  678K   812K   138K   747K   812K   848K   834K   814K
1110     1500  678K   406K   138K   727K   791K   826K   812K   794K
1111     8960  113K   116K   138K   121K   132K   138K   138K   138K
1112     9000  113K   116K   69K    121K   131K   137K   138K   137K

1114              Figure 8: Packets Per Second on 10G Ethernet

1116     Size  E + P   E + P   E + P   IPTFS   IPTFS   IPTFS   Enet    ESP
1117     MTU   590     1514    9014    590     1514    9014    any     any
1118     OH    74      74      74      78      78      78      38      74
1119     ----------------------------------------------------------------------
1120     40    6.51%   2.60%   0.44%   87.30%  94.93%  99.14%  47.62%  35.09%
1121     128   20.85%  8.32%   1.42%   87.30%  94.93%  99.14%  77.11%  63.37%
1122     256   41.69%  16.64%  2.83%   87.30%  94.93%  99.14%  87.07%  77.58%
1123     536   87.30%  34.85%  5.93%   87.30%  94.93%  99.14%  93.38%  87.87%
1124     576   46.91%  37.45%  6.37%   87.30%  94.93%  99.14%  93.81%  88.62%
1125     1460  79.26%  94.93%  16.15%  87.30%  94.93%  99.14%  97.46%  95.18%
1126     1500  81.43%  48.76%  16.60%  87.30%  94.93%  99.14%  97.53%  95.30%
1127     8960  81.07%  83.22%  99.14%  87.30%  94.93%  99.14%  99.58%  99.18%
1128     9000  81.43%  83.60%  49.79%  87.30%  94.93%  99.14%  99.58%  99.18%

1130            Figure 9: Percentage of Bandwidth on 10G Ethernet

1132     A sometimes unexpected result of using IP-TFS (or any packet
1133     aggregating tunnel) is that, for small to medium sized packets, the
1134     available bandwidth is actually greater than native Ethernet. This
1135     is due to the reduction in Ethernet framing overhead. This increased
1136     bandwidth is paid for with an increase in latency. This latency is
1137     the time to send the unrelated octets in the outer tunnel frame. The
1138     following table illustrates the latency for some common values on a
1139     10G Ethernet link. The table also includes latency introduced by
1140     padding if using ESP with padding.

1142           ESP+Pad  ESP+Pad  IP-TFS   IP-TFS
1143           1500     9000     1500     9000

1145     ------------------------------------------
1146     40    1.14 us  7.14 us  1.17 us  7.17 us
1147     128   1.07 us  7.07 us  1.10 us  7.10 us
1148     256   0.97 us  6.97 us  1.00 us  7.00 us
1149     536   0.74 us  6.74 us  0.77 us  6.77 us
1150     576   0.71 us  6.71 us  0.74 us  6.74 us
1151     1460  0.00 us  6.00 us  0.04 us  6.04 us
1152     1500  1.20 us  5.97 us  0.00 us  6.00 us

1154                       Figure 10: Added Latency

1156     Notice that the latency values are very similar between the two
1157     solutions; however, whereas IP-TFS provides for constant high
1158     bandwidth, in some cases even exceeding native Ethernet, ESP with
1159     padding often greatly reduces available bandwidth.

1161  Appendix D.  Acknowledgements

1163     We would like to thank Don Fedyk for help in reviewing and editing
1164     this work.

1166  Appendix E.  Contributors

1168     The following people made significant contributions to this document.

1170     Lou Berger
1171     LabN Consulting, L.L.C.
1173     Email: lberger@labn.net

1175  Author's Address

1177     Christian Hopps
1178     LabN Consulting, L.L.C.

1180     Email: chopps@chopps.org