idnits 2.17.1 draft-ietf-ipsecme-iptfs-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 1077 has weird spacing: '...4 any any...' == Line 1093 has weird spacing: '...4 any any...' -- The document date (March 2, 2020) is 1517 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: '--800--' is mentioned on line 913, but not defined -- Looks like a reference, but probably isn't: '60' on line 913 == Missing Reference: '-240-' is mentioned on line 913, but not defined == Missing Reference: '--4000----------------------' is mentioned on line 913, but not defined Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group C. Hopps 3 Internet-Draft LabN Consulting, L.L.C. 
4 Intended status: Standards Track March 2, 2020 5 Expires: September 3, 2020 7 IP Traffic Flow Security 8 draft-ietf-ipsecme-iptfs-01 10 Abstract 12 This document describes a mechanism to enhance IPsec traffic flow 13 security by adding traffic flow confidentiality to encrypted IP 14 encapsulated traffic. Traffic flow confidentiality is provided by 15 obscuring the size and frequency of IP traffic using a fixed-sized, 16 constant-send-rate IPsec tunnel. The solution allows for congestion 17 control as well. 19 Status of This Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF). Note that other groups may also distribute 26 working documents as Internet-Drafts. The list of current Internet- 27 Drafts is at https://datatracker.ietf.org/drafts/current/. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 This Internet-Draft will expire on September 3, 2020. 36 Copyright Notice 38 Copyright (c) 2020 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (https://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 47 include Simplified BSD License text as described in Section 4.e of 48 the Trust Legal Provisions and are provided without warranty as 49 described in the Simplified BSD License. 51 Table of Contents 53 1. Introduction . . . . . . . . . . . . . . . . . 
. . . . . . . 3 54 1.1. Terminology & Concepts . . . . . . . . . . . . . . . . . 3 55 2. The IP-TFS Tunnel . . . . . . . . . . . . . . . . . . . . . . 4 56 2.1. Tunnel Content . . . . . . . . . . . . . . . . . . . . . 4 57 2.2. IPTFS_PROTOCOL Payload Content . . . . . . . . . . . . . 4 58 2.2.1. Data Blocks . . . . . . . . . . . . . . . . . . . . . 5 59 2.2.2. No Implicit End Padding Required . . . . . . . . . . 6 60 2.2.3. Empty Payload . . . . . . . . . . . . . . . . . . . . 6 61 2.2.4. IP Header Value Mapping . . . . . . . . . . . . . . . 6 62 2.3. Exclusive SA Use . . . . . . . . . . . . . . . . . . . . 7 63 2.4. Initiating IP-TFS Operation On The SA. . . . . . . . . . 7 64 2.5. Modes of Operation . . . . . . . . . . . . . . . . . . . 7 65 2.5.1. Non-Congestion Controlled Mode . . . . . . . . . . . 7 66 2.5.2. Congestion Controlled Mode . . . . . . . . . . . . . 8 67 3. Congestion Information . . . . . . . . . . . . . . . . . . . 9 68 3.1. ECN Support . . . . . . . . . . . . . . . . . . . . . . . 10 69 4. Configuration . . . . . . . . . . . . . . . . . . . . . . . . 10 70 4.1. Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . 10 71 4.2. Fixed Packet Size . . . . . . . . . . . . . . . . . . . . 10 72 4.3. Congestion Control . . . . . . . . . . . . . . . . . . . 11 73 5. IKEv2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 74 5.1. USE_IPTFS Notification Message . . . . . . . . . . . . . 11 75 6. Packet and Data Formats . . . . . . . . . . . . . . . . . . . 11 76 6.1. IP-TFS Payload . . . . . . . . . . . . . . . . . . . . . 11 77 6.1.1. Non-Congestion Control IPTFS_PROTOCOL Payload Format 12 78 6.1.2. Congestion Control IPTFS_PROTOCOL Payload Format . . 13 79 6.1.3. Data Blocks . . . . . . . . . . . . . . . . . . . . . 14 80 6.1.4. IKEv2 USE_IPTFS Notification Message . . . . . . . . 15 81 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 82 7.1. IPTFS_PROTOCOL Type . . . . . . . . . . . . . . . . . . . 16 83 7.2. 
IPTFS_PROTOCOL Sub-Type Registry . . . . . . . . . . . . 16 84 7.3. USE_IPTFS Notify Message Status Type . . . . . . . . . . 17 85 8. Security Considerations . . . . . . . . . . . . . . . . . . . 17 86 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 87 9.1. Normative References . . . . . . . . . . . . . . . . . . 18 88 9.2. Informative References . . . . . . . . . . . . . . . . . 18 89 Appendix A. Example Of An Encapsulated IP Packet Flow . . . . . 20 90 Appendix B. A Send and Loss Event Rate Calculation . . . . . . . 20 91 Appendix C. Comparisons of IP-TFS . . . . . . . . . . . . . . . 21 92 C.1. Comparing Overhead . . . . . . . . . . . . . . . . . . . 21 93 C.1.1. IP-TFS Overhead . . . . . . . . . . . . . . . . . . . 21 94 C.1.2. ESP with Padding Overhead . . . . . . . . . . . . . . 21 95 C.2. Overhead Comparison . . . . . . . . . . . . . . . . . . . 22 96 C.3. Comparing Available Bandwidth . . . . . . . . . . . . . . 23 97 C.3.1. Ethernet . . . . . . . . . . . . . . . . . . . . . . 23 98 Appendix D. Acknowledgements . . . . . . . . . . . . . . . . . . 25 99 Appendix E. Contributors . . . . . . . . . . . . . . . . . . . . 25 100 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 25 102 1. Introduction 104 Traffic Analysis ([RFC4301], [AppCrypt]) is the act of extracting 105 information about data being sent through a network. While one may 106 directly obscure the data through the use of encryption [RFC4303], 107 the traffic pattern itself exposes information due to variations in 108 its shape and timing ([I-D.iab-wire-image], [AppCrypt]). Hiding the 109 size and frequency of traffic is referred to as Traffic Flow 110 Confidentiality (TFC) per [RFC4303]. 112 [RFC4303] provides for TFC by allowing padding to be added to 113 encrypted IP packets and allowing for transmission of all-pad packets 114 (indicated using protocol 59). This method has the major limitation 115 that it can significantly under-utilize the available bandwidth. 
117 The IP-TFS solution provides for full TFC without the aforementioned 118 bandwidth limitation. To do this, we use a constant-send-rate IPsec 119 [RFC4303] tunnel with fixed-sized encapsulating packets; however, 120 these fixed-sized packets can contain partial, whole or multiple IP 121 packets to maximize the bandwidth of the tunnel. 123 For a comparison of the overhead of IP-TFS with the RFC4303 124 prescribed TFC solution, see Appendix C. 126 Additionally, IP-TFS provides for dealing with network congestion 127 [RFC2914]. This is important for when the IP-TFS user is not in full 128 control of the domain through which the IP-TFS tunnel path flows. 130 1.1. Terminology & Concepts 132 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 133 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 134 "OPTIONAL" in this document are to be interpreted as described in 135 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, 136 as shown here. 138 This document assumes familiarity with IP security concepts described 139 in [RFC4301]. 141 2. The IP-TFS Tunnel 143 As mentioned in Section 1, IP-TFS utilizes an IPsec [RFC4303] tunnel 144 (SA) as its transport. To provide for full TFC we send fixed-sized 145 encapsulating packets at a constant rate on the tunnel. 147 The primary input to the tunnel algorithm is the requested bandwidth 148 of the tunnel. Two values are then required to provide for this 149 bandwidth: the fixed size of the encapsulating packets, and the rate at 150 which to send them. 152 The fixed packet size may either be specified manually or can be 153 determined through the use of Path MTU discovery [RFC1191] and 154 [RFC8201]. 156 Given the encapsulating packet size and the requested tunnel 157 bandwidth, the corresponding packet send rate can be calculated. The 158 packet send rate is the requested bandwidth divided by the payload 159 size of the encapsulating packet. 
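As a rough illustration of the send rate calculation described above, the following Python sketch divides a requested bandwidth by the payload size of the encapsulating packet. The function name, the choice of units (bits per second and octets) and the sample values are assumptions made for this example only; this document does not prescribe them.

```python
def packet_send_rate(bandwidth_bps: float, payload_octets: int) -> float:
    """Return the packets-per-second rate for a requested tunnel bandwidth.

    bandwidth_bps  -- requested tunnel bandwidth in bits per second (assumed unit)
    payload_octets -- payload size of one encapsulating packet in octets
    """
    if payload_octets <= 0:
        raise ValueError("payload size must be a positive number of octets")
    # Rate = requested bandwidth / payload size, with octets converted to bits.
    return bandwidth_bps / (payload_octets * 8)


# Hypothetical example: a 10 Mbit/s tunnel carrying 1250 octets per packet.
rate_pps = packet_send_rate(10_000_000, 1250)
send_interval_seconds = 1.0 / rate_pps
```

The reciprocal of the rate gives the fixed interval between transmissions, which is what a constant-send-rate ingress would typically feed to its timer.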
161 The egress of the IP-TFS tunnel MUST allow for and expect the ingress 162 (sending) side of the IP-TFS tunnel to vary the size and rate of sent 163 encapsulating packets, unless constrained by other policy. 165 2.1. Tunnel Content 167 As previously mentioned, one issue with the TFC padding solution in 168 [RFC4303] is the large amount of wasted bandwidth as only one IP 169 packet can be sent per encapsulating packet. In order to maximize 170 bandwidth IP-TFS breaks this one-to-one association. 172 With IP-TFS we aggregate as well as fragment the inner IP traffic 173 flow into fixed-sized encapsulating IPsec tunnel packets. We only 174 pad the tunnel packets if there is no data available to be sent at 175 the time of tunnel packet transmission, or if fragmentation has been 176 disabled by the receiver. 178 In order to do this we use a new Encapsulating Security Payload (ESP, 179 [RFC4303]) type which is identified by the IP protocol number 180 IPTFS_PROTOCOL (TBD1). 182 2.2. IPTFS_PROTOCOL Payload Content 184 The IPTFS_PROTOCOL payload content defined in this document is 185 comprised of a 4 or 16 octet header followed by either a partial, a 186 full or multiple partial or full data blocks. The following diagram 187 illustrates this IPTFS_PROTOCOL payload within the ESP packet. See 188 Section 6.1 for the exact formats of the IPTFS_PROTOCOL payload. 190 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 . Outer Encapsulating Header ... . 192 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 . ESP Header... . 194 +---------------------------------------------------------------+ 195 | ... : BlockOffset | 196 +---------------------------------------------------------------+ 197 : [Optional Congestion Info] : 198 +---------------------------------------------------------------+ 199 | DataBlocks ... ~ 200 ~ ~ 201 ~ | 202 +---------------------------------------------------------------| 203 . ESP Trailer... . 204 . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . 206 Figure 1: Layout of an IP-TFS IPsec Packet 208 The "BlockOffset" value is either zero or some offset into or past 209 the end of the "DataBlocks" data. 211 If the "BlockOffset" value is zero it means that the "DataBlocks" 212 data begins with a new data block. 214 Conversely, if the "BlockOffset" value is non-zero it points to the 215 start of the new data block, and the initial "DataBlocks" data 216 belongs to a previous data block that is still being re-assembled. 218 The "BlockOffset" can point past the end of the "DataBlocks" data 219 which indicates that the next data block occurs in a subsequent 220 encapsulating packet. 222 Having the "BlockOffset" always point at the next available data 223 block allows for quick recovery with minimal inner packet loss in the 224 presence of outer encapsulating packet loss. 226 An example IP-TFS packet flow can be found in Appendix A. 228 2.2.1. Data Blocks 230 +---------------------------------------------------------------+ 231 | Type | rest of IPv4, IPv6 or pad. 232 +-------- 234 Figure 2: Layout of IP-TFS data block 236 A data block is defined by a 4-bit type code followed by the data 237 block data. The type values have been carefully chosen to coincide 238 with the IPv4/IPv6 version field values so that no per-data block 239 type overhead is required to encapsulate an IP packet. Likewise, the 240 length of the data block is extracted from the encapsulated IPv4 or 241 IPv6 packet's length field. 243 2.2.2. No Implicit End Padding Required 245 It's worth noting that since a data block type is identified by its 246 first octet there is never a need for an implicit pad at the end of 247 an encapsulating packet. 
Even when the start of a data block occurs 248 near the end of an encapsulating packet such that there is no room for 249 the length field of the encapsulated header to be included in the 250 current encapsulating packet, the fact that the length comes at a 251 known location and is guaranteed to be present is enough to fetch the 252 length field from the subsequent encapsulating packet payload. Only 253 when there is no data to encapsulate is end padding required, and 254 then an explicit "Pad Data Block" would be used to identify the 255 padding. 257 2.2.3. Empty Payload 259 In order to support reporting of congestion control information 260 (described later) on a non-IP-TFS enabled SA, IP-TFS allows for the 261 sending of an IP-TFS payload with no data blocks (i.e., the ESP 262 payload length is equal to the IP-TFS header length). This special 263 payload is called an empty payload. 265 2.2.4. IP Header Value Mapping 267 [RFC4301] provides some direction on when and how to map various 268 values from an inner IP header to the outer encapsulating header, 269 namely the Don't-Fragment (DF) bit ([RFC0791] and [RFC8200]), the 270 Differentiated Services (DS) field [RFC2474] and the Explicit 271 Congestion Notification (ECN) field [RFC3168]. Unlike [RFC4301], with 272 IP-TFS we may and often will be encapsulating more than 1 IP packet 273 per ESP packet. To deal with this we further restrict these 274 mappings. In particular we never map the inner DF bit as it is 275 unrelated to the IP-TFS tunnel functionality; we never IP fragment 276 the inner packets and the inner packets will not affect the 277 fragmentation of the outer encapsulation packets. Likewise, the ECN 278 value need not be mapped as any congestion related to the constant- 279 send-rate IP-TFS tunnel is unrelated (by design!) to the inner 280 traffic flow. 
Finally, by default the DS field SHOULD NOT be copied 281 although an implementation MAY choose to allow for configuration to 282 override this behavior. An implementation SHOULD also allow the DS 283 value to be set by configuration. 285 2.3. Exclusive SA Use 287 It is not the intention of this specification to allow for mixed use 288 of an IP-TFS enabled SA. In other words, an SA that has IP-TFS 289 enabled is exclusively for IP-TFS use and MUST NOT have non-IP-TFS 290 payloads such as IP (IP protocol 4), TCP transport (IP protocol 6), 291 or ESP pad packets (protocol 59) intermixed with non-empty IP-TFS (IP 292 protocol TBD1) payloads. While it's possible to envision making the 293 algorithm work in the presence of sequence number skips in the IP-TFS 294 payload stream, the added complexity is not deemed worthwhile. Other 295 IPsec uses can configure and use their own SAs. 297 2.4. Initiating IP-TFS Operation On The SA. 299 While a user will normally configure their IPsec tunnel (SA) to 300 operate using IP-TFS to start, we also allow IP-TFS operation to be 301 enabled post-SA creation and use. This late-enabling may be useful 302 for debugging or other purposes. To support this late-enabled 303 operation the receiver switches to IP-TFS operation on receipt of the 304 first ESP payload with the IPTFS_PROTOCOL indicated as the payload 305 type which also contains a data block (i.e., a non-empty IP-TFS 306 payload). The receipt of an empty IPTFS_PROTOCOL payload (i.e., 307 one without any data blocks) is used to communicate congestion 308 control information from the receiver back to the sender on a non-IP- 309 TFS enabled SA, and MUST NOT cause IP-TFS to be enabled on that SA. 311 2.5. Modes of Operation 313 Just as with normal IPsec/ESP tunnels, IP-TFS tunnels are 314 unidirectional. Bidirectional IP-TFS functionality is achieved by 315 setting up 2 IP-TFS tunnels, one in either direction. 
317 An IP-TFS tunnel can operate in 2 modes: a non-congestion controlled 318 mode and a congestion controlled mode. 320 2.5.1. Non-Congestion Controlled Mode 322 In the non-congestion controlled mode IP-TFS sends fixed-sized 323 packets at a constant rate. The packet send rate is constant and is 324 not automatically adjusted regardless of any network congestion 325 (e.g., packet loss). 327 For reasons similar to those given in [RFC7510], the non-congestion 328 controlled mode should only be used where the user has full 329 administrative control over the path the tunnel will take. This is 330 required so the user can guarantee the bandwidth and also be sure 331 not to worsen network congestion [RFC2914]. In this 332 case packet loss should be reported to the administrator (e.g., via 333 syslog, YANG notification, SNMP traps, etc.) so that any failures due 334 to a lack of bandwidth can be corrected. 336 2.5.2. Congestion Controlled Mode 338 With the congestion controlled mode, IP-TFS adapts to network 339 congestion by lowering the packet send rate to accommodate the 340 congestion, as well as raising the rate when congestion subsides. 341 Since overhead is per packet, by allowing for maximal fixed-size 342 packets and varying the send rate we minimize transport overhead. 344 The output of the congestion control algorithm will adjust the rate 345 at which the ingress sends packets. While this document does not 346 require a specific congestion control algorithm, best current 347 practice RECOMMENDS that the algorithm conform to [RFC5348]. 348 Congestion control principles are documented in [RFC2914] as well. 349 An example of an implementation of the [RFC5348] algorithm which 350 matches the requirements of IP-TFS (i.e., designed for fixed-size 351 packets with a send rate varied based on congestion) is documented in 352 [RFC4342]. 
354 The required inputs for the TCP friendly rate control algorithm 355 described in [RFC5348] are the receiver's loss event rate and the 356 sender's estimated round-trip time (RTT). These values are provided 357 by IP-TFS using the congestion information header fields described in 358 Section 3. In particular these values are sufficient to implement 359 the algorithm described in [RFC5348]. 361 At a minimum, the congestion information must be sent, from the 362 receiver as well as from the sender, at least once per RTT. Prior to 363 establishing an RTT the information SHOULD be sent constantly from 364 the sender and the receiver so that an RTT estimate can be 365 established. The lack of receiving this information over multiple 366 consecutive RTT intervals should be considered a congestion event 367 that causes the sender to adjust its sending rate lower. For 368 example, [RFC4342] calls this the "no feedback timeout" and it is 369 equal to 4 RTT intervals. When a "no feedback timeout" has occurred 370 [RFC4342] halves the sending rate. 372 An implementation could choose to always include the congestion 373 information in its IP-TFS payload header if sending on an IP-TFS 374 enabled SA. Since IP-TFS normally will operate with a large packet 375 size, the congestion information should represent a small portion of 376 the available tunnel bandwidth. 378 When an implementation is choosing a congestion control algorithm (or 379 a selection of algorithms) one should remember that IP-TFS is not 380 providing for reliable delivery of IP traffic, and so per packet ACKs 381 are not required and are not provided. 
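The inputs described above (loss event rate and RTT) can be fed into the TCP throughput equation from Section 3.1 of [RFC5348]. The following Python sketch is one hedged reading of that equation, using the simplifications that RFC recommends (b = 1 and t_RTO = 4 * RTT). The function names, the direct use of the "LossEventRate" and "RTT" header field encodings as arguments, and the bare halving on a no feedback timeout are assumptions for illustration; a real implementation would follow [RFC5348] in full.

```python
import math
from typing import Optional


def tfrc_send_rate(segment_octets: int, rtt_ms: int,
                   loss_event_rate_field: int, b: int = 1) -> Optional[float]:
    """Allowed send rate in octets per second per the RFC 5348 throughput equation.

    loss_event_rate_field is the inverse loss event rate as carried in the
    "LossEventRate" header field (0 means no loss). Returns None when the
    equation cannot be applied (no loss reported yet, or no RTT estimate).
    """
    if loss_event_rate_field == 0 or rtt_ms == 0:
        return None
    p = 1.0 / loss_event_rate_field      # loss event rate, 0 < p <= 1
    r = rtt_ms / 1000.0                  # round-trip time in seconds
    t_rto = 4 * r                        # simplification recommended by RFC 5348
    denominator = (r * math.sqrt(2 * b * p / 3)
                   + t_rto * (3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p * p)))
    return segment_octets / denominator


def rate_after_no_feedback_timeout(current_rate: float) -> float:
    """Halve the rate after a no feedback timeout (simplified; no lower bound)."""
    return current_rate / 2
```

Dividing the returned octet rate by the fixed payload size would give a packet send rate for the ingress; when the function returns None, the sender has no equation-based limit yet and would rely on its configured or start-up rate.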
383 It's worth noting that the variable send-rate of a congestion 384 controlled IP-TFS tunnel is not private; however, this send-rate is 385 being driven by network congestion, and as long as the encapsulated 386 (inner) traffic flow shape and timing are not directly affecting the 387 (outer) network congestion, the variations in the tunnel rate will 388 not weaken the provided inner traffic flow confidentiality. 390 2.5.2.1. Circuit Breakers 392 In addition to congestion control, implementations MAY choose to 393 define and implement circuit breakers [RFC8084] as a recovery method 394 of last resort. Enabling circuit breakers is also a reason a user 395 may wish to enable congestion information reports even when using the 396 non-congestion controlled mode of operation. The definition of 397 circuit breakers is outside the scope of this document. 399 3. Congestion Information 401 In order to support the congestion control mode, the sender needs to 402 know the loss event rate and also be able to approximate the RTT 403 ([RFC5348]). In order to obtain these values the receiver sends 404 congestion control information on its SA back to the sender. Thus, 405 in order to support congestion control the receiver must have a 406 paired SA back to the sender (this is always the case when the tunnel 407 was created using IKEv2). If the SA back to the sender is a non-IP- 408 TFS enabled SA then an IPTFS_PROTOCOL empty payload (i.e., header 409 only) is used to convey the information. 411 In order to calculate a loss event rate compatible with [RFC5348], 412 the receiver needs to have a round-trip time estimate. Thus the 413 sender communicates this estimate in the "RTT" header field. On 414 startup this value will be zero as no RTT estimate is yet known. 416 In order to allow the sender to calculate the "RTT" value, the 417 receiver communicates the last sequence number it has seen to the 418 sender in the "LastSeqNum" header field. 
In addition to the 419 "LastSeqNum" value, the receiver sends an estimate of the amount of 420 time between receiving the "LastSeqNum" packet and transmitting the 421 "LastSeqNum" value back to the sender in the congestion information. 422 It places this time estimate in the "Delay" header field along with 423 the "LastSeqNum". 425 The receiver also calculates, and communicates in the "LossEventRate" 426 header field, the loss event rate for use by the sender. This is 427 slightly different from [RFC4342] which periodically sends all the 428 loss interval data back to the sender so that it can do the 429 calculation. See Appendix B for a suggested way to calculate the 430 loss event rate value. Initially this value will be zero (indicating 431 no loss) until enough data has been collected by the receiver to 432 update it. 434 3.1. ECN Support 436 In addition to normal packet loss information, IP-TFS supports use 437 of the ECN bits in the encapsulating IP header [RFC3168] for 438 identifying congestion. If ECN use is enabled and a packet arrives 439 at the egress endpoint with the Congestion Experienced (CE) value 440 set, then the receiver considers that packet as being dropped, 441 although it does not drop it. The receiver MUST set the E bit in any 442 IPTFS_PROTOCOL payload header containing a "LossEventRate" value 443 derived from a CE value being considered. 445 As noted in [RFC3168] the ECN bits are not protected by IPsec and 446 thus may constitute a covert channel. For this reason ECN use SHOULD 447 NOT be enabled by default. 449 4. Configuration 451 IP-TFS is meant to be deployable with a minimal amount of 452 configuration. All IP-TFS specific configuration should be able to 453 be specified at the unidirectional tunnel ingress (sending) side. It 454 is intended that non-IKEv2 operation is supported, at least, with 455 local static configuration. 457 4.1. Bandwidth 459 Bandwidth is a local configuration option. 
For non-congestion 460 controlled mode the bandwidth SHOULD be configured. For congestion 461 controlled mode one can configure the bandwidth or have no 462 configuration and let congestion control discover the maximum 463 bandwidth available. No standardized configuration method is 464 required. 466 4.2. Fixed Packet Size 468 The fixed packet size to be used for the tunnel encapsulation packets 469 can be configured manually or can be automatically determined using 470 Path MTU discovery (see [RFC1191] and [RFC8201]). No standardized 471 configuration method is required. 473 4.3. Congestion Control 475 Congestion control is a local configuration option. No standardized 476 configuration method is required. 478 5. IKEv2 480 5.1. USE_IPTFS Notification Message 482 When using IKEv2, a new "USE_IPTFS" Notification Message is used to 483 enable operation of IP-TFS on a child SA pair. The method used is 484 similar to how USE_TRANSPORT_MODE is negotiated, as described in 485 [RFC7296]. 487 To request IP-TFS operation on the Child SA pair, the initiator 488 includes the USE_IPTFS notification in an SA payload requesting a new 489 Child SA (either during the initial IKE_AUTH or during non-rekeying 490 CREATE_CHILD_SA exchanges). If the request is accepted then the response 491 MUST also include a notification of type USE_IPTFS. If the responder 492 declines the request the child SA will be established without IP-TFS 493 enabled. If this is unacceptable to the initiator, the initiator 494 MUST delete the child SA. 496 The USE_IPTFS notification MUST NOT be sent, and MUST be ignored, 497 during a CREATE_CHILD_SA rekeying exchange as it is not allowed to 498 change IP-TFS operation during rekeying. 500 The USE_IPTFS notification contains a 1 octet payload of flags that 501 specify any requirements from the sender of the message. 
If any 502 requirement flags are not understood or cannot be supported by the 503 receiver then the receiver should not enable IP-TFS mode (either by 504 not responding with the USE_IPTFS notification, or in the case of the 505 initiator, by deleting the child SA if the now established non-IP-TFS 506 operation is unacceptable). 508 The notification type and payload flag values are defined in 509 Section 6.1.4. 511 6. Packet and Data Formats 513 6.1. IP-TFS Payload 515 An IP-TFS payload is identified by the IP protocol number 516 IPTFS_PROTOCOL (TBD1). The first octet of this payload indicates the 517 format of the remaining payload data. 519 0 1 2 3 4 5 6 7 520 +-+-+-+-+-+-+-+-+-+-+- 521 | Sub-type | ... 522 +-+-+-+-+-+-+-+-+-+-+- 524 Sub-type: 525 An 8 bit value indicating the payload format. 527 This specification defines 2 payload sub-types. These payload 528 formats are defined in the following sections. 530 6.1.1. Non-Congestion Control IPTFS_PROTOCOL Payload Format 532 The non-congestion control IPTFS_PROTOCOL payload is comprised of a 4 533 octet header followed by a variable amount of "DataBlocks" data as 534 shown below. 536 1 2 3 537 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 538 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 539 | Sub-Type (0) | Reserved | BlockOffset | 540 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 541 | DataBlocks ... 542 +-+-+-+-+-+-+-+-+-+-+- 544 Sub-type: 545 An octet indicating the payload format. For this non-congestion 546 control format, the value is 0. 548 Reserved: 549 An octet set to 0 on generation, and ignored on receipt. 551 BlockOffset: 552 A 16 bit unsigned integer counting the number of octets of 553 "DataBlocks" data before the start of a new data block. 554 "BlockOffset" can count past the end of the "DataBlocks" data in 555 which case all the "DataBlocks" data belongs to the previous data 556 block being re-assembled. 
If the "BlockOffset" extends into 557 subsequent packets it continues to only count subsequent 558 "DataBlocks" data (i.e., it does not count subsequent packets' 559 non-"DataBlocks" octets). 561 DataBlocks: 562 Variable number of octets that begins with the start of a data 563 block, or the continuation of a previous data block, followed by 564 zero or more additional data blocks. 566 6.1.2. Congestion Control IPTFS_PROTOCOL Payload Format 568 The congestion control IPTFS_PROTOCOL payload is comprised of a 16 569 octet header followed by a variable amount of "DataBlocks" data as 570 shown below. 572 1 2 3 573 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 574 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 575 | Sub-type (1) | Reserved |E| BlockOffset | 576 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 577 | RTT | Delay | 578 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 579 | LossEventRate | 580 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 581 | LastSeqNum | 582 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 583 | DataBlocks ... 584 +-+-+-+-+-+-+-+-+-+-+- 586 Sub-type: 587 An octet indicating the payload format. For this congestion 588 control format, the value is 1. 590 Reserved: 591 A 7 bit field set to 0 on generation, and ignored on receipt. 593 E: 594 A 1 bit value that, if set, indicates that Congestion Experienced (CE) 595 ECN bits were received and used in deriving the reported 596 "LossEventRate". 598 BlockOffset: 599 The same value as the non-congestion controlled payload format 600 value. 602 RTT: 603 A 16 bit value specifying the sender's current round-trip time 604 estimate in milliseconds. The value MAY be zero prior to the 605 sender having calculated a round-trip time estimate. The value 606 SHOULD be set to zero on non-IP-TFS enabled SAs. 
608 Delay: 609 A 16 bit value specifying the delay in milliseconds incurred 610 between the receiver receiving the "LastSeqNum" packet and the 611 sending of this acknowledgement of it. 613 LossEventRate: 615 A 32 bit value specifying the inverse of the current loss event 616 rate as calculated by the receiver. A value of zero indicates no 617 loss. Otherwise the loss event rate is "1/LossEventRate". 619 LastSeqNum: 620 A 32 bit value containing the lower 32 bits of the largest 621 sequence number last received. This is the latest in the sequence, 622 not necessarily the most recently received (in the case of re-ordering of 623 packets it may have been received earlier). When determining the largest sequence number and 64 624 bit extended sequence numbers are in use, the upper 32 bits should 625 be used during the comparison. 627 DataBlocks: 628 Variable number of octets that begins with the start of a data 629 block, or the continuation of a previous data block, followed by 630 zero or more additional data blocks. For the special case of 631 sending congestion control information on a non-IP-TFS enabled SA 632 this value MUST be empty (i.e., be zero octets long). 634 6.1.3. Data Blocks 636 1 2 3 637 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 638 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 639 | Type | IPv4, IPv6 or pad... 640 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- 642 Type: 643 A 4 bit field where 0x0 identifies a pad data block, 0x4 indicates 644 an IPv4 data block, and 0x6 indicates an IPv6 data block. 646 6.1.3.1. IPv4 Data Block 648 1 2 3 649 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 650 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 651 | 0x4 | IHL | TypeOfService | TotalLength | 652 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 653 | Rest of the inner packet ... 654 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+- 656 These values are the actual values within the encapsulated IPv4 657 header. 
In other words, the start of this data block is the start of 658 the encapsulated IP packet. 660 Type: 661 A 4 bit value of 0x4 indicating IPv4 (i.e., first nibble of the 662 IPv4 packet). 664 TotalLength: 665 The 16 bit unsigned integer "Total Length" field of the IPv4 inner 666 packet. 668 6.1.3.2. IPv6 Data Block 670 1 2 3 671 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 672 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 673 | 0x6 | TrafficClass | FlowLabel | 674 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 675 | PayloadLength | Rest of the inner packet ... 676 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- 678 These values are the actual values within the encapsulated IPv6 679 header. In other words, the start of this data block is the start of 680 the encapsulated IP packet. 682 Type: 683 A 4 bit value of 0x6 indicating IPv6 (i.e., first nibble of the 684 IPv6 packet). 686 PayloadLength: 687 The 16 bit unsigned integer "Payload Length" field of the inner 688 IPv6 inner packet. 690 6.1.3.3. Pad Data Block 692 1 2 3 693 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 694 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 695 | 0x0 | Padding ... 696 +-+-+-+-+-+-+-+-+-+-+- 698 Type: 699 A 4 bit value of 0x0 indicating a padding data block. 701 Padding: 702 extends to end of the encapsulating packet. 704 6.1.4. IKEv2 USE_IPTFS Notification Message 706 As discussed in Section 5.1 a notification message USE_IPTFS is used 707 to negotiate IP-TFS operation in IKEv2. 709 The USE_IPTFS Notification Message State Type is (TBD2). 711 The notification payload contains 1 octet of requirement flags. 712 There are currently 2 requirement flags defined. This may be revised 713 by later specifications. 715 +-+-+-+-+-+-+-+-+ 716 |0|0|0|0|0|0|C|D| 717 +-+-+-+-+-+-+-+-+ 719 0: 720 6 bits - reserved, MUST be zero on send, unless defined by later 721 specifications. 
723      C:
724         Congestion Control bit. If set, then the sender is requiring that
725         congestion control information MUST be returned to it periodically
726         as defined in Section 3.

728      D:
729         Don't Fragment bit. If set, it indicates that the sender of the
730         notify message does not support receiving packet fragments (i.e.,
731         inner packets MUST be sent using a single "Data Block"). This value
732         only applies to what the sender is capable of receiving; the
733         sender MAY still send packet fragments unless similarly restricted
734         by the receiver in its USE_IPTFS notification.

736   7. IANA Considerations

738   7.1. IPTFS_PROTOCOL Type

740      This document requests that a protocol number IPTFS_PROTOCOL be
741      allocated by IANA from the "Assigned Internet Protocol Numbers"
742      registry for identifying the IP-TFS payload.

744      Type:
745         TBD1

747      Description:
748         An IP-TFS payload.

750      Reference:
751         This document

753   7.2. IPTFS_PROTOCOL Sub-Type Registry

755      This document requests that IANA create a registry called
756      "IPTFS_PROTOCOL Sub-Type Registry" under the "IPTFS_PROTOCOL
756      Parameters" IANA registries.
757      The registration policy for this registry is "Standards Action"
758      ([RFC8126] and [RFC7120]).

760      Name:
761         IPTFS_PROTOCOL Sub-Type Registry

763      Description:
764         IPTFS_PROTOCOL Payload Formats.

766      Reference:
767         This document

769      The initial content for this registry is as follows:

771      Sub-Type  Name                           Reference
772      --------------------------------------------------------
773      0         Non-Congestion Control Format  This document
774      1         Congestion Control Format      This document
775      2-255     Reserved

777   7.3. USE_IPTFS Notify Message Status Type

779      This document requests that a status type USE_IPTFS be allocated
780      from the "IKEv2 Notify Message Types - Status Types" registry.

782      Value:
783         TBD2

785      Name:
786         USE_IPTFS

788      Reference:
789         This document

791   8. Security Considerations

793      This document describes a mechanism to add Traffic Flow
794      Confidentiality to IP traffic.
         Use of this mechanism is expected to
795      increase the security of the traffic being transported. Other than
796      the additional security afforded by using this mechanism, IP-TFS
797      utilizes the security protocols [RFC4303] and [RFC7296] and so their
798      security considerations apply to IP-TFS as well.

800      As noted previously in Section 2.5.2, for TFC to be fully maintained,
801      the encapsulated traffic flow should not affect network congestion
802      in a predictable way; if it would, use of the non-congestion
803      controlled mode should be considered instead.

805   9. References

807   9.1. Normative References

809      [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
810                 Requirement Levels", BCP 14, RFC 2119,
811                 DOI 10.17487/RFC2119, March 1997.

814      [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
815                 RFC 4303, DOI 10.17487/RFC4303, December 2005.

818      [RFC7296]  Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T.
819                 Kivinen, "Internet Key Exchange Protocol Version 2
820                 (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October
821                 2014.

823      [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
824                 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
825                 May 2017.

827   9.2. Informative References

829      [AppCrypt]
830                 Schneier, B., "Applied Cryptography: Protocols,
831                 Algorithms, and Source Code in C", 11 2017.

833      [I-D.iab-wire-image]
834                 Trammell, B. and M. Kuehlewind, "The Wire Image of a
835                 Network Protocol", draft-iab-wire-image-01 (work in
836                 progress), November 2018.

838      [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
839                 DOI 10.17487/RFC0791, September 1981.

842      [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
843                 DOI 10.17487/RFC1191, November 1990.

846      [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
847                 "Definition of the Differentiated Services Field (DS
848                 Field) in the IPv4 and IPv6 Headers", RFC 2474,
849                 DOI 10.17487/RFC2474, December 1998.

852      [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
853                 RFC 2914, DOI 10.17487/RFC2914, September 2000.

856      [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
857                 of Explicit Congestion Notification (ECN) to IP",
858                 RFC 3168, DOI 10.17487/RFC3168, September 2001.

861      [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
862                 Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
863                 December 2005.

865      [RFC4342]  Floyd, S., Kohler, E., and J. Padhye, "Profile for
866                 Datagram Congestion Control Protocol (DCCP) Congestion
867                 Control ID 3: TCP-Friendly Rate Control (TFRC)", RFC 4342,
868                 DOI 10.17487/RFC4342, March 2006.

871      [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
872                 Friendly Rate Control (TFRC): Protocol Specification",
873                 RFC 5348, DOI 10.17487/RFC5348, September 2008.

876      [RFC7120]  Cotton, M., "Early IANA Allocation of Standards Track Code
877                 Points", BCP 100, RFC 7120, DOI 10.17487/RFC7120, January
878                 2014.

880      [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
881                 "Encapsulating MPLS in UDP", RFC 7510,
882                 DOI 10.17487/RFC7510, April 2015.

885      [RFC8084]  Fairhurst, G., "Network Transport Circuit Breakers",
886                 BCP 208, RFC 8084, DOI 10.17487/RFC8084, March 2017.

889      [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
890                 Writing an IANA Considerations Section in RFCs", BCP 26,
891                 RFC 8126, DOI 10.17487/RFC8126, June 2017.

894      [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
895                 (IPv6) Specification", STD 86, RFC 8200,
896                 DOI 10.17487/RFC8200, July 2017.

899      [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
900                 "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
901                 DOI 10.17487/RFC8201, July 2017.

904   Appendix A. Example Of An Encapsulated IP Packet Flow

906      Below we show an example inner IP packet flow within the
907      encapsulating tunnel packet stream. Notice how encapsulated IP
908      packets can start and end anywhere, and more than one, or less than
909      one, may occur in a single encapsulating packet.

911      Offset: 0      Offset: 100    Offset: 2900   Offset: 1400
912      [ ESP1 (1500) ][ ESP2 (1500) ][ ESP3 (1500) ][ ESP4 (1500) ]
913      [--800--][--800--][60][-240-][--4000----------------------][pad]

915                    Figure 3: Inner and Outer Packet Flow

917      The encapsulated IP packet flow (lengths include IP header and
918      payload) is as follows: an 800 octet packet, an 800 octet packet, a
919      60 octet packet, a 240 octet packet, a 4000 octet packet.

921      The "BlockOffset" values in the 4 IP-TFS payload headers for this
922      packet flow would thus be: 0, 100, 2900, 1400 respectively. The
923      first encapsulating packet ESP1 has a zero "BlockOffset" which points
924      at the IP data block immediately following the IP-TFS header. The
925      following packet ESP2's "BlockOffset" points inward 100 octets to the
926      start of the 60 octet data block. The third encapsulating packet
927      ESP3 contains the middle portion of the 4000 octet data block so the
928      offset points past its end and into the fourth encapsulating packet.
929      The fourth packet ESP4's offset is 1400, pointing at the padding which
930      follows the completion of the continued 4000 octet packet.

932   Appendix B. A Send and Loss Event Rate Calculation

934      The current best practice indicates that congestion control should be
935      done in a TCP friendly way. A TCP friendly congestion control
936      algorithm is described in [RFC5348]. For our use case (as with
937      [RFC4342]) we consider our (fixed) packet size the segment size for
938      the algorithm.
         The formula for the send rate is then as follows:

940                                     1
941      X_Pps = -----------------------------------------------
942              R * (sqrt(2*p/3) + 12*sqrt(3*p/8)*p*(1+32*p^2))

944      Where "X_Pps" is the send rate in packets per second, "R" is the
945      round trip time estimate and "p" is the loss event rate (the inverse
946      of which is provided by the receiver).

948      The IP-TFS receiver, having the RTT estimate from the sender, MAY use
949      the same method as described in [RFC4342] to collect the loss
950      intervals and calculate the loss event rate value using the weighted
951      average as indicated. The receiver communicates the inverse of this
952      value back to the sender in the IPTFS_PROTOCOL payload header field
953      "LossEventRate".

955      The IP-TFS sender now has both the "R" and "p" values and can
956      calculate the correct sending rate ("X_Pps"). If following [RFC5348],
957      the sender SHOULD also use the slow start mechanism described therein
958      when the IP-TFS SA is first established.

960   Appendix C. Comparisons of IP-TFS

962   C.1. Comparing Overhead

964   C.1.1. IP-TFS Overhead

966      The overhead of IP-TFS is 40 octets per outer packet. Therefore the
967      octet overhead per inner packet is 40 divided by the number of inner
968      packets carried per outer packet (fractional allowed). The overhead
969      as a percentage of inner packet size is a constant based on the
969      Outer MTU size.

971      OH = 40 / (Outer Payload Size / Inner Packet Size)
972      OH % of Inner Packet Size = 100 * OH / Inner Packet Size
973      OH % of Inner Packet Size = 4000 / Outer Payload Size

975      Type     IP-TFS   IP-TFS   IP-TFS
976      MTU         576     1500     9000
977      PSize       536     1460     8960
978      -------------------------------
979      40        7.46%    2.74%    0.45%
980      576       7.46%    2.74%    0.45%
981      1500      7.46%    2.74%    0.45%
982      9000      7.46%    2.74%    0.45%

984         Figure 4: IP-TFS Overhead as Percentage of Inner Packet Size

986   C.1.2. ESP with Padding Overhead

988      The overhead per inner packet for constant-send-rate padded ESP
989      (i.e., traditional IPsec TFC) is 36 octets plus any padding, unless
990      fragmentation is required.

992      When fragmentation of the inner packet is required to fit in the
993      outer IPsec packet, the overhead is the number of outer packets
994      required to carry the fragmented inner packet times the sum of the
995      inner IP overhead (20) and the outer packet overhead (36), minus the
996      initial inner IP overhead, plus any required tail padding in the last
997      encapsulation packet. The required tail padding is the number of
998      required packets times the difference of the Outer Payload Size and
999      the IP Overhead, minus the Inner Payload Size. So:

1001     Inner Payload Size = IP Packet Size - IP Overhead
1002     Outer Payload Size = MTU - IPsec Overhead

1004                   Inner Payload Size
1005     NF0 = ----------------------------------
1006            Outer Payload Size - IP Overhead

1008     NF = CEILING(NF0)

1010     OH = NF * (IP Overhead + IPsec Overhead)
1011          - IP Overhead
1012          + NF * (Outer Payload Size - IP Overhead)
1013          - Inner Payload Size

1015     OH = NF * (IPsec Overhead + Outer Payload Size)
1016          - (IP Overhead + Inner Payload Size)

1018     OH = NF * (IPsec Overhead + Outer Payload Size)
1019          - Inner Packet Size

1021  C.2. Overhead Comparison

1023     The following tables collect the overhead values for some common L3
1024     MTU sizes in order to compare them. The first table lists the number
1025     of octets of overhead for a packet of a given L3 MTU size. The second
1026     table lists the overhead as a percentage of the same packet size.
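
As a rough illustration, the overhead formulas of C.1.1 and C.1.2 can
be sketched in Python as follows. This is a minimal sketch and not
part of the specification: the function and constant names are
hypothetical, and the 20, 36 and 40 octet constants are the inner IP,
padded ESP and IP-TFS overheads quoted in those sections. The
non-fragmented branch follows the text of C.1.2 (36 octets plus
padding); the ESP+Pad entries tabulated for packets that fit without
fragmentation appear to count the padding only, so the sketch yields
values 36 octets larger for those rows.

```python
import math

IP_OVERHEAD = 20      # inner IP header octets (C.1.2)
ESP_OVERHEAD = 36     # outer packet overhead for padded ESP (C.1.2)
IPTFS_OVERHEAD = 40   # overhead per outer IP-TFS packet (C.1.1)


def esp_pad_overhead(inner_size: int, mtu: int) -> int:
    """Octets of overhead for one inner packet using padded ESP."""
    outer_payload = mtu - ESP_OVERHEAD
    if inner_size <= outer_payload:
        # No fragmentation: 36 octets plus padding up to the payload size.
        return ESP_OVERHEAD + (outer_payload - inner_size)
    # Fragmentation: OH = NF * (IPsec OH + Outer Payload) - Inner Packet.
    inner_payload = inner_size - IP_OVERHEAD
    nf = math.ceil(inner_payload / (outer_payload - IP_OVERHEAD))
    return nf * (ESP_OVERHEAD + outer_payload) - inner_size


def iptfs_overhead(inner_size: int, mtu: int) -> float:
    """Octets of overhead attributable to one inner packet using IP-TFS."""
    outer_payload = mtu - IPTFS_OVERHEAD
    # OH = 40 / (Outer Payload Size / Inner Packet Size)
    return IPTFS_OVERHEAD / (outer_payload / inner_size)
```

For example, esp_pad_overhead(1460, 576) evaluates to 268 and
round(iptfs_overhead(1500, 1500), 1) evaluates to 41.1, which agree
with the corresponding fragmented ESP+Pad and IP-TFS entries in the
octet overhead table.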
1028     Type   ESP+Pad  ESP+Pad  ESP+Pad   IP-TFS   IP-TFS   IP-TFS
1029     L3 MTU     576     1500     9000      576     1500     9000
1030     PSize      540     1464     8964      536     1460     8960
1031     -----------------------------------------------------------
1032     40         500     1424     8924      3.0      1.1      0.2
1033     128        412     1336     8836      9.6      3.5      0.6
1034     256        284     1208     8708     19.1      7.0      1.1
1035     536          4      928     8428     40.0     14.7      2.4
1036     576        576      888     8388     43.0     15.8      2.6
1037     1460       268        4     7504    109.0     40.0      6.5
1038     1500       228     1500     7464    111.9     41.1      6.7
1039     8960      1408     1540        4    668.7    245.5     40.0
1040     9000      1368     1500     9000    671.6    246.6     40.2

1042                 Figure 5: Overhead comparison in octets

1044     Type    ESP+Pad  ESP+Pad   ESP+Pad   IP-TFS  IP-TFS  IP-TFS
1045     MTU         576     1500      9000      576    1500    9000
1046     PSize       540     1464      8964      536    1460    8960
1047     -----------------------------------------------------------
1048     40      1250.0%  3560.0%  22310.0%    7.46%   2.74%   0.45%
1049     128      321.9%  1043.8%   6903.1%    7.46%   2.74%   0.45%
1050     256      110.9%   471.9%   3401.6%    7.46%   2.74%   0.45%
1051     536        0.7%   173.1%   1572.4%    7.46%   2.74%   0.45%
1052     576      100.0%   154.2%   1456.2%    7.46%   2.74%   0.45%
1053     1460      18.4%     0.3%    514.0%    7.46%   2.74%   0.45%
1054     1500      15.2%   100.0%    497.6%    7.46%   2.74%   0.45%
1055     8960      15.7%    17.2%      0.0%    7.46%   2.74%   0.45%
1056     9000      15.2%    16.7%    100.0%    7.46%   2.74%   0.45%

1058          Figure 6: Overhead as Percentage of Inner Packet Size

1060  C.3. Comparing Available Bandwidth

1062     Another way to compare the two solutions is to look at the amount of
1063     available bandwidth each solution provides. The following sections
1064     consider and compare the percentage of available bandwidth. To
1065     provide a well understood baseline, we also include values for
1066     normal (unencrypted) Ethernet and for normal ESP.

1068  C.3.1. Ethernet

1070     In order to calculate the available bandwidth we first calculate the
1071     per packet overhead in bits. The total overhead of Ethernet is 14+4
1072     octets of header and CRC plus an additional 20 octets of framing
1073     (preamble, start, and inter-packet gap) for a total of 38 octets.
1074     Additionally the minimum payload is 46 octets.

1076     Size  E + P  E + P  E + P  IPTFS  IPTFS  IPTFS   Enet    ESP
1077     MTU     590   1514   9014    590   1514   9014    any    any
1078     OH       74     74     74     78     78     78     38     74
1079     ------------------------------------------------------------
1080     40      614   1538   9038     45     42     40     84    114
1081     128     614   1538   9038    146    134    129    166    202
1082     256     614   1538   9038    293    269    258    294    330
1083     536     614   1538   9038    614    564    540    574    610
1084     576    1228   1538   9038    659    606    581    614    650
1085     1460   1842   1538   9038   1672   1538   1472   1498   1534
1086     1500   1842   3076   9038   1718   1580   1513   1538   1574
1087     8960  11052  10766   9038  10263   9438   9038   8998   9034
1088     9000  11052  10766  18076  10309   9480   9078   9038   9074

1090                     Figure 7: L2 Octets Per Packet

1092     Size  E + P  E + P  E + P  IPTFS  IPTFS  IPTFS   Enet    ESP
1093     MTU     590   1514   9014    590   1514   9014    any    any
1094     OH       74     74     74     78     78     78     38     74
1095     --------------------------------------------------------------
1096     40     2.0M   0.8M   0.1M  27.3M  29.7M  31.0M  14.9M  11.0M
1097     128    2.0M   0.8M   0.1M   8.5M   9.3M   9.7M   7.5M   6.2M
1098     256    2.0M   0.8M   0.1M   4.3M   4.6M   4.8M   4.3M   3.8M
1099     536    2.0M   0.8M   0.1M   2.0M   2.2M   2.3M   2.2M   2.0M
1100     576    1.0M   0.8M   0.1M   1.9M   2.1M   2.2M   2.0M   1.9M
1101     1460   678K   812K   138K   747K   812K   848K   834K   814K
1102     1500   678K   406K   138K   727K   791K   826K   812K   794K
1103     8960   113K   116K   138K   121K   132K   138K   138K   138K
1104     9000   113K   116K    69K   121K   131K   137K   138K   137K

1106               Figure 8: Packets Per Second on 10G Ethernet

1108     Size   E + P   E + P   E + P   IPTFS   IPTFS   IPTFS    Enet     ESP
1109     MTU      590    1514    9014     590    1514    9014     any     any
1110     OH        74      74      74      78      78      78      38      74
1111     ----------------------------------------------------------------------
1112     40     6.51%   2.60%   0.44%  87.30%  94.93%  99.14%  47.62%  35.09%
1113     128   20.85%   8.32%   1.42%  87.30%  94.93%  99.14%  77.11%  63.37%
1114     256   41.69%  16.64%   2.83%  87.30%  94.93%  99.14%  87.07%  77.58%
1115     536   87.30%  34.85%   5.93%  87.30%  94.93%  99.14%  93.38%  87.87%
1116     576   46.91%  37.45%   6.37%  87.30%  94.93%  99.14%  93.81%  88.62%
1117     1460  79.26%  94.93%  16.15%  87.30%  94.93%  99.14%  97.46%  95.18%
1118     1500  81.43%  48.76%  16.60%  87.30%  94.93%  99.14%  97.53%  95.30%
1119     8960  81.07%  83.22%  99.14%  87.30%  94.93%  99.14%  99.58%  99.18%
1120     9000  81.43%  83.60%  49.79%  87.30%  94.93%  99.14%  99.58%  99.18%

1122            Figure 9: Percentage of Bandwidth on 10G Ethernet

1124     A sometimes unexpected result of using IP-TFS (or any packet
1125     aggregating tunnel) is that, for small to medium sized packets, the
1126     available bandwidth is actually greater than native Ethernet. This
1127     is due to the reduction in Ethernet framing overhead. This increased
1128     bandwidth is paid for with an increase in latency. This latency is
1129     the time to send the unrelated octets in the outer tunnel frame. The
1130     following table illustrates the latency for some common values on a
1131     10G Ethernet link. The table also includes latency introduced by
1132     padding if using ESP with padding.

1134           ESP+Pad   ESP+Pad   IP-TFS    IP-TFS
1135           1500      9000      1500      9000
1137     ------------------------------------------
1138     40    1.14 us   7.14 us   1.17 us   7.17 us
1139     128   1.07 us   7.07 us   1.10 us   7.10 us
1140     256   0.97 us   6.97 us   1.00 us   7.00 us
1141     536   0.74 us   6.74 us   0.77 us   6.77 us
1142     576   0.71 us   6.71 us   0.74 us   6.74 us
1143     1460  0.00 us   6.00 us   0.04 us   6.04 us
1144     1500  1.20 us   5.97 us   0.00 us   6.00 us

1146                        Figure 10: Added Latency

1148     Notice that the latency values are very similar between the two
1149     solutions; however, whereas IP-TFS provides for constant high
1150     bandwidth, in some cases even exceeding native Ethernet, ESP with
1151     padding often greatly reduces available bandwidth.

1153  Appendix D. Acknowledgements

1155     We would like to thank Don Fedyk for help in reviewing this work.

1157  Appendix E. Contributors

1159     The following people made significant contributions to this document.

1161     Lou Berger
1162     LabN Consulting, L.L.C.

1164     Email: lberger@labn.net

1166  Author's Address

1168     Christian Hopps
1169     LabN Consulting, L.L.C.

1171     Email: chopps@chopps.org