idnits 2.17.1 draft-hopps-ipsecme-iptfs-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 890 has weird spacing: '...4 any any...' == Line 906 has weird spacing: '...4 any any...' -- The document date (March 11, 2019) is 1870 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: '--800--' is mentioned on line 284, but not defined -- Looks like a reference, but probably isn't: '60' on line 284 == Missing Reference: '-240-' is mentioned on line 284, but not defined == Missing Reference: '--4000----------------------' is mentioned on line 284, but not defined Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group C. Hopps 3 Internet-Draft LabN Consulting, L.L.C. 
4 Intended status: Standards Track March 11, 2019 5 Expires: September 12, 2019 7 IP Traffic Flow Security 8 draft-hopps-ipsecme-iptfs-00 10 Abstract 12 This document describes a mechanism to enhance IPsec traffic flow 13 security by adding traffic flow confidentiality to encrypted IP 14 encapsulated traffic. Traffic flow confidentiality is provided by 15 obscuring the size and frequency of IP traffic using a fixed-sized, 16 constant-send-rate IPsec tunnel. The solution allows for congestion 17 control as well. 19 Status of This Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF). Note that other groups may also distribute 26 working documents as Internet-Drafts. The list of current Internet- 27 Drafts is at https://datatracker.ietf.org/drafts/current/. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 This Internet-Draft will expire on September 12, 2019. 36 Copyright Notice 38 Copyright (c) 2019 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (https://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 47 include Simplified BSD License text as described in Section 4.e of 48 the Trust Legal Provisions and are provided without warranty as 49 described in the Simplified BSD License. 51 Table of Contents 53 1. Introduction . . . . . . . . . . . . . . . 
. . . . . . . . . 3 54 1.1. Terminology & Concepts . . . . . . . . . . . . . . . . . 3 55 2. The IP-TFS Tunnel . . . . . . . . . . . . . . . . . . . . . . 3 56 2.1. Tunnel Content . . . . . . . . . . . . . . . . . . . . . 4 57 2.1.1. IPSec/ESP Payload . . . . . . . . . . . . . . . . . . 4 58 2.1.2. Data-Blocks . . . . . . . . . . . . . . . . . . . . . 5 59 2.1.3. No Implicit Padding . . . . . . . . . . . . . . . . . 5 60 2.1.4. IP Header Value Mapping . . . . . . . . . . . . . . . 6 61 2.2. Exclusive SA Use . . . . . . . . . . . . . . . . . . . . 6 62 2.3. Initiation of TFS mode . . . . . . . . . . . . . . . . . 6 63 2.4. Example of an encapsulated IP packet flow . . . . . . . . 7 64 2.5. Modes of operation . . . . . . . . . . . . . . . . . . . 7 65 2.5.1. Non-Congestion Controlled Mode . . . . . . . . . . . 7 66 2.5.2. Congestion Controlled Mode . . . . . . . . . . . . . 8 67 3. Congestion Information . . . . . . . . . . . . . . . . . . . 9 68 3.1. ECN Support . . . . . . . . . . . . . . . . . . . . . . . 9 69 4. Configuration . . . . . . . . . . . . . . . . . . . . . . . . 9 70 4.1. Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . 10 71 4.2. Fixed Packet Size . . . . . . . . . . . . . . . . . . . . 10 72 4.3. Congestion Information Configuration . . . . . . . . . . 10 73 5. Packet and Data Formats . . . . . . . . . . . . . . . . . . . 11 74 5.1. IPSec . . . . . . . . . . . . . . . . . . . . . . . . . . 11 75 5.1.1. Payload Format . . . . . . . . . . . . . . . . . . . 11 76 5.1.2. Data Blocks . . . . . . . . . . . . . . . . . . . . . 12 77 5.2. IKEv2 . . . . . . . . . . . . . . . . . . . . . . . . . . 13 78 5.2.1. IKEv2 Congestion Information Configuration Attribute 13 79 5.2.2. IKEv2 Congestion Information Notification Data . . . 14 80 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 81 7. Security Considerations . . . . . . . . . . . . . . . . . . . 15 82 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 83 8.1. 
Normative References . . . . . . . . . . . . . . . . . . 15
84       8.2. Informative References . . . . . . . . . . . . . . . . . 16
85     Appendix A. Comparisons of IP-TFS . . . . . . . . . . . . . . . 17
86       A.1. Comparing Overhead . . . . . . . . . . . . . . . . . . . 17
87         A.1.1. IP-TFS Overhead . . . . . . . . . . . . . . . . . . . 17
88         A.1.2. ESP with Padding Overhead . . . . . . . . . . . . . . 18
89       A.2. Overhead Comparison . . . . . . . . . . . . . . . . . . . 19
90       A.3. Comparing Available Bandwidth . . . . . . . . . . . . . . 19
91         A.3.1. Ethernet . . . . . . . . . . . . . . . . . . . . . . 20
92     Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 22
93     Appendix C. Contributors . . . . . . . . . . . . . . . . . . . . 22
94     Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 22

96 1. Introduction

98     Traffic Analysis ([RFC4301], [AppCrypt]) is the act of extracting
99     information about data being sent through a network. While one may
100    directly obscure the data through the use of encryption [RFC4303],
101    the traffic pattern itself exposes information due to variations in
102    its shape and timing ([I-D.iab-wire-image], [AppCrypt]). Hiding the
103    size and frequency of traffic is referred to as Traffic Flow
104    Confidentiality (TFC) per [RFC4303].

106    [RFC4303] provides for TFC by allowing padding to be added to
107    encrypted IP packets and allowing for sending all-pad packets
108    (indicated using protocol 59). This method has the major limitation
109    that it can significantly under-utilize the available bandwidth.

111    The IP-TFS solution provides for full TFC without the aforementioned
112    bandwidth limitation. To do this we use a constant-send-rate IPsec
113    [RFC4303] tunnel with fixed-sized encapsulating packets; however,
114    these fixed-sized packets can contain partial, full or multiple IP
115    packets to maximize the bandwidth of the tunnel.
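To make the bandwidth argument concrete, the sketch below counts how many fixed-size encapsulating packets are needed when each inner packet is padded into its own encapsulating packet (the [RFC4303] padding approach) and when inner packets are packed back to back and allowed to span encapsulating packets (the IP-TFS approach). The function names and the 1500 octet payload size are assumptions chosen for illustration, and all header overhead is ignored.

```python
import math


def outer_packets_padded(inner_sizes, payload_size):
    """Count encapsulating packets when each carries exactly one inner packet.

    Simplified model (an assumption of this sketch): every inner packet must
    fit into one payload, and the unused remainder is filled with padding.
    """
    if any(size > payload_size for size in inner_sizes):
        raise ValueError("this simplified model assumes each inner packet fits")
    return len(inner_sizes)


def outer_packets_aggregated(inner_sizes, payload_size):
    """Count encapsulating packets when inner packets are packed back to back.

    Inner packets may be fragmented across encapsulating packets, so only the
    total number of inner octets matters.
    """
    return math.ceil(sum(inner_sizes) / payload_size)


inner = [800, 800, 60, 240]  # inner packet sizes in octets (hypothetical flow)
print(outer_packets_padded(inner, 1500))      # prints 4
print(outer_packets_aggregated(inner, 1500))  # prints 2
```

In this example the padded approach sends twice as many fixed-size packets for the same inner traffic, which is the under-utilization that IP-TFS is designed to avoid.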
117    For a comparison of the overhead of IP-TFS with the TFC solution
118    prescribed by [RFC4303], see Appendix A.

120    Additionally, IP-TFS provides for dealing with network congestion
121    [RFC2914]. This is important when the IP-TFS user is not in full
122    control of the domain through which the IP-TFS tunnel path flows.

124 1.1. Terminology & Concepts

126    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
127    "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
128    "OPTIONAL" in this document are to be interpreted as described in
129    [RFC2119] [RFC8174] when, and only when, they appear in all capitals,
130    as shown here.

132    This document assumes familiarity with IP security concepts described
133    in [RFC4301].

135 2. The IP-TFS Tunnel

137    As mentioned in Section 1, IP-TFS utilizes an IPsec [RFC4303] tunnel
138    as its transport. To provide for full TFC we send fixed-sized
139    encapsulating packets at a constant rate on the tunnel.

141    The primary input to the tunnel algorithm is the requested bandwidth
142    of the tunnel. Two values are then required to provide for this
143    bandwidth: the fixed size of the encapsulating packets and the rate
144    at which to send them.

146    The fixed packet size may either be specified manually or can be
147    determined through the use of Path MTU discovery [RFC1191] and
148    [RFC8201].

150    Given the encapsulating packet size and the requested tunnel
151    bandwidth, the correct packet send rate can be calculated. The
152    packet send rate is the requested bandwidth divided by the payload
153    size of the encapsulating packet.

155    The egress of the IP-TFS tunnel SHOULD NOT impose any restrictions on
156    tunnel packet size or arrival rate. Packet size and send rate are
157    entirely the function of the ingress (sending) side of the IP-TFS
158    tunnel.
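The send-rate calculation described earlier in this section can be sketched as follows. This is a minimal illustration, not a normative procedure: the function and parameter names are hypothetical, and it assumes the requested bandwidth is expressed in bits per second and the payload size in octets.

```python
def packet_send_rate(bandwidth_bps, payload_octets):
    """Return the encapsulating packets per second for a requested bandwidth.

    Assumption of this sketch: bandwidth is in bits per second and the
    payload size is in octets, hence the factor of 8.
    """
    if payload_octets <= 0:
        raise ValueError("payload size must be positive")
    return bandwidth_bps / (payload_octets * 8)


def send_interval_seconds(bandwidth_bps, payload_octets):
    """Return the time between consecutive encapsulating packets."""
    return 1.0 / packet_send_rate(bandwidth_bps, payload_octets)


# Example: an 11.68 Mbit/s tunnel with 1460 octet payloads.
print(packet_send_rate(11_680_000, 1460))       # prints 1000.0
print(send_interval_seconds(11_680_000, 1460))  # prints 0.001
```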
Indeed, the ingress (sending) side of the IP-TFS tunnel MUST
159    be allowed by the egress side to vary the size and rate at which it
160    sends encapsulating packets, including sending them larger, smaller,
161    faster or slower than the requested size and rate.

163 2.1. Tunnel Content

165    As previously mentioned, one issue with the TFC padding solution in
166    [RFC4303] is the large amount of wasted bandwidth as only one IP
167    packet can be sent per encapsulating packet. In order to maximize
168    bandwidth IP-TFS breaks this one-to-one association.

170    With IP-TFS we fragment as well as aggregate the inner IP traffic
171    flow into fixed-sized encapsulating IP tunnel packets. We only pad
172    the tunnel packets if there is no data available to be sent at the
173    time of tunnel packet transmission.

175    In order to do this we create a new payload data type identified with
176    a new IP protocol number IPTFS_PROTOCOL (TBD). A payload of
177    IPTFS_PROTOCOL type consists of a 32 bit header followed by a
178    partial data-block, a full data-block, or multiple partial or full
       data-blocks.

180 2.1.1. IPSec/ESP Payload

181    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
182    . Outer Encapsulating Header ...                                  .
183    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
184    . ESP Header...                                                   .
185    +-----------------------------------------------------------------+
186    |              ...              :           BlockOffset           |
187    +-----------------------------------------------------------------+
188    | Data Blocks Payload ...                                         ~
189    ~                                                                 ~
190    ~                                                                 |
191    +-----------------------------------------------------------------|
192    . ESP Trailer...                                                  .
193    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

195    Figure 1: Layout of IP-TFS IPSec Packet

197    The BlockOffset value is either zero or some offset into or past the
198    end of the data blocks payload data. If the value is zero it means
199    that a new data-block immediately follows the fixed header (i.e., the
200    BlockOffset value).
Conversely, if the BlockOffset value is non-zero
201    it points at the start of the next data block. The BlockOffset can
202    point past the end of the data block payload data; this means that
203    the next data-block occurs in a subsequent encapsulating packet.
204    When the BlockOffset is non-zero the data immediately following the
205    header belongs to the previous data-block that is still being
206    re-assembled.

208 2.1.2. Data-Blocks

210    +-----------------------------------------------------------------+
211    | Type  | rest of IPv4, IPv6 or pad.
212    +--------

214    Figure 2: Layout of IP-TFS data block

216    A data-block is defined by a 4-bit type code followed by the data
217    block data. The type values have been carefully chosen to coincide
218    with the IPv4/IPv6 version field values so that no per-data-block
219    type overhead is required to encapsulate an IP packet. Likewise, the
220    length of the data block is extracted from the encapsulated IPv4 or
221    IPv6 packet's length field.

223 2.1.3. No Implicit Padding

225    It's worth noting that there is no need for implicit pads at the end
226    of an encapsulating packet. Even when the start of a data block
227    occurs near the end of an encapsulating packet such that there is no
228    room for the length field of the encapsulated header to be included
229    in the current encapsulating packet, the fact that the length comes
230    at a known location and is guaranteed to be present is enough to
231    fetch the length field from the subsequent encapsulating packet
232    payload.

234 2.1.4. IP Header Value Mapping

236    [RFC4301] provides some direction on when and how to map various
237    values from an inner IP header to the outer encapsulating header,
238    namely the Don't-Fragment (DF) bit ([RFC0791] and [RFC8200]), the
239    Differentiated Services (DS) field [RFC2474] and the Explicit
240    Congestion Notification (ECN) field [RFC3168].
Unlike [RFC4301] with 241 IP-TFS we may and often will be encapsulating more than 1 IP packet 242 per ESP packet. To deal with this we further restrict these 243 mappings. In particular we never map the inner DF bit as it is 244 unrelated to the IP-TFS tunnel functionality; we never directly 245 fragment the inner packets and the inner packets will not affect the 246 fragmentation of the outer encapsulation packets. Likewise, the ECN 247 value need not be mapped as any congestion related to the constant- 248 send-rate IP-TFS tunnel is unrelated (by design!) to the inner 249 traffic flow. Finally, by default the DS field SHOULD NOT be copied 250 although an implementation MAY choose to allow for configuration to 251 override this behavior. An implementation SHOULD also allow the DS 252 value to be set by configuration. 254 2.2. Exclusive SA Use 256 It is not the intention of this specification to allow for mixed use 257 of an IPsec SA. In other words, an SA that is created for IP-TFS is 258 exclusively for IP-TFS use and MUST NOT have non-IP-TFS payloads such 259 as IP (IP protocol 4), TCP transport (IP protocol 6), or ESP pad 260 packets (protocol 59) intermixed with IP-TFS (IP protocol TBD) 261 payloads. While it's possible to envision making the algorithm work 262 in the presence of sequence number skips in the IP-TFS payload 263 stream, the added complexity is not deemed worthwhile. Other IPsec 264 uses can configure and use their own SAs. 266 2.3. Initiation of TFS mode 268 While normally a user will configure their IPsec tunnel to operate in 269 IP-TFS mode to start, we also allow IP-TFS mode to be enabled post-SA 270 creation. This may be useful for debugging or other purposes. In 271 this late enabled mode the receiver would switch to IP-TFS mode on 272 receipt of the first ESP payload with the IPTFS_PROTOCOL indicated as 273 the payload type. 275 2.4. 
Example of an encapsulated IP packet flow

277    Below we show an example inner IP packet flow within the
278    encapsulating tunnel packet stream. Notice how encapsulated IP
279    packets can start and end anywhere, and more than one or less than
280    one may occur in a single encapsulating packet.

282    Offset: 0       Offset: 100     Offset: 2900    Offset: 1400
283    [ ESP1  (1500) ][ ESP2  (1500) ][ ESP3  (1500) ][ ESP4  (1500) ]
284    [--800--][--800--][60][-240-][--4000----------------------][pad]

286    Figure 3: Inner and Outer Packet Flow

288    The encapsulated IP packet flow (lengths include IP header and
289    payload) is as follows: an 800 octet packet, an 800 octet packet, a
290    60 octet packet, a 240 octet packet, a 4000 octet packet.

292    The BlockOffset values in the 4 IP-TFS payload headers for this
293    packet flow would thus be: 0, 100, 2900, 1400 respectively. The
294    first encapsulating packet ESP1 has a zero BlockOffset which points
295    at the IP data block immediately following the IP-TFS header. The
296    BlockOffset of the following packet, ESP2, points inward 100 octets
297    to the start of the 60 octet data block. The third encapsulating
298    packet ESP3 contains the middle portion of the 4000 octet data block
299    so the offset points past its end and into the fourth encapsulating
300    packet. The offset of the fourth packet, ESP4, is 1400, pointing at
301    the padding which follows the completion of the continued 4000 octet
       packet.

303    Having the BlockOffset always point at the next available data block
304    allows for quick recovery with minimal inner packet loss in the
305    presence of outer encapsulating packet loss.

307 2.5. Modes of operation

309    Just as with normal IPsec tunnels, IP-TFS tunnels are unidirectional.
310    Bidirectional functionality is achieved by setting up two tunnels,
311    one in each direction.

313    An IP-TFS tunnel can operate in two modes: a non-congestion
314    controlled mode and a congestion controlled mode.

316 2.5.1.
Non-Congestion Controlled Mode

318    In the non-congestion controlled mode IP-TFS sends fixed-sized
319    packets at a constant rate. The packet send rate is not
320    automatically adjusted in response to network congestion
321    (i.e., packet loss).

323    For reasons similar to those given in [RFC7510], the non-congestion
324    controlled mode should only be used where the user has full
325    administrative control over the path the tunnel will take. This is
326    required so the user can guarantee the bandwidth and also be sure
327    not to contribute to network congestion [RFC2914]. In this
328    case packet loss should be reported to the administrator (e.g., via
329    syslog, YANG notification, SNMP traps, etc.) so that any failures
330    due to a lack of bandwidth can be corrected.

332 2.5.2. Congestion Controlled Mode

334    With the congestion controlled mode, IP-TFS adapts to network
335    congestion by lowering the packet send rate to accommodate the
336    congestion, as well as raising the rate when congestion subsides.

338    If congestion were handled in the network at an octet level we might
339    consider lowering the IPsec (encapsulation) packet size to adapt;
340    however, as congestion is normally handled in the network by dropping
341    packets we instead choose to lower the frequency at which we send our
342    fixed-sized packets. This choice also minimizes transport overhead.

344    The output of a congestion control algorithm SHOULD adjust the
345    frequency at which the ingress sends packets until the congestion is
346    accommodated. While this document does not standardize the
347    congestion control algorithm, the algorithm used by an implementation
348    SHOULD conform to the guidelines in [RFC2914].

350    When an implementation is choosing a congestion control algorithm it
351    is worth noting that IP-TFS is not providing for reliable delivery of
352    IP traffic and so per-packet ACKs are not required, and are not
353    provided.
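This document deliberately leaves the congestion control algorithm unspecified. Purely as an illustration of how a sender might adjust its send frequency, the sketch below applies an additive-increase/multiplicative-decrease (AIMD) rule to the packet send rate. AIMD, the function name, and every parameter value are assumptions made for this example and are not part of IP-TFS. The sketch also assumes the sender periodically learns how many encapsulating packets were lost out of a known number sent.

```python
def adjust_send_rate(current_pps, drop_count, packets_in_range,
                     min_pps=1.0, max_pps=1000.0,
                     increase_pps=1.0, decrease_factor=0.5):
    """Return a new packet send rate in packets per second.

    Hypothetical AIMD rule (not specified by the document): halve the rate
    when any loss is reported, otherwise probe upward by a fixed step. The
    packet size is never changed; only the send frequency is adjusted.
    """
    if packets_in_range <= 0:
        # No packets were covered by this feedback, so keep the current rate.
        return current_pps
    if drop_count > 0:
        new_rate = current_pps * decrease_factor  # multiplicative decrease
    else:
        new_rate = current_pps + increase_pps     # additive increase
    # Clamp to the configured bounds (max_pps models the requested bandwidth).
    return max(min_pps, min(max_pps, new_rate))


rate = 100.0
rate = adjust_send_rate(rate, drop_count=0, packets_in_range=50)  # 101.0
rate = adjust_send_rate(rate, drop_count=3, packets_in_range=50)  # 50.5
print(rate)
```

Because only the send frequency changes and the packet size stays fixed, a rule of this shape stays consistent with the design choice described above; any real implementation would still need to be checked against the guidelines in [RFC2914].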
355    It's worth noting that the adjustable rate of sending over the
356    congestion controlled IP-TFS tunnel is being controlled by the
357    network congestion. As long as the encapsulated traffic flow shape
358    and timing are not directly affecting the network congestion, the
359    variations in the tunnel rate will not weaken the provided traffic
360    flow confidentiality.

362 2.5.2.1. Circuit Breakers

364    In addition to congestion control, implementations MAY choose to
365    define and implement circuit breakers [RFC8084] as a recovery method
366    of last resort. Enabling circuit breakers is also a reason a user
367    may wish to enable congestion information reports even when using the
368    non-congestion controlled mode of operation. The definition of
369    circuit breakers is outside the scope of this document.

371 3. Congestion Information

373    In order to support the congestion controlled mode, the receiver
374    (egress tunnel endpoint) MUST send regular packet drop reports to the
375    sender (ingress tunnel endpoint). These reports indicate the number
376    of packet drops during a sequence of packets. The sequence or range
377    of packets is identified using the start and end ESP sequence numbers
378    of the packet range.

380    These congestion information reports MAY also be sent when in the
381    non-congestion controlled mode to allow for reporting from the
382    sending device or to implement Circuit Breakers [RFC8084].

384    The congestion information is sent using IKEv2 INFORMATIONAL
385    notifications [RFC7296]. These notifications are sent at a
386    configured interval (which can be configured to 0 to disable the
387    sending of the reports).

389 3.1. ECN Support

391    In addition to normal packet loss information, IP-TFS supports use
392    of the ECN bits in the encapsulating IP header [RFC3168] for
393    identifying congestion.
If ECN use is enabled and a packet arrives 394 at the egress endpoint with the Congestion Experienced (CE) value 395 set, then the receiver records that packet as being dropped, although 396 it does not drop it. When the CE information is used to calculate 397 the packet drop count the receiver also sets the E bit in the 398 congestion information notification data. In order to respond 399 quickly to the congestion indication the receiver MAY immediately 400 send a congestion information notification to the sender upon 401 receiving a packet with the CE indication. This additional immediate 402 send SHOULD only be done once per normal congestion information 403 sending interval though. 405 As noted in [RFC3168] the ECN bits are not protected by IPsec and 406 thus may constitute a covert channel. For this reason ECN use SHOULD 407 NOT be enabled by default. 409 4. Configuration 411 IP-TFS is meant to be deployable with a minimal amount of 412 configuration. All IP-TFS specific configuration (i.e., in addition 413 to the underlying IPsec tunnel configuration) should be able to be 414 specified at the tunnel ingress (sending) side alone (i.e., single- 415 ended provisioning). 417 4.1. Bandwidth 419 Bandwidth is a local configuration option. For non-congestion 420 controlled mode the bandwidth SHOULD be configured. For congestion 421 controlled mode one can configure the bandwidth or have no 422 configuration and let congestion control discover the maximum 423 bandwidth available. No standardized configuration method is 424 required. 426 4.2. Fixed Packet Size 428 The fixed packet size to be used for the tunnel encapsulation packets 429 can be configured manually or can be automatically determined using 430 Path MTU discovery (see [RFC1191] and [RFC8201]). No standardized 431 configuration method is required. 433 4.3. 
Congestion Information Configuration

435    If congestion control mode is to be used, or if the user wishes to
436    receive congestion information on the sender for circuit breaking or
437    other operational notifications in the non-congestion controlled
438    mode, IP-TFS will need to configure the egress tunnel endpoint to
439    send congestion information periodically.

441    In order to configure the sending interval of periodic congestion
442    information on the egress tunnel endpoint, we utilize the IKEv2
443    Configuration Payload (CP) [RFC7296]. Implementations MAY also allow
444    for manual (or default) configuration of this interval; however,
445    implementations of IP-TFS MUST support configuration using the IKEv2
446    exchange described below.

448    We utilize a new IKEv2 configuration attribute TFS_INFO_INTERVAL
449    (TBD) to configure the sending interval from the egress endpoint of
450    the tunnel. This value is configured using a CFG_REQUEST payload and
451    is acknowledged by the receiver using a CFG_REPLY payload. This
452    configuration exchange SHOULD be sent during the IKEv2 configuration
453    exchanges occurring as the tunnel is first brought up. The sending
454    interval value MAY also be changed at any time afterwards using a
455    similar CFG_REQUEST/CFG_REPLY payload inside an IKEv2 INFORMATIONAL
456    exchange.

458    In the absence of a congestion information configuration exchange the
459    sending interval is up to the receiving device configuration.

461    The sending interval value is given in milliseconds and is 16 bits
462    wide; however, values below 1/10th of a second (100 milliseconds)
463    are not recommended, as they could lead to early exhaustion of the
464    Message ID field used in the IKEv2 INFORMATIONAL exchange to send the
465    congestion information.

467    {{question: Could we get away with sending the info using the same
468    message ID each time? We have a timestamp that would allow for
469    duplicate detection, and the payload will be authenticated by IKEv2.
470    }}

472    A sending interval value of 0 disables sending of the congestion
473    information.

475 5. Packet and Data Formats

477 5.1. IPSec

479 5.1.1. Payload Format

481                         1                   2                   3
482     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
483    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
484    |V|          Reserved           |           BlockOffset           |
485    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
486    | DataBlocks ...
487    +-+-+-+-+-+-+-+-+-+-+-

489    V:
490       A 1 bit version field that MUST be set to zero. If received as
491       one the packet MUST be dropped.

493    Reserved:
494       A 15 bit field set to 0 and ignored on receipt.

496    BlockOffset:
497       A 16 bit unsigned integer counting the number of octets following
498       this 32 bit header before the next data block. It can also point
499       past the end of the containing packet in which case the data
500       entirely belongs to the previous data block. If the offset
501       extends into subsequent packets the subsequent 32 bit IP-TFS
502       headers are not counted by this value.

504    DataBlocks:
505       Variable number of octets that constitute the start of a data
506       block or the continuation of a previous data block.

508 5.1.2. Data Blocks

510                         1                   2                   3
511     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
512    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
513    | Type  | IPv4, IPv6 or pad...
514    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

516    Type:
517       A 4 bit field where 0x0 identifies a pad data block, 0x4 indicates
518       an IPv4 data block, and 0x6 indicates an IPv6 data block.

520 5.1.2.1. IPv4 Data Block

522                         1                   2                   3
523     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
524    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
525    |  0x4  |  IHL  | TypeOfService |           TotalLength           |
526    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
527    | Rest of the inner packet ...
528    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

530    These values are the actual values within the encapsulated IPv4
531    header.
In other words, the start of this data block is the start of
532    the encapsulated IP packet.

534    Type:
535       A 4 bit value of 0x4 indicating IPv4 (i.e., first nibble of the
536       IPv4 packet).

538    TotalLength:
539       The 16 bit unsigned integer length field of the IPv4 inner packet.

541 5.1.2.2. IPv6 Data Block

543                         1                   2                   3
544     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
545    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
546    |  0x6  | TrafficClass  |                FlowLabel                |
547    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
548    |          TotalLength          | Rest of the inner packet ...
549    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

551    These values are the actual values within the encapsulated IPv6
552    header. In other words, the start of this data block is the start of
553    the encapsulated IP packet.

555    Type:
557       A 4 bit value of 0x6 indicating IPv6 (i.e., first nibble of the
558       IPv6 packet).

560    TotalLength:
561       The 16 bit unsigned integer length field of the inner IPv6
562       packet.

564 5.1.2.3. Pad Data Block

566                         1                   2                   3
567     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
568    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
569    |  0x0  | Padding ...
570    +-+-+-+-+-+-+-+-+-+-+-

572    Type:
573       A 4 bit value of 0x0 indicating a padding data block.

575    Padding:
576       Extends to the end of the encapsulating packet.

578 5.2. IKEv2

580 5.2.1. IKEv2 Congestion Information Configuration Attribute

582    The following defines the configuration attribute structure used in
583    the IKEv2 [RFC7296] configuration exchange to set the congestion
584    information report sending interval.
586                         1                   2                   3
587     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
588    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
589    |R|       Attribute Type        |             Length              |
590    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
591    |         SendInterval          |
592    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

594    R:
595       1 bit set to 0.

597    Attribute Type:
598       15 bit value set to TFS_INFO_INTERVAL (TBD).

600    Length:
601       2 octet length set to 2.

603    SendInterval:
604       A 2 octet unsigned integer. The sending interval in milliseconds.

606 5.2.2. IKEv2 Congestion Information Notification Data

608    We utilize a send-only (i.e., no response expected) IKEv2
609    INFORMATIONAL exchange (37) to transmit the congestion information
610    using a notification payload of type TFS_CONGEST_INFO (TBD). The
611    Response bit should be set to 0. As no response is expected the only
612    payload should be the congestion information in the notification
613    payload. The following diagram defines the notification payload
614    data.

616                         1                   2                   3
617     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
618    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
619    |E|  Reserved   |                    DropCount                    |
620    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
621    |                            Timestamp                            |
622    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
623    |                           AckSeqStart                           |
624    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
625    |                            AckSeqEnd                            |
626    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

628    E:
629       A 1 bit value that if set indicates that packet[s] with Congestion
630       Experienced (CE) ECN bits set were received and used in
631       calculating the DropCount value.

633    Reserved:
634       A 7 bit field set to 0 and ignored on receipt.

636    DropCount:
637       A 24 bit unsigned integer count of the drops that occurred between
638       AckSeqStart and AckSeqEnd.
If the drops exceed the resolution of 639 the counter then set to the maximum value (i.e., 0xFFFFFF). 641 AckSeqStart: 642 A 32 bit unsigned integer containing the first ESP sequence number 643 (as defined in [RFC4303]) of the packet range that this 644 information relates to. 646 AckSeqEnd: 647 A 32 bit unsigned integer containing the last ESP sequence number 648 (as defined in [RFC4303]) of the packet range that this 649 information relates to. 651 Timestamp: 652 A 32 bit unsigned integer containing the lower 32 bits of a 653 running monotonic millisecond timer of when this notification data 654 was created/sent. This value is used to determine duplicates and 655 drop counts of this information. Implementations should deal with 656 wrapping of this timer value. 658 6. IANA Considerations 660 This document requests a protocol number IPTFS_PROTOCOL be allocated 661 by IANA from "Assigned Internet Protocol Numbers" registry for 662 identifying the IP-TFS ESP payload format. 664 Type: TBD Description: IP-TFS ESP payload format. Reference: This 665 document 667 Additionally this document requests an attribute value 668 TFS_INFO_INTERVAL (TBD) be allocated by IANA from "IKEv2 669 Configuration Payload Attribute Types" registry. 671 Type: TBD Description: The sending rate of congestion information 672 from egress tunnel endpoint. Reference: This document 674 Additionally this document requests a notify message status type 675 TFS_CONGEST_INFO (TBD) be allocated by IANA from "IKEv2 Notify 676 Message Types - Status Types" registry. 678 Type: TBD Description: The sending rate of congestion information 679 from egress tunnel endpoint. Reference: This document 681 7. Security Considerations 683 This document describes a mechanism to add Traffic Flow 684 Confidentiality to IP traffic. Use of this mechanism is expected to 685 increase the security of the traffic being transported. 
Other than 686 the additional security afforded by using this mechanism, IP-TFS 687 utilizes the security protocols [RFC4303] and [RFC7296] and so their 688 security considerations apply to IP-TFS as well. 690 As noted previously in Section 2.5.2, for TFC to be fully maintained 691 the encapsulated traffic flow should not be affecting network 692 congestion in a predictable way, and if it would be then non- 693 congestion controlled mode use should be considered instead. 695 8. References 697 8.1. Normative References 699 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 700 Requirement Levels", BCP 14, RFC 2119, 701 DOI 10.17487/RFC2119, March 1997, 702 . 704 [RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)", 705 RFC 4303, DOI 10.17487/RFC4303, December 2005, 706 . 708 [RFC7296] Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T. 709 Kivinen, "Internet Key Exchange Protocol Version 2 710 (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October 711 2014, . 713 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 714 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 715 May 2017, . 717 8.2. Informative References 719 [AppCrypt] 720 Schneier, B., "Applied Cryptography: Protocols, 721 Algorithms, and Source Code in C", 11 2017. 723 [I-D.iab-wire-image] 724 Trammell, B. and M. Kuehlewind, "The Wire Image of a 725 Network Protocol", draft-iab-wire-image-01 (work in 726 progress), November 2018. 728 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 729 DOI 10.17487/RFC0791, September 1981, 730 . 732 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 733 DOI 10.17487/RFC1191, November 1990, 734 . 736 [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, 737 "Definition of the Differentiated Services Field (DS 738 Field) in the IPv4 and IPv6 Headers", RFC 2474, 739 DOI 10.17487/RFC2474, December 1998, 740 . 

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, DOI 10.17487/RFC2914, September 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
              December 2005.

   [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
              "Encapsulating MPLS in UDP", RFC 7510,
              DOI 10.17487/RFC7510, April 2015.

   [RFC8084]  Fairhurst, G., "Network Transport Circuit Breakers",
              BCP 208, RFC 8084, DOI 10.17487/RFC8084, March 2017.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

Appendix A.  Comparisons of IP-TFS

A.1.  Comparing Overhead

A.1.1.  IP-TFS Overhead

   The overhead of IP-TFS is 40 bytes per outer packet.  Therefore the
   octet overhead per inner packet is 40 divided by the number of inner
   packets carried per outer packet (fractional allowed).  The overhead
   as a percentage of inner packet size is a constant based on the
   Outer MTU size.

      OH = 40 / (Outer Payload Size / Inner Packet Size)
      OH % of Inner Packet Size = 100 * OH / Inner Packet Size
      OH % of Inner Packet Size = 4000 / Outer Payload Size

      Type     IP-TFS   IP-TFS   IP-TFS
      MTU         576     1500     9000
      PSize       536     1460     8960
      -------------------------------
      40        7.46%    2.74%    0.45%
      576       7.46%    2.74%    0.45%
      1500      7.46%    2.74%    0.45%
      9000      7.46%    2.74%    0.45%

     Figure 4: IP-TFS Overhead as Percentage of Inner Packet Size

A.1.2.  ESP with Padding Overhead

   The overhead per inner packet for constant-send-rate padded ESP
   (i.e., traditional IPsec TFC) is 36 octets plus any padding, unless
   fragmentation is required.

   When fragmentation of the inner packet is required to fit in the
   outer IPsec packet, the overhead is the number of outer packets
   required to carry the fragmented inner packet times the sum of the
   inner IP overhead (20) and the outer packet overhead (36), minus
   the initial inner IP overhead, plus any required tail padding in
   the last encapsulation packet.  The required tail padding is the
   number of required packets times the difference of the Outer
   Payload Size and the IP Overhead, minus the Inner Payload Size.
   So:

      Inner Payload Size = Inner Packet Size - IP Overhead
      Outer Payload Size = MTU - IPsec Overhead

                     Inner Payload Size
      NF0 = ----------------------------------
             Outer Payload Size - IP Overhead

      NF = CEILING(NF0)

      OH = NF * (IP Overhead + IPsec Overhead)
           - IP Overhead
           + NF * (Outer Payload Size - IP Overhead)
           - Inner Payload Size

      OH = NF * (IPsec Overhead + Outer Payload Size)
           - (IP Overhead + Inner Payload Size)

      OH = NF * (IPsec Overhead + Outer Payload Size)
           - Inner Packet Size

A.2.  Overhead Comparison

   The following tables collect the overhead values for some common L3
   MTU sizes in order to compare them.  The first table is the number
   of octets of overhead for a given L3 MTU sized packet.  The second
   table is the percentage of overhead in the same MTU sized packet.
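
   The formulas of A.1.1 and A.1.2 that underlie these values can be
   sketched in Python.  This is an illustrative sketch, not part of
   the specification: the function and constant names are
   hypothetical, and the sketch applies the A.1.2 formula literally,
   so for packets that need no fragmentation (NF = 1) it counts the
   36 octets of IPsec overhead in addition to the padding.

```python
# Hypothetical helper names; constants are the values stated in A.1.1 and A.1.2.
IP_OVERHEAD = 20      # inner IP header octets (A.1.2)
IPSEC_OVERHEAD = 36   # outer overhead for padded ESP (A.1.2)
IPTFS_OVERHEAD = 40   # overhead per outer IP-TFS packet (A.1.1)


def iptfs_overhead(inner_size, mtu):
    """Octets of IP-TFS overhead attributed to one inner packet."""
    outer_payload = mtu - IPTFS_OVERHEAD
    # 40 octets per outer packet, shared by the inner packets it carries.
    return IPTFS_OVERHEAD * inner_size / outer_payload


def iptfs_overhead_percent(mtu):
    """Overhead as a percentage of inner packet size (constant per MTU)."""
    return 100 * IPTFS_OVERHEAD / (mtu - IPTFS_OVERHEAD)


def esp_pad_overhead(inner_size, mtu):
    """Octets of overhead for constant-send-rate padded ESP."""
    outer_payload = mtu - IPSEC_OVERHEAD
    inner_payload = inner_size - IP_OVERHEAD
    per_fragment = outer_payload - IP_OVERHEAD
    # NF = CEILING(NF0), computed with integer arithmetic.
    nf = -(-inner_payload // per_fragment)
    return nf * (IPSEC_OVERHEAD + outer_payload) - inner_size
```

   For example, esp_pad_overhead(576, 576) evaluates to 576 (two outer
   packets are needed), while iptfs_overhead(1460, 1500) evaluates to
   40.0.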

      Type     ESP+Pad  ESP+Pad  ESP+Pad   IP-TFS  IP-TFS  IP-TFS
      L3 MTU       576     1500     9000      576    1500    9000
      PSize        540     1464     8964      536    1460    8960
      -----------------------------------------------------------
      40           500     1424     8924      3.0     1.1     0.2
      128          412     1336     8836      9.6     3.5     0.6
      256          284     1208     8708     19.1     7.0     1.1
      536            4      928     8428     40.0    14.7     2.4
      576          576      888     8388     43.0    15.8     2.6
      1460         268        4     7504    109.0    40.0     6.5
      1500         228     1500     7464    111.9    41.1     6.7
      8960        1408     1540        4    668.7   245.5    40.0
      9000        1368     1500     9000    671.6   246.6    40.2

                 Figure 5: Overhead comparison in octets

      Type     ESP+Pad  ESP+Pad   ESP+Pad  IP-TFS  IP-TFS  IP-TFS
      L3 MTU       576     1500      9000     576    1500    9000
      PSize        540     1464      8964     536    1460    8960
      -----------------------------------------------------------
      40       1250.0%  3560.0%  22310.0%   7.46%   2.74%   0.45%
      128       321.9%  1043.8%   6903.1%   7.46%   2.74%   0.45%
      256       110.9%   471.9%   3401.6%   7.46%   2.74%   0.45%
      536         0.7%   173.1%   1572.4%   7.46%   2.74%   0.45%
      576       100.0%   154.2%   1456.2%   7.46%   2.74%   0.45%
      1460       18.4%     0.3%    514.0%   7.46%   2.74%   0.45%
      1500       15.2%   100.0%    497.6%   7.46%   2.74%   0.45%
      8960       15.7%    17.2%      0.0%   7.46%   2.74%   0.45%
      9000       15.2%    16.7%    100.0%   7.46%   2.74%   0.45%

         Figure 6: Overhead as Percentage of Inner Packet Size

A.3.  Comparing Available Bandwidth

   Another way to compare the two solutions is to look at the amount
   of available bandwidth each solution provides.  The following
   sections consider and compare the percentage of available
   bandwidth.  To provide a well-understood baseline, we also include
   normal (unencrypted) Ethernet and normal ESP values.

A.3.1.  Ethernet

   In order to calculate the available bandwidth we first calculate
   the per packet overhead.  The total overhead of Ethernet is 14+4
   octets of header and CRC plus an additional 20 octets of framing
   (preamble, start, and inter-packet gap), for a total of 38 octets.
   Additionally, the minimum payload is 46 octets.
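
   The per-packet arithmetic behind the bandwidth comparison can be
   sketched in Python.  This is a minimal sketch under stated
   assumptions: the helper names are hypothetical, the Ethernet
   overhead is taken as 14 + 4 + 20 octets (the Enet OH value of 38
   used in the tables), the IP-TFS and ESP overheads are the 40 and 36
   octets from A.1, and the IP-TFS functions take the L3 MTU (the L2
   MTU in the tables minus the 14 octet Ethernet header).

```python
# Hypothetical helper names; a sketch of the A.3.1 arithmetic only.
ENET_OVERHEAD = 14 + 4 + 20    # header + CRC + framing, in octets
ENET_MIN_PAYLOAD = 46          # minimum Ethernet payload, in octets
ESP_OVERHEAD = 36              # plain ESP overhead (A.1.2)
IPTFS_OVERHEAD = 40            # IP-TFS overhead per outer packet (A.1.1)
LINK_BPS = 10_000_000_000      # 10G Ethernet


def enet_l2_octets(size):
    """L2 octets on the wire for one unencrypted packet of `size` octets."""
    return max(size, ENET_MIN_PAYLOAD) + ENET_OVERHEAD


def esp_l2_octets(size):
    """L2 octets for one packet sent with normal (unpadded) ESP."""
    return max(size + ESP_OVERHEAD, ENET_MIN_PAYLOAD) + ENET_OVERHEAD


def iptfs_l2_octets(size, l3_mtu):
    """L2 octets attributed to one inner packet inside IP-TFS frames."""
    outer_payload = l3_mtu - IPTFS_OVERHEAD
    # Each outer frame costs l3_mtu + ENET_OVERHEAD octets and carries
    # outer_payload octets of inner data (fractional packets allowed).
    return size * (l3_mtu + ENET_OVERHEAD) / outer_payload


def packets_per_second(l2_octets, link_bps=LINK_BPS):
    """Inner packets per second at the given link rate."""
    return link_bps / (8 * l2_octets)


def bandwidth_percent(size, l2_octets):
    """Share of link bandwidth carrying inner packet octets."""
    return 100 * size / l2_octets
```

   For instance, bandwidth_percent(40, enet_l2_octets(40)) is about
   47.62, while bandwidth_percent(40, iptfs_l2_octets(40, 1500)) is
   about 94.93, which illustrates how aggregation recovers Ethernet
   framing overhead for small packets.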

      Size   E + P  E + P  E + P  IPTFS  IPTFS  IPTFS   Enet    ESP
      MTU      590   1514   9014    590   1514   9014    any    any
      OH        74     74     74     78     78     78     38     74
      ------------------------------------------------------------
      40       614   1538   9038     45     42     40     84    114
      128      614   1538   9038    146    134    129    166    202
      256      614   1538   9038    293    269    258    294    330
      536      614   1538   9038    614    564    540    574    610
      576     1228   1538   9038    659    606    581    614    650
      1460    1842   1538   9038   1672   1538   1472   1498   1534
      1500    1842   3076   9038   1718   1580   1513   1538   1574
      8960   11052  10766   9038  10263   9438   9038   8998   9034
      9000   11052  10766  18076  10309   9480   9078   9038   9074

                    Figure 7: L2 Octets Per Packet

      Size   E + P  E + P  E + P  IPTFS  IPTFS  IPTFS   Enet    ESP
      MTU      590   1514   9014    590   1514   9014    any    any
      OH        74     74     74     78     78     78     38     74
      --------------------------------------------------------------
      40      2.0M   0.8M   0.1M  27.3M  29.7M  31.0M  14.9M  11.0M
      128     2.0M   0.8M   0.1M   8.5M   9.3M   9.7M   7.5M   6.2M
      256     2.0M   0.8M   0.1M   4.3M   4.6M   4.8M   4.3M   3.8M
      536     2.0M   0.8M   0.1M   2.0M   2.2M   2.3M   2.2M   2.0M
      576     1.0M   0.8M   0.1M   1.9M   2.1M   2.2M   2.0M   1.9M
      1460    678K   812K   138K   747K   812K   848K   834K   814K
      1500    678K   406K   138K   727K   791K   826K   812K   794K
      8960    113K   116K   138K   121K   132K   138K   138K   138K
      9000    113K   116K    69K   121K   131K   137K   138K   137K

              Figure 8: Packets Per Second on 10G Ethernet

   Size   E + P   E + P   E + P   IPTFS   IPTFS   IPTFS    Enet     ESP
   MTU      590    1514    9014     590    1514    9014     any     any
   OH        74      74      74      78      78      78      38      74
   ----------------------------------------------------------------------
   40     6.51%   2.60%   0.44%  87.30%  94.93%  99.14%  47.62%  35.09%
   128   20.85%   8.32%   1.42%  87.30%  94.93%  99.14%  77.11%  63.37%
   256   41.69%  16.64%   2.83%  87.30%  94.93%  99.14%  87.07%  77.58%
   536   87.30%  34.85%   5.93%  87.30%  94.93%  99.14%  93.38%  87.87%
   576   46.91%  37.45%   6.37%  87.30%  94.93%  99.14%  93.81%  88.62%
   1460  79.26%  94.93%  16.15%  87.30%  94.93%  99.14%  97.46%  95.18%
   1500  81.43%  48.76%  16.60%  87.30%  94.93%  99.14%  97.53%  95.30%
   8960  81.07%  83.22%  99.14%  87.30%  94.93%  99.14%  99.58%  99.18%
   9000  81.43%  83.60%  49.79%  87.30%  94.93%  99.14%  99.58%  99.18%

           Figure 9: Percentage of Bandwidth on 10G Ethernet

   A sometimes unexpected result of using IP-TFS (or any packet
   aggregating tunnel) is that, for small to medium sized packets, the
   available bandwidth is actually greater than native Ethernet.  This
   is due to the reduction in Ethernet framing overhead.  This
   increased bandwidth is paid for with an increase in latency.  This
   latency is the time to send the unrelated octets in the outer
   tunnel frame.  The following table illustrates the latency for some
   common values on a 10G Ethernet link.  The table also includes
   latency introduced by padding if using ESP with padding.

               ESP+Pad   ESP+Pad    IP-TFS    IP-TFS
                  1500      9000      1500      9000
      ------------------------------------------
      40       1.14 us   7.14 us   1.17 us   7.17 us
      128      1.07 us   7.07 us   1.10 us   7.10 us
      256      0.97 us   6.97 us   1.00 us   7.00 us
      536      0.74 us   6.74 us   0.77 us   6.77 us
      576      0.71 us   6.71 us   0.74 us   6.74 us
      1460     0.00 us   6.00 us   0.04 us   6.04 us
      1500     1.20 us   5.97 us   0.00 us   6.00 us

                        Figure 10: Added Latency

   Notice that the latency values are very similar between the two
   solutions; however, whereas IP-TFS provides for constant high
   bandwidth, in some cases even exceeding native Ethernet, ESP with
   padding often greatly reduces available bandwidth.

Appendix B.  Acknowledgements

   We would like to thank Don Fedyk for help in reviewing this work.

Appendix C.  Contributors

   The following people made significant contributions to this
   document.

   Lou Berger
   LabN Consulting, L.L.C.

   Email: lberger@labn.net

Author's Address

   Christian Hopps
   LabN Consulting, L.L.C.

   Email: chopps@chopps.org