Network                                                        A. Antony
Internet-Draft                                               S. Klassert
Intended status: Standards Track                                 secunet
Expires: May 6, 2021                                          P. Wouters
                                                                 Red Hat
                                                        November 2, 2020


                 IKEv2 support for per-queue Child SAs
                 draft-pwouters-multi-sa-performance-00

Abstract

   This document defines two Notify payloads (NUM_QUEUES and
   QUEUE_INFO) for the Internet Key Exchange Protocol Version 2 (IKEv2).
   These payloads add support for negotiating multiple identical Child
   SAs that can be used to optimize performance based on the number of
   queues or CPUs, or to create multiple Child SAs for different Quality
   of Service (QoS) levels.

   Using multiple identical Child SAs has the additional benefit that
   each stream has its own sequence number space, so CPUs do not have to
   synchronize their crypto state or disable their replay window
   detection.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 6, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Performance bottlenecks
   3.  Negotiation of performance specific Child SAs
   4.  Implementation specifics
     4.1.  One Child per CPU
     4.2.  QoS Child SAs
   5.  Payload Format
     5.1.  NUM_QUEUES Notify Payload
     5.2.  QUEUE_INFO Notify Payload
   6.  Security Considerations
   7.  Implementation Status
     7.1.  Linux XFRM
     7.2.  Libreswan
     7.3.  strongSWAN
   8.  IANA Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   IPsec implementations are currently limited to using one queue or CPU
   per Child SA. The result is that a machine with many queues/CPUs is
   limited to using only one of these per Child SA. This severely limits
   the speeds that can be obtained.
   An unencrypted link of 10 Gbps or more is commonly reduced to
   2-3 Gbps when IPsec is used to encrypt the link, for example when
   using AES-GCM.

   Furthermore, IPsec implementations are currently limited to using
   the same Child SA for all Quality of Service (QoS) types because the
   QoS type is not part of the Traffic Selector (TS). The result is
   that IPsec cannot do active QoS prioritization without disabling
   anti-replay detection.

   To make better use of multiple network queues and CPUs, it can be
   beneficial to negotiate and install multiple identical Child SAs.
   IKEv2 [RFC7296] already allows installing multiple identical Child
   SAs, but implementations will often assume the older Child SA is
   being replaced by the newer Child SA, even when no INITIAL_CONTACT
   Notify payload was received.

   When two IKEv2 peers want to negotiate multiple Child SAs, it would
   be useful for them to convey how many of these are considered
   acceptable to install. This avoids triggering CREATE_CHILD_SA
   exchanges that will be rejected with TS_UNACCEPTABLE.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Performance bottlenecks

   Currently, most IPsec implementations are limited by using one CPU or
   network queue per Child SA. There are a number of performance
   reasons for this, but a key limitation is that sharing the AEAD
   state, counters and sequence numbers between multiple CPUs is not
   feasible without a significant performance penalty. There is a need
   to negotiate and establish multiple Child SAs with identical TSi/TSr
   on a per-queue or per-CPU basis.

3.  Negotiation of performance specific Child SAs

   The NUM_QUEUES Notify payload conveys the number of Child SA
   instances for this particular TSi/TSr combination. Both ends send
   their preferred number of Child SAs and the maximum number of Child
   SAs they are willing to install. Both ends pick the highest
   preferred number, up to the lowest maximum number. For example, if
   one end prefers 16 but accepts 32, and the other end prefers 48 and
   accepts 48, the number picked is 32. If a 33rd Child SA is
   attempted, the peer with the maximum of 32 SHOULD return
   TS_UNACCEPTABLE.

   The NUM_QUEUES Notify is sent as part of the IKE_AUTH or
   CREATE_CHILD_SA message that contains the Traffic Selector payload
   for a new Child SA. If there are multiple IKE_AUTH exchanges, such
   as when using EAP, the TSi/TSr payloads and the Notify payloads
   defined in this document only appear in the first IKE_AUTH message.
   In CREATE_CHILD_SA, the NUM_QUEUES Notify MUST only be sent in
   messages for a new set of Child SAs (the message used to set up the
   Head SA).

   The QUEUE_INFO Notify MUST only be sent in CREATE_CHILD_SA for Sub
   SAs. In a CREATE_CHILD_SA sent for a Child SA rekey, the QUEUE_INFO
   MAY be included. If it is included, it MUST be the same as for the
   Child SA being rekeyed.

4.  Implementation specifics

   There are various considerations that an implementation could use to
   determine the best way to install the multiple Child SAs. Below are
   examples of such strategies.

4.1.  One Child per CPU

   A simple distribution could be to install one Child SA per CPU. Note
   that at least one of the Child SAs must be the "fallback" in case
   there is no specific Child SA on a specific CPU. This fallback is
   called the Head SA; a per-CPU Child SA is called a Sub SA. The
   initial Child SA negotiated with IKE becomes the Head SA.
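
   As an illustration only, the selection rule from Section 3 and the
   Head SA fallback described above could be sketched in Python as
   follows. The names negotiate_sub_sa_count, ChildSaSet and
   outbound_spi, and the representation of SAs as integer SPIs keyed by
   CPU number, are assumptions made for this sketch; they are not
   defined by this document or taken from any implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


def negotiate_sub_sa_count(local_pref: int, local_max: int,
                           peer_pref: int, peer_max: int) -> int:
    """Pick the highest preferred number, capped by the lowest maximum."""
    # Section 5.1: a received 0 is interpreted as 1. For simplicity this
    # sketch applies the same clamp to the local values as well.
    l_pref, l_max, p_pref, p_max = (
        max(1, v) for v in (local_pref, local_max, peer_pref, peer_max)
    )
    return min(max(l_pref, p_pref), min(l_max, p_max))


@dataclass
class ChildSaSet:
    """Hypothetical container: one Head SA plus optional per-CPU Sub SAs."""
    head_spi: int
    sub_spi_by_cpu: Dict[int, int] = field(default_factory=dict)

    def outbound_spi(self, cpu: int) -> int:
        # Use the Sub SA bound to this CPU if one exists,
        # otherwise fall back to the Head SA.
        return self.sub_spi_by_cpu.get(cpu, self.head_spi)
```

   With the values from the example in Section 3,
   negotiate_sub_sa_count(16, 32, 48, 48) yields 32, and a CPU that has
   no Sub SA is handed the SPI of the Head SA.
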
   The Head SA ensures that any CPU generating traffic to be encrypted
   has an available (if not optimal) Child SA to use. Any subsequent
   Child SAs with identical TSi/TSr are considered Sub SAs and are
   installed to be used only by a single CPU.

   Implementations supporting per-CPU SAs SHOULD extend their mechanism
   of on-demand negotiation that is triggered by traffic to include a
   CPU (or queue) identifier in their ACQUIRE message from the SPD to
   the IKE daemon (e.g., via NETLINK or PFKEYv2). If the kernel's
   ACQUIRE message does not support sending a per-CPU identifier, then
   the IKE daemon should initiate all its Child SAs immediately upon
   receiving an ACQUIRE.

   Performing per-CPU Child SA negotiations can result in both peers
   initiating Sub SAs at once. This is especially likely in the per-CPU
   ACQUIRE case. Responders should install the Sub SA on the CPU with
   the fewest Sub SAs for this TSi/TSr pair, counting each outstanding
   ACQUIRE as an assigned Sub SA. It is still possible that, when the
   peers have only one slot left to assign, both peers send an ACQUIRE
   at the same time. The initiator that receives the CREATE_CHILD_SA
   response last, i.e., the initiator of the slower duplicate, MAY send
   a Delete to remove the duplicate Child SA.

   As an optimization, Sub SAs that see little traffic MAY be deleted.
   However, an implementation MUST NOT delete an idle Head SA. This
   ensures both peers always have a Child SA that can be used by a CPU
   that does not have a Sub SA (yet) and ensures encrypted traffic can
   always be exchanged, even when that traffic triggered a new per-CPU
   ACQUIRE.

   When the number of queues or CPUs differs between the peers, the
   peer with fewer queues or CPUs MAY decide not to install a second
   outbound Child SA, as it will never use it to send traffic.
   However, it MUST install all inbound Child SAs, as it cannot predict
   which of these the other peer will use to send traffic. It MUST NOT
   generate an error when deleting the (missing) outbound SA component
   of the Child SA.

   A per-CPU ACQUIRE message SHOULD still send the Traffic Selector
   (TSi) information of the trigger packet. This information MAY be
   used by the responder to select the most efficient target CPU to use.
   For example, if the trigger packet was for TCP destination port 25
   (SMTP), it might be able to install the Child SA on the CPU that is
   also running the mail server process. See [RFC7296] Section 2.9.

   The QUEUE_INFO Notify payload MAY be sent in the CREATE_CHILD_SA
   request for the additional (Sub SA) Child SAs. It can be used to
   convey the QoS stream or CPU ID.

   [Clarify narrowing Traffic Selectors. Should it be
   allowed/forbidden?]

   [Clarify CP / INTERNAL_ADDRESS. Should it be allowed/forbidden?]

   [UDP encap: Due to the way UDP-encapsulated ESP is handled at the
   receiver's NIC queues and at intermediate routers with parallel
   paths, UDP-encapsulated ESP will use multiple source ports. NOTE:
   this is implemented in libreswan on Linux XFRM.]

4.2.  QoS Child SAs

   To install multiple Child SAs for different QoS levels, a method
   similar to per-CPU is used. The initial Child SA is used for all QoS
   levels not matched by more specific Child SAs. Additional Child SAs
   are installed per QoS level, which can be done on demand if the
   kernel's IPsec subsystem can send per-QoS-level ACQUIREs to the IKE
   daemon.

   A request for a Child SA for a specific QoS value MUST include the
   QUEUE_INFO Notify payload set to the required QoS value so that both
   endpoints use the same Child SA for the same QoS level. If a
   proposed QoS level is not acceptable to the responder,
   TS_UNACCEPTABLE MUST be returned.
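
   The per-QoS behaviour described above could be sketched as follows.
   This is a minimal Python illustration, not a definitive
   implementation: the class QosChildSas, its method names, and the
   modelling of a QoS level as a plain integer are assumptions, since
   this document does not yet specify the contents of the QUEUE_INFO
   payload data.

```python
from typing import Dict, Optional, Set

# Stand-in for the IKEv2 TS_UNACCEPTABLE error notify (assumption:
# represented here as a string rather than a real Notify payload).
TS_UNACCEPTABLE = "TS_UNACCEPTABLE"


class QosChildSas:
    """Hypothetical per-QoS Child SA table for one TSi/TSr pair."""

    def __init__(self, initial_spi: int, acceptable_levels: Set[int]) -> None:
        self.initial_spi = initial_spi  # used for all unmatched QoS levels
        self.acceptable_levels = acceptable_levels
        self.spi_by_level: Dict[int, int] = {}

    def handle_request(self, qos_level: int, new_spi: int) -> Optional[str]:
        """Responder side: install the SA, or report TS_UNACCEPTABLE."""
        if qos_level not in self.acceptable_levels:
            return TS_UNACCEPTABLE
        self.spi_by_level[qos_level] = new_spi
        return None

    def outbound_spi(self, qos_level: int) -> int:
        """Sender side: level-specific SA if installed, else the initial SA."""
        return self.spi_by_level.get(qos_level, self.initial_spi)
```

   In this sketch, traffic for a QoS level without its own Child SA
   keeps using the initial Child SA, which mirrors the fallback role of
   the Head SA in the per-CPU case.
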
   During Child SA REKEY, the QUEUE_INFO Notify MAY be included but
   MUST contain the same value as the Child SA that is being rekeyed.
   [This kind of suggests this should be a TS_TYPE and not a Notify.]

5.  Payload Format

   All multi-octet fields representing integers are laid out in big
   endian order (also known as "most significant byte first", or
   "network byte order").

5.1.  NUM_QUEUES Notify Payload

   The NUM_QUEUES Notify payload is related to a Child SA, and MAY be
   exchanged in IKE_AUTH or in a CREATE_CHILD_SA for a new SA. It MUST
   NOT be sent in CREATE_CHILD_SA for REKEY. If received for a REKEY
   operation, it MUST be ignored. See [RFC7296] Section 1.3.1.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+-+-------------+-------------------------------+
   ! Next Payload  !C!  RESERVED   !         Payload Length        !
   +---------------+---------------+-------------------------------+
   !  Protocol ID  !   SPI Size    !      Notify Message Type      !
   +---------------+---------------+-------------------------------+
   ! Preferred number of IPsec SAs ! Max accepted number of SAs    !
   +-------------------------------+-------------------------------+

   o  Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0.

   o  SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0.

   o  Notify Message Type (2 octets) - set to [TBD]

   o  Preferred number of per-CPU IPsec SAs (2 octets). Value MUST be
      greater than 0. If 0 is received, it MUST be interpreted as 1.

   o  Maximum accepted number of per-CPU IPsec SAs (2 octets). Value
      MUST be greater than 0. If 0 is received, it MUST be interpreted
      as 1.

   Note: The first Child SA that is not bound to a single CPU (Head SA)
   is not counted as part of these numbers.

5.2.  QUEUE_INFO Notify Payload

   The QUEUE_INFO Notify payload is an optional payload related to a
   Child SA, and MAY be exchanged in IKE_AUTH or in a CREATE_CHILD_SA
   for a new SA.

   It MUST NOT be sent in CREATE_CHILD_SA for REKEY; see [RFC7296]
   Section 1.3.1.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+-+-------------+-------------------------------+
   ! Next Payload  !C!  RESERVED   !         Payload Length        !
   +---------------+---------------+-------------------------------+
   !  Protocol ID  !   SPI Size    !      Notify Message Type      !
   +---------------+---------------+-------------------------------+
   !                                                               !
   ~                     Optional payload data                     ~
   !                                                               !
   +-------------------------------+-------------------------------+

   o  Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0.

   o  SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0.

   o  Notify Message Type (2 octets) - set to [TBD]

   o  Optional Payload Data. This can be used to identify the QoS
      options or CPU ID. [Probably needs to be specified by this
      document]

6.  Security Considerations

   [TO DO]

7.  Implementation Status

   [Note to RFC Editor: Please remove this section and the reference to
   [RFC6982] before publication.]

   This section records the status of known implementations of the
   protocol defined by this specification at the time of posting of this
   Internet-Draft, and is based on a proposal described in [RFC7942].
   The description of implementations in this section is intended to
   assist the IETF in its decision processes in progressing drafts to
   RFCs. Please note that the listing of any individual implementation
   here does not imply endorsement by the IETF. Furthermore, no effort
   has been spent to verify the information presented here that was
   supplied by IETF contributors.
   This is not intended as, and must not
   be construed to be, a catalog of available implementations or their
   features. Readers are advised to note that other implementations may
   exist.

   According to [RFC7942], "this will allow reviewers and working groups
   to assign due consideration to documents that have the benefit of
   running code, which may serve as evidence of valuable experimentation
   and feedback that have made the implemented protocols more mature.
   It is up to the individual working groups to use this information as
   they see fit".

   Authors are requested to add a note to the RFC Editor at the top of
   this section, advising the Editor to remove the entire section before
   publication, as well as the reference to [RFC7942].

7.1.  Linux XFRM

   Organization: Linux kernel XFRM

   Name: XFRM-PCPU-v1
   https://git.kernel.org/pub/scm/linux/kernel/git/klassert/linux-stk.git/log/?h=xfrm-pcpu-v1

   Description: An initial Kernel IPsec implementation of the per-CPU
   method.

   Level of maturity: Alpha

   Coverage: Fully implements Head SA and per-CPU Sub SAs

   Licensing: GPLv2

   Implementation experience: TBD

   Contact: Linux IPsec: members@linux-ipsec.org

7.2.  Libreswan

   Organization: The Libreswan Project

   Name: pcpu-3 https://libreswan.org/wiki/XFRM_pCPU

   Description: An initial IKE implementation of the per-CPU method.

   Level of maturity: Alpha

   Coverage: implements Head SA and per-CPU Sub SAs.

   Licensing: GPLv2

   Implementation experience: TBD

   Contact: Libreswan Development: swan-dev@libreswan.org

7.3.  strongSWAN

   Organization: Secunet

   Name: XXXX https://secunet.com/somethingU

   Description: An initial IKE implementation of the per-CPU method.

   Level of maturity: Alpha

   Coverage: implements Head SA and per-CPU Sub SAs.

   Licensing: GPLv2

   Implementation experience: TBD

   Contact: Antony Antony: antony.antony@secunet.com.

8.  IANA Considerations

   This document defines two new IKEv2 Notify Message Types for the
   IANA "IKEv2 Notify Message Types - Status Types" registry.

      Value   Notify Messages - Status Types    Reference
      -----   ------------------------------    ---------------
      [TBD]   NUM_QUEUES                        [this document]
      [TBD]   QUEUE_INFO                        [this document]

                                 Figure 1

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7296]  Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T.
              Kivinen, "Internet Key Exchange Protocol Version 2
              (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296,
              October 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

9.2.  Informative References

   [RFC6982]  Sheffer, Y. and A. Farrel, "Improving Awareness of Running
              Code: The Implementation Status Section", RFC 6982,
              DOI 10.17487/RFC6982, July 2013.

   [RFC7942]  Sheffer, Y. and A. Farrel, "Improving Awareness of Running
              Code: The Implementation Status Section", BCP 205,
              RFC 7942, DOI 10.17487/RFC7942, July 2016.

Authors' Addresses

   Antony Antony
   secunet Security Networks AG

   Email: antony.antony@secunet.com


   Steffen Klassert
   secunet Security Networks AG

   Email: steffen.klassert@secunet.com


   Paul Wouters
   Red Hat

   Email: pwouters@redhat.com