IPSecME Working Group                                             Y. Nir
Internet-Draft                                               Check Point
Intended status: Standards Track                              V. Smyslov
Expires: September 22, 2016                                   ELVIS-PLUS
                                                          March 21, 2016

      Protecting Internet Key Exchange Protocol version 2 (IKEv2)
      Implementations from Distributed Denial of Service Attacks
                 draft-ietf-ipsecme-ddos-protection-05

Abstract

This document recommends implementation and configuration best
practices for Internet Key Exchange Protocol version 2 (IKEv2)
Responders, to allow them to resist Denial of Service and Distributed
Denial of Service attacks.  Additionally, the document introduces a
new mechanism called "Client Puzzles" that helps accomplish this task.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF).  Note that other groups may also distribute
working documents as Internet-Drafts.  The list of current
Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on September 22, 2016.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the
document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document.  Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.  Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Table of Contents

1.  Introduction
2.  Conventions Used in This Document
3.  The Vulnerability
4.  Defense Measures while IKE SA is being created
  4.1.  Retention Periods for Half-Open SAs
  4.2.  Rate Limiting
  4.3.  The Stateless Cookie
  4.4.  Puzzles
  4.5.  Session Resumption
  4.6.  Keeping computed Shared Keys
  4.7.  Preventing Attacks using "Hash and URL" Certificate Encoding
5.  Defense Measures after IKE SA is created
6.  Plan for Defending a Responder
7.  Using Puzzles in the Protocol
  7.1.  Puzzles in IKE_SA_INIT Exchange
    7.1.1.  Presenting Puzzle
    7.1.2.  Solving Puzzle and Returning the Solution
    7.1.3.  Computing Puzzle
    7.1.4.  Analyzing Repeated Request
    7.1.5.  Making Decision whether to Serve the Request
  7.2.  Puzzles in IKE_AUTH Exchange
    7.2.1.  Presenting Puzzle
    7.2.2.  Solving Puzzle and Returning the Solution
    7.2.3.  Computing Puzzle
    7.2.4.  Receiving Puzzle Solution
8.  Payload Formats
  8.1.  PUZZLE Notification
  8.2.  Puzzle Solution Payload
9.  Operational Considerations
10. Security Considerations
11. IANA Considerations
12. Acknowledgements
13. References
  13.1.  Normative References
  13.2.  Informative References
Authors' Addresses

1.  Introduction

Denial of Service (DoS) attacks have always been considered a serious
threat.  These attacks are usually difficult to defend against since
the amount of resources the victim has is always bounded (regardless
of how high it is) and because some resources are required for
distinguishing a legitimate session from an attack.

The Internet Key Exchange protocol version 2 (IKEv2) described in
[RFC7296] includes defense against DoS attacks.  In particular, there
is a cookie mechanism that allows the IKE Responder to effectively
defend itself against DoS attacks from spoofed IP-addresses.
However, bot-nets have become widespread, allowing attackers to
perform Distributed Denial of Service (DDoS) attacks, which are more
difficult to defend against.  This document presents recommendations
to help the Responder thwart (D)DoS attacks.  It also introduces a
new mechanism -- "puzzles" -- that can help accomplish this task.

2.  Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

3.  The Vulnerability

The IKE_SA_INIT Exchange described in Section 1.2 of [RFC7296]
involves the Initiator sending a single message.
The Responder replies with a single message and also allocates memory
for a structure called a half-open IKE Security Association (SA).
This half-open SA is later authenticated in the IKE_AUTH Exchange.  If
that IKE_AUTH request never comes, the half-open SA is kept for an
unspecified amount of time.  Depending on the algorithms used and the
implementation, such a half-open SA will use from around 100 bytes to
several thousand bytes of memory.

This creates an easy attack vector against an IKE Responder.
Generating the IKE_SA_INIT request is cheap, and sending many such
requests can either cause the Responder to allocate too many
resources and fail, or, if resource allocation is somehow throttled,
prevent legitimate Initiators from setting up IKE SAs.

An obvious defense, which is described in Section 4.2, is limiting
the number of half-open SAs opened by a single peer.  However, since
all that is required is a single packet, an attacker can use multiple
spoofed source IP addresses.

If we break down what a Responder has to do during an initial
exchange, there are three stages:

1.  When the IKE_SA_INIT request arrives, the Responder:

    *  Generates or re-uses a Diffie-Hellman (D-H) private part.

    *  Generates a Responder Security Parameter Index (SPI).

    *  Stores the private part and peer public part in a half-open SA
       database.

2.  When the IKE_AUTH request arrives, the Responder:

    *  Derives the keys from the half-open SA.

    *  Decrypts the request.

3.  If the IKE_AUTH request decrypts properly, the Responder:

    *  Validates the certificate chain (if present) in the IKE_AUTH
       request.

Yes, there's a stage 4 where the Responder actually creates Child
SAs, but when talking about (D)DoS, we never get to this stage.

Stage #1 is pretty light on CPU power, but requires some storage, and
it's very light for the Initiator as well.
Stage #2 includes private-key operations, so it's much heavier
CPU-wise.  Stage #3 includes public key operations, typically more
than one.

To attack such a Responder, an attacker can attempt to exhaust either
memory or CPU.  Without any protection, the most efficient attack is
to send multiple IKE_SA_INIT requests and exhaust memory.  This should
be easy because those requests are cheap.

There are obvious ways for the Responder to protect itself even
without changes to the protocol.  It can reduce the time that an
entry remains in the half-open SA database, and it can limit the
number of concurrent half-open SAs from a particular address or
prefix.  The attacker can overcome this by using spoofed source
addresses.

The stateless cookie mechanism from Section 2.6 of [RFC7296] prevents
an attack with spoofed source addresses.  This doesn't completely
solve the issue, but it makes the limiting of half-open SAs by
address or prefix work.  Puzzles, introduced in Section 4.4,
accomplish the same thing, only to a greater degree.  They make it
harder for an attacker to reach the goal of getting a half-open SA.
They don't have to be so hard that an attacker can't afford to solve
a single puzzle; it's enough that they increase the cost of a
half-open SA for the attacker so that it can create only a few.

Reducing the amount of time an abandoned half-open SA is kept attacks
the issue from the other side.  It reduces the value the attacker
gets from managing to create a half-open SA.  For example, if a
half-open SA is kept for 1 minute and the capacity is 60,000 half-open
SAs, an attacker would need to create 1,000 half-open SAs per second.
Reduce the retention time to 3 seconds, and the attacker needs to
create 20,000 half-open SAs per second.
By introducing a puzzle, each half-open SA becomes more expensive for
an attacker, which makes an exhaustion attack against Responder
memory more likely to be thwarted.

At this point, filling up the half-open SA database is no longer the
most efficient DoS attack.  The attacker has two ways to do better:

1.  Go back to spoofed addresses and try to overwhelm the CPU that
    deals with generating cookies, or

2.  Take the attack to the next level by also sending an IKE_AUTH
    request.

The first approach does not seem to be something that can be dealt
with at the IKE level.  It's probably better left to Intrusion
Prevention System (IPS) technology.

On the other hand, sending an IKE_AUTH request is surprisingly cheap.
It requires a proper IKE header with the correct IKE SPIs, and it
requires a single Encrypted payload.  The content of the payload
might as well be junk.  The Responder has to perform the relatively
expensive key derivation, only to find that the MAC on the Encrypted
payload of the IKE_AUTH request fails verification.  Depending on the
Responder implementation, this can be repeated with the same
half-open SA.  Puzzles can make attacks of this sort more costly for
an attacker.  See Section 7.2 for details.

Here too, the number of half-open SAs that the attacker can achieve
is crucial, because each one allows the attacker to waste some CPU
time.  So making it hard to create many half-open SAs is important.

A strategy against DDoS has to rely on at least 4 components:

1.  Hardening the half-open SA database by reducing retention time.

2.  Hardening the half-open SA database by rate-limiting single IPs/
    prefixes.

3.  Guidance on what to do when an IKE_AUTH request fails to decrypt.

4.  Increasing the cost of a half-open SA up to what is tolerable for
    legitimate clients.

Puzzles have their place as part of #4.

4.  Defense Measures while IKE SA is being created

4.1.  Retention Periods for Half-Open SAs

As a UDP-based protocol, IKEv2 has to deal with packet loss through
retransmissions.  Section 2.4 of [RFC7296] recommends "that messages
be retransmitted at least a dozen times over a period of at least
several minutes before giving up".  Retransmission policies in
practice wait at least one or two seconds before retransmitting for
the first time.

Because of this, setting the timeout on a half-open SA too low will
cause it to expire whenever even one IKE_AUTH request packet is lost.
When not under attack, the half-open SA timeout SHOULD be set high
enough that the Initiator will have enough time to send multiple
retransmissions, minimizing the chance of transient network
congestion causing IKE failure.

When the system is under attack, as measured by the number of
half-open SAs, it makes sense to reduce this lifetime.  The Responder
should still allow enough time for the round-trip, enough time for
the Initiator to derive the D-H shared value, and enough time to
derive the IKE SA keys and to create the IKE_AUTH request.  Two
seconds is probably as low a value as can realistically be used.

It could make sense to assign a shorter value to half-open SAs
originating from IP addresses or prefixes that are considered suspect
because of multiple concurrent half-open SAs.

4.2.  Rate Limiting

Even with DDoS, the attacker has only a limited number of nodes
participating in the attack.  By limiting the number of half-open SAs
that are allowed to exist concurrently with each such node, the total
number of half-open SAs is capped, as is the total number of key
derivations that the Responder is forced to complete.

In IPv4 it makes sense to limit the number of half-open SAs based on
IP address.
Most IPv4 nodes are either directly attached to the Internet using a
routable address or are hidden behind a NAT device with a single IPv4
external address.  IPv6 networks are currently a rarity, so we can
only speculate on what their wide deployment will be like, but the
current thinking is that ISP customers will be assigned whole
subnets, so we don't expect the kind of NAT deployment that is common
in IPv4.  For this reason, it makes sense to use a 64-bit prefix as
the basis for rate limiting in IPv6.

The number of half-open SAs is easy to measure, but it is also
worthwhile to measure the number of failed IKE_AUTH exchanges.  If
possible, both factors should be taken into account when deciding
which IP address or prefix is considered suspicious.

There are two ways to rate-limit a peer address or prefix:

1.  Hard Limit - where the number of half-open SAs is capped, and any
    further IKE_SA_INIT requests are rejected.

2.  Soft Limit - where if a set number of half-open SAs exist for a
    particular address or prefix, any IKE_SA_INIT request will
    require solving a puzzle.

The advantage of the hard limit method is that it provides a hard cap
on the number of half-open SAs that the attacker is able to create.
The downside is that it allows the attacker to block IKE initiation
from small parts of the Internet.  For example, if a network service
provider or some establishment offers Internet connectivity to its
customers or employees through an IPv4 NAT device, a single malicious
customer can create enough half-open SAs to fill the quota for the
NAT device's external IP address.  Legitimate Initiators on the same
network will not be able to initiate IKE.

The advantage of a soft limit is that legitimate clients can always
connect.  The disadvantage is that an adversary with sufficient CPU
resources can still effectively DoS the Responder.
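
As an illustration of the per-address and per-prefix keying and of the
two limit types described in this section, here is a minimal Python
sketch.  It is not part of the protocol: the names rate_limit_key and
HalfOpenLimiter, the three-way "accept"/"puzzle"/"reject" result, and
the default limit values are all assumptions made for the example, and
combining a soft and a hard limit in one object is just one possible
arrangement.

```python
import ipaddress
from collections import Counter


def rate_limit_key(addr: str) -> str:
    """Map a peer address to its rate-limiting bucket.

    IPv4 peers are keyed by full address, IPv6 peers by /64 prefix.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return str(ip)
    return str(ipaddress.ip_network(f"{ip}/64", strict=False))


class HalfOpenLimiter:
    """Hypothetical per-bucket counter of half-open SAs."""

    def __init__(self, soft_limit: int = 5, hard_limit: int = 20):
        # Illustrative defaults only; real values are deployment-specific.
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.half_open = Counter()

    def decide(self, addr: str) -> str:
        """Return 'accept', 'puzzle' (soft limit) or 'reject' (hard limit)."""
        n = self.half_open[rate_limit_key(addr)]
        if n >= self.hard_limit:
            return "reject"
        if n >= self.soft_limit:
            return "puzzle"
        return "accept"

    def opened(self, addr: str) -> None:
        """Record that a half-open SA was created for this peer."""
        self.half_open[rate_limit_key(addr)] += 1

    def closed(self, addr: str) -> None:
        """Record that a half-open SA completed or expired."""
        key = rate_limit_key(addr)
        if self.half_open[key] > 0:
            self.half_open[key] -= 1
```

With this keying, two IPv6 hosts inside the same /64 share one bucket,
while two IPv4 hosts behind different external addresses do not.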

Regardless of the type of rate limiting used, blocking the DoS attack
through rate limiting has a significant advantage for legitimate
clients that are away from the attacking nodes.  In such cases,
adverse impacts caused by the attack or by the measures used to
counteract the attack can be avoided.

4.3.  The Stateless Cookie

Section 2.6 of [RFC7296] offers a mechanism to mitigate DoS attacks:
the stateless cookie.  When the server is under load, the Responder
responds to the IKE_SA_INIT request with a calculated "stateless
cookie" - a value that can be re-calculated based on values in the
IKE_SA_INIT request without storing Responder-side state.  The
Initiator is expected to repeat the IKE_SA_INIT request, this time
including the stateless cookie.

Attackers that have multiple source IP addresses with return
routability, such as in the case of bot-nets, can fill up a half-open
SA table anyway.  The cookie mechanism limits the amount of allocated
state to the size of the bot-net, multiplied by the number of
half-open SAs allowed per peer address, multiplied by the amount of
state allocated for each half-open SA.  With typical values this can
easily reach hundreds of megabytes.

4.4.  Puzzles

The puzzle introduced here extends the cookie mechanism from
[RFC7296].  It is loosely based on the proof-of-work technique used
in Bitcoin [bitcoins].  This sets an upper bound, determined by the
attacker's CPU, to the number of negotiations it can initiate in a
unit of time.

A puzzle is sent to the Initiator in two cases:

o  The Responder is so overloaded that no half-open SAs may be
   created without solving a puzzle, or

o  The Responder is not too loaded, but the rate-limiting method
   described in Section 4.2 prevents half-open SAs from being created
   with this particular peer address or prefix without first solving
   a puzzle.
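
The two cases above amount to a simple predicate.  The Python sketch
below is only an illustration: the function name, its parameters, and
the choice of the half-open SA count as the measure of Responder load
are assumptions made for the example, not something this document
mandates.

```python
def puzzle_required(total_half_open: int, overload_threshold: int,
                    peer_half_open: int, peer_soft_limit: int) -> bool:
    """Decide whether an IKE_SA_INIT request should be answered with a puzzle.

    All four parameters are hypothetical tunables / counters kept by
    the Responder.
    """
    # Case 1: the Responder as a whole is overloaded.
    responder_overloaded = total_half_open >= overload_threshold
    # Case 2: this peer address or prefix has reached its soft limit.
    peer_over_soft_limit = peer_half_open >= peer_soft_limit
    return responder_overloaded or peer_over_soft_limit
```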

When the Responder decides to send the challenge notification in
response to an IKE_SA_INIT request, the notification includes three
fields:

1.  Cookie - this is calculated the same as in [RFC7296], i.e. the
    process of generating the cookie is not specified.

2.  Algorithm - this is the identifier of a Pseudo-Random Function
    (PRF) algorithm, one of those proposed by the Initiator in the SA
    payload.

3.  Zero Bit Count (ZBC) - this is a number between 8 and 255 (or a
    special value - 0, see Section 7.1.1.1) that represents the
    length of the zero-bit run at the end of the output of the PRF
    calculated over the cookie that the Initiator is to send.  The
    values 1-7 are explicitly excluded, because they create a puzzle
    that is too easy to solve for it to make any difference in
    mitigating DDoS attacks.  Since the mechanism is supposed to be
    stateless for the Responder, either the same ZBC is used for all
    Initiators, or the ZBC is somehow encoded in the cookie.  If it
    is global then it means that this value is the same for all the
    Initiators who are receiving puzzles at any given point of time.
    The Responder, however, may change this value over time depending
    on its load.

Upon receiving this challenge, the Initiator attempts to calculate
the PRF using different keys.  When enough keys are found such that
the PRF output calculated using each of them has a sufficient number
of trailing zero bits, those keys are sent to the Responder.

The reason for using several keys in the results rather than just one
key is to reduce the variance in the time it takes the Initiator to
solve the puzzle.  We have chosen the number of keys to be four (4)
as a compromise between the conflicting goals of reducing variance
and reducing the work the Responder needs to perform to verify the
puzzle solution.
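
The search for keys described above can be sketched in Python as
follows.  This is a minimal illustration under stated assumptions, not
a reference implementation: HMAC-SHA256 from the standard library
stands in for whatever PRF was negotiated, the PRF is assumed to be
keyed with the candidate key and applied to the cookie, and the
function names, the fixed key length, the sequential key enumeration,
and the requirement that the four keys be distinct are all choices
made for the example.

```python
import hashlib
import hmac

KEYS_REQUIRED = 4  # number of keys per solution, as chosen in the text


def trailing_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the end of a PRF output."""
    n = int.from_bytes(digest, "big")
    if n == 0:
        return len(digest) * 8
    # n & -n isolates the lowest set bit; its index is the zero-run length.
    return (n & -n).bit_length() - 1


def prf(key: bytes, data: bytes) -> bytes:
    """Stand-in PRF: HMAC-SHA256 keyed with the candidate key."""
    return hmac.new(key, data, hashlib.sha256).digest()


def solve_puzzle(cookie: bytes, zbc: int, key_len: int = 4) -> list:
    """Initiator side: try keys in order until four of them work."""
    keys = []
    for i in range(256 ** key_len):
        key = i.to_bytes(key_len, "big")
        if trailing_zero_bits(prf(key, cookie)) >= zbc:
            keys.append(key)
            if len(keys) == KEYS_REQUIRED:
                return keys
    raise ValueError("key space exhausted before four keys were found")


def verify_solution(cookie: bytes, zbc: int, keys: list) -> bool:
    """Responder side: four PRF invocations, regardless of difficulty."""
    if len(keys) != KEYS_REQUIRED or len(set(keys)) != KEYS_REQUIRED:
        return False
    return all(trailing_zero_bits(prf(k, cookie)) >= zbc for k in keys)
```

The asymmetry is the point of the design: the expected cost of
solve_puzzle roughly doubles with each additional zero bit, while
verify_solution always costs four PRF invocations.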

When receiving a request with a solved puzzle, the Responder verifies
two things:

o  That the cookie part is indeed valid.

o  That the PRFs of the transmitted cookie calculated with the
   transmitted keys have a sufficient number of trailing zero bits.

Example 1: Suppose the calculated cookie is
739ae7492d8a810cf5e8dc0f9626c9dda773c5a3 (20 octets), the algorithm
is PRF-HMAC-SHA256, and the required number of zero bits is 18.
After successively trying a bunch of keys, the Initiator finds the
following four 3-octet keys that work:

+--------+----------------------------------+----------+
| Key    | Last 32 Hex PRF Digits           | # 0-bits |
+--------+----------------------------------+----------+
| 061840 | e4f957b859d7fb1343b7b94a816c0000 | 18       |
| 073324 | 0d4233d6278c96e3369227a075800000 | 23       |
| 0c8a2a | 952a35d39d5ba06709da43af40700000 | 20       |
| 0d94c8 | 5a0452b21571e401a3d00803679c0000 | 18       |
+--------+----------------------------------+----------+

          Table 1: Four solutions for 18-bit puzzle

Example 2: Same cookie, but modify the required number of zero bits
to 22.  The first 4-octet keys that work to satisfy that requirement
are 005d9e57, 010d8959, 0110778d, and 01187e37.  Finding these
requires 18,382,392 invocations of the PRF.

+----------+-------------------------------+
| # 0-bits | Time to Find 4 keys (seconds) |
+----------+-------------------------------+
| 8        | 0.0025                        |
| 10       | 0.0078                        |
| 12       | 0.0530                        |
| 14       | 0.2521                        |
| 16       | 0.8504                        |
| 17       | 1.5938                        |
| 18       | 3.3842                        |
| 19       | 3.8592                        |
| 20       | 10.8876                       |
+----------+-------------------------------+

Table 2: The time needed to solve a puzzle of various difficulty for
the cookie = 739ae7492d8a810cf5e8dc0f9626c9dda773c5a3

The figures above were obtained on a 2.4 GHz single core i5.
Run times can be halved or quartered with multi-core code, but would
be longer on mobile phone processors, even if those are multi-core as
well.  With these figures, 18 bits is believed to be a reasonable
choice for puzzle level difficulty for all Initiators, and 20 bits is
acceptable for specific hosts/prefixes.

Using the puzzle mechanism in the IKE_SA_INIT exchange is described
in Section 7.1.

4.5.  Session Resumption

When the Responder is under attack, it MAY choose to prefer
previously authenticated peers who present a Session Resumption
ticket (see [RFC5723] for details).  The Responder MAY require such
Initiators to pass a return routability check by including the COOKIE
notification in the IKE_SESSION_RESUME response message, as allowed
by Section 4.3.2 of [RFC5723].  Note that the Responder SHOULD cache
tickets for a short time to reject reused tickets (Section 4.3.1 of
[RFC5723]), and therefore there should be no issue of half-open SAs
resulting from replayed IKE_SESSION_RESUME messages.

4.6.  Keeping computed Shared Keys

Once the IKE_SA_INIT exchange is finished, the Responder is waiting
for the first message of the IKE_AUTH exchange from the Initiator.
At this point the Initiator is not yet authenticated, and this fact
allows a malicious peer to perform the attack described in Section 3:
it can just send garbage in the IKE_AUTH message, thus forcing the
Responder to perform CPU-costly operations to compute the SK_* keys.

If the received IKE_AUTH message fails to decrypt correctly (or fails
to pass the ICV check), then the Responder SHOULD still keep the
computed SK_* keys, so that, if this was an attack, the malicious
Initiator gains no advantage from repeating the attack multiple times
on a single IKE SA.  The Responder can also use puzzles in the
IKE_AUTH exchange as described in Section 7.2.

4.7.  Preventing Attacks using "Hash and URL" Certificate Encoding

In IKEv2 each side may use "Hash and URL" Certificate Encoding to
instruct the peer to retrieve certificates from the specified
location (see Section 3.6 of [RFC7296] for details).  Malicious
Initiators can use this feature to mount a DoS attack on the
Responder by providing a URL pointing to a large file possibly
containing garbage.  While downloading the file the Responder
consumes CPU, memory and network bandwidth.

To prevent this kind of attack, the Responder should not blindly
download the whole file.  Instead it SHOULD first read the initial
few bytes, decode the length of the ASN.1 structure from these bytes,
and then download no more than the decoded number of bytes.  Note
that it is always possible to determine the length of ASN.1
structures used in IKEv2 by analyzing the first few bytes, if they
are DER-encoded.  Implementations SHOULD also apply a configurable
hard limit to the number of pulled bytes and SHOULD provide an
ability for an administrator to either completely disable this
feature or to limit its use to a configurable list of trusted URLs.

5.  Defense Measures after IKE SA is created

Once an IKE SA is created there is usually not much traffic over it.
In most cases this traffic consists of exchanges aimed to create
additional Child SAs, rekey, or delete them and check the liveness of
the peer.  With a typical setup and typical Child SA lifetimes, there
are typically no more than a few such exchanges, often fewer.  Some of
these exchanges require relatively few resources (like a liveness
check), while others may be resource consuming (like creating or
rekeying a Child SA with a D-H exchange).

Since any endpoint can initiate a new exchange, there is a
possibility that a peer would initiate too many exchanges that could
exhaust host resources.
For example, the peer can perform endless continuous Child SA
rekeying or create an overwhelming number of Child SAs with the same
Traffic Selectors etc.  Such behavior may be caused by a buggy
implementation or misconfiguration, or it may be intentional.  The
latter becomes more of a real threat if the peer uses NULL
Authentication, described in [RFC7619].  In this case the peer
remains anonymous, allowing it to escape any responsibility for its
actions.

The following recommendations for defense against possible DoS
attacks after the IKE SA is established are mostly intended for
implementations that allow unauthenticated IKE sessions; however,
they may also be useful in other cases.

o  If the IKEv2 window size is greater than one, then the peer could
   initiate multiple simultaneous exchanges that could increase host
   resource consumption.  Since currently there is no way in IKEv2 to
   decrease the window size once it was increased (see Section 2.3 of
   [RFC7296]), the window size cannot be dynamically adjusted
   depending on the load.  For that reason, it is NOT RECOMMENDED to
   ever increase the IKEv2 window size above its default value of one
   if the peer uses NULL Authentication.

o  If the peer initiates requests to rekey an IKE SA or Child SA too
   often, implementations can respond to some of these requests with
   the TEMPORARY_FAILURE notification, indicating that the request
   should be retried after some period of time.

o  If the peer creates too many Child SAs with the same or
   overlapping Traffic Selectors, implementations can respond with
   the NO_ADDITIONAL_SAS notification.

o  If the peer initiates too many exchanges of any kind,
   implementations can introduce an artificial delay before
   responding to request messages.  This delay would decrease the
   rate at which the implementation needs to process requests from
   any particular peer, making it possible to process requests from
   the others.
   The delay should not be too long, to avoid causing the IKE SA to
   be deleted on the other end due to timeout.  It is believed that a
   few seconds is enough.  Note that if the Responder receives
   retransmissions of the request message during the delay period,
   the retransmitted messages should be silently discarded.

o  If these counter-measures are ineffective, implementations can
   delete the IKE SA with an offending peer by sending a Delete
   payload.

6.  Plan for Defending a Responder

This section outlines a plan for defending a Responder from a DDoS
attack based on the techniques described earlier.  The numbers given
here are not normative, and their purpose is to illustrate the
configurable parameters needed for defeating the DDoS attack.

Implementations may be deployed in different environments, so it is
RECOMMENDED that the parameters be settable.  As an example, most
commercial products are required to undergo benchmarking where the
IKE SA establishment rate is measured.  Benchmarking is
indistinguishable from a DoS attack and the defenses described in
this document may defeat the benchmark by causing exchanges to fail
or take a long time to complete.  Parameters should be tunable to
allow for benchmarking (if only by turning DDoS protection off).

Since all countermeasures may cause delays and work on the
Initiators, they SHOULD NOT be deployed unless an attack is likely to
be in progress.  To minimize the burden imposed on Initiators, the
Responder should monitor incoming IKE requests, searching for two
things:

1.  A general DDoS attack.  Such an attack is indicated by a high
    number of concurrent half-open SAs, a high rate of failed
    IKE_AUTH exchanges, or a combination of both.  For example,
    consider a Responder that has 10,000 distinct peers of which at
    peak 7,500 concurrently have VPN tunnels.
    At the start of peak time, 600 peers might establish tunnels at
    any given minute, and tunnel establishment (both IKE_SA_INIT and
    IKE_AUTH) takes anywhere from 0.5 to 2 seconds.  For this
    Responder, we expect there to be fewer than 20 concurrent
    half-open SAs, so having 100 concurrent half-open SAs can be
    interpreted as an indication of an attack.  Similarly, IKE_AUTH
    request decryption failures should never happen.  Supposing the
    tunnels are established using EAP (see Section 2.16 of
    [RFC7296]), users enter the wrong password about 20% of the time.
    So we'd expect 125 wrong password failures a minute.  If we get
    IKE_AUTH decryption failures from multiple sources more than once
    per second, or EAP failures more than 300 times per minute, that
    can also be an indication of a DDoS attack.

2.  An attack from a particular IP address or prefix.  Such an attack
    is indicated by an inordinate number of half-open SAs from that
    IP address or prefix, or an inordinate number of IKE_AUTH
    failures.  A DDoS attack may be viewed as multiple such attacks.
    If they are mitigated well enough, there will not be a need to
    enact countermeasures on all Initiators.  Typical thresholds
    might be 5 concurrent half-open SAs, 1 decrypt failure, or 10 EAP
    failures within a minute.

Note that using counter-measures against an attack from a particular
IP address may be enough to avoid the overload on the half-open SA
database, and in this case the number of failed IKE_AUTH exchanges
never exceeds the threshold of attack detection.  This is a good
thing as it prevents Initiators that are not close to the attackers
from being affected.

When there is no general DDoS attack, it is suggested that no cookie
or puzzles be used.  At this point the only defensive measure is the
monitoring of the number of half-open SAs, and setting a soft limit
per peer IP or prefix.
The soft limit can be set to 3-5, and the puzzle difficulty should be set to such a level (number of zero bits) that all legitimate clients can handle it without degraded user experience.

As soon as any kind of attack is detected, either a lot of initiations from multiple sources or a lot of initiations from a few sources, it is best to begin by requiring stateless cookies from all Initiators. This will force the attacker to use real source addresses, and help avoid the need to impose a greater burden in the form of puzzles on the general population of Initiators. This makes the per-node or per-prefix soft limit more effective.

When cookies are activated for all requests and the attacker is still managing to consume too many resources, the Responder MAY increase the difficulty of puzzles imposed on IKE_SA_INIT requests coming from suspicious nodes/prefixes. The puzzle should still be solvable by all legitimate peers, but it can degrade experience, for example by taking up to 10 seconds to solve.

If the load on the Responder is still too great, and there are many nodes causing multiple half-open SAs or IKE_AUTH failures, the Responder MAY impose hard limits on those nodes.

If it turns out that the attack is very widespread and the hard caps are not solving the issue, a puzzle MAY be imposed on all Initiators. Note that this is the last step, and the Responder should avoid this if possible.

7.  Using Puzzles in the Protocol

This section describes how the puzzle mechanism is used in IKEv2. It is organized as follows. Section 7.1 describes using puzzles in the IKE_SA_INIT exchange and Section 7.2 describes using puzzles in the IKE_AUTH exchange. Both sections are divided into subsections describing how puzzles should be presented, solved and processed by the Initiator and the Responder.

7.1.  Puzzles in IKE_SA_INIT Exchange

The IKE Initiator indicates the desire to create a new IKE SA by sending an IKE_SA_INIT request message. The message may optionally contain a COOKIE notification if this is a repeated request performed after the Responder's demand to return a cookie.

   HDR, [N(COOKIE),] SA, KE, Ni, [V+][N+] -->

According to the plan described in Section 6, the IKE Responder should monitor incoming requests to detect whether it is under attack. If the Responder learns that a (D)DoS attack is likely to be in progress, then its actions depend on the volume of the attack. If the volume is moderate, then the Responder requests the Initiator to return a cookie. If the volume is so high that puzzles need to be used for defense, then the Responder requests the Initiator to solve a puzzle.

The Responder MAY choose to process some fraction of IKE_SA_INIT requests without presenting a puzzle while being under attack, to allow legacy clients that don't support puzzles to have a chance to be served. The decision whether to process any particular request must be probabilistic, with the probability depending on the Responder's load (i.e. on the volume of the attack). Requests that don't contain the COOKIE notification MUST NOT participate in this lottery. In other words, the Responder must first perform a return routability check before allowing any legacy client to be served if it is under attack. See Section 7.1.4 for details.

7.1.1.  Presenting Puzzle

If the Responder makes a decision to use puzzles, then it MUST include two notifications in its response message: the COOKIE notification and the PUZZLE notification. The format of the PUZZLE notification is described in Section 8.1.
   <-- HDR, N(COOKIE), N(PUZZLE), [V+][N+]

The presence of these notifications in an IKE_SA_INIT response message indicates to the Initiator that it should solve the puzzle to have a better chance of being served.

7.1.1.1.  Selecting Puzzle Difficulty Level

The PUZZLE notification contains the difficulty level of the puzzle: the minimum number of trailing zero bits that the result of the PRF must contain. In diverse environments it is next to impossible for the Responder to set any specific difficulty level that will result in roughly the same amount of work for all Initiators, because the computational power of different Initiators may vary by an order of magnitude or even more. The Responder may set the difficulty level to 0, meaning that the Initiator is requested to spend as much power to solve the puzzle as it can afford. In this case no specific value of ZBC is required from the Initiator; however, the larger the ZBC that the Initiator is able to get, the better its chances of being served by the Responder. In diverse environments it is RECOMMENDED that the Responder set the difficulty level to 0, unless the attack volume is very high.

If the Responder sets a non-zero difficulty level, then the level should be determined by analyzing the volume of the attack. The Responder MAY set different difficulty levels for different requests depending on the IP address the request has come from.

7.1.1.2.  Selecting Puzzle Algorithm

The PUZZLE notification also contains the identifier of the algorithm that must be used by the Initiator to compute the puzzle.

Cryptographic algorithm agility is considered an important feature for modern protocols ([RFC7696]). This feature ensures that a protocol doesn't rely on a single built-in set of cryptographic algorithms, but has a means to replace one set with another and negotiate the new set with the peer.
IKEv2 fully supports cryptographic algorithm agility for its core operations.

To support this feature in the case of puzzles, the algorithm that is used to compute the puzzle needs to be negotiated during the IKE_SA_INIT exchange. The negotiation is performed as follows. The initial request message sent by the Initiator contains an SA payload with the list of transforms the Initiator supports and is willing to use in the IKE SA being established. The Responder parses the received SA payload and finds the mutually supported set of transforms of type PRF. It selects the most preferred transform from this set and includes it in the PUZZLE notification. There is no requirement that the PRF selected for puzzles be the same as the PRF that is negotiated later for use in core IKE SA crypto operations. If there are no mutually supported PRFs, then negotiation will fail anyway and there is no reason to return a puzzle. In this case the Responder returns a NO_PROPOSAL_CHOSEN notification. Note that PRF is a mandatory transform type for IKE SA (see Sections 3.3.2 and 3.3.3 of [RFC7296]) and at least one transform of this type must always be present in the SA payload in an IKE_SA_INIT request message.

7.1.1.3.  Generating Cookie

If the Responder supports puzzles, then the cookie should be computed in such a manner that the Responder is able to learn some important information from the cookie alone when it is later returned by the Initiator. In particular, the Responder should be able to learn the following information:

o  Whether the puzzle was given to the Initiator or only the cookie was requested.

o  The difficulty level of the puzzle given to the Initiator.

o  The number of consecutive puzzles given to the Initiator.

o  The amount of time the Initiator spent to solve the puzzles. This can be calculated if the cookie is timestamped.
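The four items above could, for instance, be packed into the cookie itself under a keyed hash. The sketch below shows one hypothetical layout and is not a format defined by this document: the field widths, the use of HMAC-SHA-256 in place of a plain hash over the secret, the omission of a secret version identifier, and the names make_cookie and parse_cookie are all assumptions made for illustration.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional, Tuple

# Hypothetical metadata layout: puzzle flag, difficulty, puzzle count, timestamp.
META = struct.Struct("!BBBI")
TAG_LEN = hashlib.sha256().digest_size


def make_cookie(secret: bytes, ni: bytes, ip_i: bytes, spi_i: bytes,
                puzzle_given: bool, difficulty: int, puzzles_so_far: int,
                now: Optional[int] = None) -> bytes:
    """Build a cookie that carries its own integrity-protected metadata."""
    issued = int(time.time()) if now is None else now
    meta = META.pack(int(puzzle_given), difficulty, puzzles_so_far, issued)
    # Assumption: HMAC keyed with the secret stands in for a hash over the secret.
    tag = hmac.new(secret, ni + ip_i + spi_i + meta, hashlib.sha256).digest()
    return meta + tag


def parse_cookie(secret: bytes, cookie: bytes, ni: bytes, ip_i: bytes,
                 spi_i: bytes) -> Optional[Tuple[bool, int, int, int]]:
    """Return (puzzle_given, difficulty, puzzles_so_far, issued), or None if invalid."""
    if len(cookie) != META.size + TAG_LEN:
        return None
    meta, tag = cookie[:META.size], cookie[META.size:]
    expected = hmac.new(secret, ni + ip_i + spi_i + meta, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # metadata or binding to Ni/IPi/SPIi was altered
    given, difficulty, count, issued = META.unpack(meta)
    return bool(given), difficulty, count, issued
```

Because the metadata is covered by the tag, a Responder using a layout like this would not need per-Initiator state to recover the puzzle history; the time spent by the Initiator is the difference between the arrival time and the recovered timestamp.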
This information helps the Responder to make a decision whether to serve this request or demand more work from the Initiator.

One possible approach to get this information is to encode it in the cookie. The format of such encoding is a local matter of the Responder, as the cookie remains an opaque blob to the Initiator. If this information is encoded in the cookie, then the Responder MUST make it integrity protected, so that any intended or accidental alteration of this information in the returned cookie is detectable. So, the cookie would be generated as:

   Cookie = <VersionIDofSecret> | <AdditionalInfo> |
                Hash(Ni | IPi | SPIi | <AdditionalInfo> | <secret>)

Alternatively, the Responder may continue to generate the cookie as suggested in Section 2.6 of [RFC7296], but associate the additional information, which would be stored locally, with the particular version of the secret. In this case the Responder should have different secrets for every combination of difficulty level and number of consecutive puzzles, and should change the secrets periodically, keeping a few previous versions, to be able to calculate how long ago the cookie was generated.

The Responder may also combine these approaches. This document doesn't mandate how the Responder learns this information from the cookie.

7.1.2.  Solving Puzzle and Returning the Solution

If the Initiator receives a puzzle but doesn't support puzzles, then it will ignore the PUZZLE notification as an unrecognized status notification (in accordance with Section 3.10.1 of [RFC7296]). The Initiator also MAY ignore the PUZZLE notification if it is not willing to spend resources to solve a puzzle of the requested difficulty, even if it supports puzzles. In both cases the Initiator acts as described in Section 2.6 of [RFC7296]: it restarts the request and includes the received COOKIE notification in it.
The Responder should be able to distinguish the situation when it just requested a cookie from the situation when the puzzle was given to the Initiator, but the Initiator for some reason ignored it.

If the received message contains a PUZZLE notification and doesn't contain a COOKIE notification, then this message is malformed because it requests to solve the puzzle, but doesn't provide enough information to do it. In this case the Initiator MUST ignore the received message and continue to wait until either a valid one is received or the retransmission timer fires. If it fails to receive a valid message after several retransmissions of the IKE_SA_INIT request, then something is wrong and the IKE SA cannot be established.

If the Initiator supports puzzles and is ready to deal with them, then it tries to solve the given puzzle. After the puzzle is solved, the Initiator restarts the request and returns the puzzle solution in a new payload called a Puzzle Solution payload (denoted as PS, see Section 8.2) along with the received COOKIE notification back to the Responder.

   HDR, N(COOKIE), [PS,] SA, KE, Ni, [V+][N+] -->

7.1.3.  Computing Puzzle

General principles of constructing puzzles in IKEv2 are described in Section 4.4. They can be summarized as follows: given an unpredictable string S and a pseudo-random function PRF, find N different keys Ki (where i=[1..N]) for that PRF so that the result of PRF(Ki,S) has at least the specified number of trailing zero bits. This specification requires that the solution to the puzzle contains 4 different keys (i.e. N=4).

In the IKE_SA_INIT exchange it is the cookie that plays the role of unpredictable string S.
In other words, in IKE_SA_INIT the task for the IKE Initiator is to find four different, equal-sized keys Ki for the agreed-upon PRF such that each result of PRF(Ki,cookie), where i = [1..4], has a sufficient number of trailing zero bits. Only the content of the COOKIE notification is used in the puzzle calculation, i.e. the header of the Notification payload is not included.

Note that puzzles in the IKE_AUTH exchange are computed differently than in the IKE_SA_INIT exchange. See Section 7.2.3 for details.

7.1.4.  Analyzing Repeated Request

The received request must at least contain a COOKIE notification. Otherwise it is an initial request and it must be processed according to Section 7.1. First, the cookie MUST be checked for validity. If the cookie is invalid, then the request is treated as initial and is processed according to Section 7.1. It is RECOMMENDED that a new cookie be requested in this case.

If the cookie is valid, then some important information is learned from it or from local state based on the identifier of the cookie's secret (see Section 7.1.1.3 for details). This information helps the Responder to sort incoming requests, giving more priority to those that were created by spending more of the Initiator's resources.

First, the Responder determines if it requested only a cookie, or presented a puzzle to the Initiator. If no puzzle was given, then at the time the Responder requested a cookie it didn't detect the (D)DoS attack or the attack volume was low. In this case the received request message must not contain the PS payload, and this payload MUST be ignored if for any reason the message contains it. Since no puzzle was given, the Responder marks the request with the lowest priority, since the Initiator spent few resources creating it.
If the Responder learns from the cookie that the puzzle was given to the Initiator, then it looks for the PS payload to determine whether its request to solve the puzzle was honored or not. If the incoming message doesn't contain a PS payload, then the Initiator either doesn't support puzzles or doesn't want to deal with them. In either case the request is marked with the lowest priority, since the Initiator spent few resources creating it.

If a PS payload is found in the message, then the Responder MUST verify the puzzle solution that it contains. The solution is interpreted as four different keys. The result of using each of them in the PRF (as described in Section 7.1.3) must contain at least the requested number of trailing zero bits. The Responder MUST check all four returned keys.

If any checked result contains fewer trailing zero bits than were requested, the Initiator spent fewer resources than expected by the Responder. This request is marked with the lowest priority.

If the Initiator provided a solution to the puzzle satisfying the requested difficulty level, or if the Responder didn't indicate any particular difficulty level (by setting ZBC to zero) and the Initiator was free to select any difficulty level it can afford, then the priority of the request is calculated based on the following considerations:

o  The Responder must take the smallest number of trailing zero bits among the checked results and count it as the number of zero bits the Initiator got.

o  The higher the number of zero bits the Initiator got, the higher priority its request should receive.

o  The more consecutive puzzles the Initiator solved, the higher priority it should receive.

o  The more time the Initiator spent solving the puzzles, the higher priority it should receive.
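The verification and counting rules above, together with the puzzle defined in Section 7.1.3, can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: HMAC-SHA-256 stands in for whatever PRF the PUZZLE notification names, "trailing" is taken to mean the least significant bits of the PRF output read as a big-endian integer, and the function names (prf, trailing_zero_bits, solve, achieved_zbc) are hypothetical.

```python
import hashlib
import hmac
from typing import List, Optional


def prf(key: bytes, data: bytes) -> bytes:
    """Assumed PRF: HMAC-SHA-256 keyed with the candidate key."""
    return hmac.new(key, data, hashlib.sha256).digest()


def trailing_zero_bits(output: bytes) -> int:
    """Count zero bits at the least significant end of the PRF output."""
    value = int.from_bytes(output, "big")
    if value == 0:
        return 8 * len(output)
    return (value & -value).bit_length() - 1


def solve(s: bytes, difficulty: int, key_len: int = 4) -> List[bytes]:
    """Initiator side: brute-force four distinct keys of equal size."""
    keys: List[bytes] = []
    for candidate in range(256 ** key_len):
        key = candidate.to_bytes(key_len, "big")
        if trailing_zero_bits(prf(key, s)) >= difficulty:
            keys.append(key)
            if len(keys) == 4:
                return keys
    raise ValueError("key space exhausted; a longer key is needed")


def achieved_zbc(s: bytes, solution: bytes) -> Optional[int]:
    """Responder side: smallest ZBC over all four keys, or None if malformed."""
    if not solution or len(solution) % 4 != 0:
        return None
    size = len(solution) // 4
    keys = [solution[i:i + size] for i in range(0, len(solution), size)]
    if len(set(keys)) != 4:
        return None  # the four keys must be different
    return min(trailing_zero_bits(prf(k, s)) for k in keys)
```

A Responder following this sketch would compare the value returned by achieved_zbc against the requested difficulty, mark the request with the lowest priority if it falls short, and otherwise feed it into the priority calculation along with the puzzle count and elapsed time. How those three inputs are weighted is left to the implementation.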
After the priority of the request is determined, the final decision whether to serve it or not is made.

7.1.5.  Making Decision whether to Serve the Request

The Responder decides what to do with the request based on its priority and the Responder's current load. There are three possible actions:

o  Accept request.

o  Reject request.

o  Demand more work from the Initiator by giving it a new puzzle.

The Responder SHOULD accept an incoming request if its priority is high, which means that the Initiator spent quite a lot of resources. The Responder MAY also accept some low-priority requests where the Initiators don't support puzzles. The percentage of accepted legacy requests depends on the Responder's current load.

If the Initiator solved the puzzle, but didn't spend many resources on it (the selected puzzle difficulty level appeared to be low and the Initiator solved it quickly), then the Responder SHOULD give it another puzzle. The more puzzles the Initiator solves, the higher its chances are to be served.

The details of how the Responder makes a decision for any particular request are implementation dependent. The Responder can collect all the incoming requests for some short period of time, sort them based on their priority, calculate the number of available memory slots for half-open IKE SAs and then serve that number of requests from the head of the sorted list. The rest of the requests can be either discarded or responded to with new puzzles.

Alternatively, the Responder may decide whether to accept every incoming request with some kind of lottery, taking into account its priority and the available resources.

7.2.  Puzzles in IKE_AUTH Exchange

Once the IKE_SA_INIT exchange is completed, the Responder has created a state and is waiting for the first message of the IKE_AUTH exchange from the Initiator.
At this point the Initiator has already passed the return routability check and has proved that it has performed some work to complete the IKE_SA_INIT exchange. However, the Initiator is not yet authenticated, and this fact allows a malicious Initiator to perform the attack described in Section 3. Unlike a DoS attack in the IKE_SA_INIT exchange, which targets the Responder's memory resources, the goal of this attack is to exhaust the Responder's CPU power. The attack is performed by sending the first IKE_AUTH message containing garbage. This costs nothing to the Initiator, but the Responder has to do the relatively costly operations of computing the D-H shared secret and deriving SK_* keys to be able to verify the authenticity of the message. If the Responder doesn't keep the computed keys after an unsuccessful verification of the IKE_AUTH message, then the attack can be repeated several times on the same IKE SA.

The Responder can use puzzles to make this attack more costly for the Initiator. The idea is that the Responder includes a puzzle in the IKE_SA_INIT response message and the Initiator includes a puzzle solution in the first IKE_AUTH request message outside the Encrypted payload, so that the Responder is able to verify the puzzle solution before computing the D-H shared secret. The difficulty level of the puzzle should be selected so that the Initiator would spend substantially more time to solve the puzzle than the Responder to compute the shared secret.

The Responder should constantly monitor the number of half-open IKE SA states that receive IKE_AUTH messages that cannot be decrypted due to integrity check failures. If the percentage of such states is high and it takes a substantial fraction of the Responder's computing power to calculate keys for them, then the Responder may assume that it is under attack and SHOULD use puzzles to make it harder for attackers.
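The monitoring rule in the preceding paragraph might be expressed as a small predicate such as the one below. It is a sketch only: the document gives no numeric thresholds, so the default limits, the parameter names, and the function name should_use_auth_puzzles are hypothetical and would need to be tuned per deployment.

```python
def should_use_auth_puzzles(half_open_total: int,
                            half_open_with_failures: int,
                            key_calc_cpu_share: float,
                            max_failure_ratio: float = 0.2,   # hypothetical threshold
                            max_cpu_share: float = 0.1) -> bool:  # hypothetical threshold
    """Decide whether IKE_AUTH puzzles should be switched on.

    half_open_with_failures counts half-open IKE SAs that received at least
    one IKE_AUTH message failing the integrity check; key_calc_cpu_share is
    the fraction of CPU time spent deriving keys for those SAs.
    """
    if half_open_total <= 0:
        return False
    failure_ratio = half_open_with_failures / half_open_total
    # Both conditions from the text must hold: many failing states AND
    # a substantial share of computing power spent on them.
    return failure_ratio > max_failure_ratio and key_calc_cpu_share > max_cpu_share
```

For example, with 100 half-open SAs of which 50 saw decryption failures and 30% of CPU time going to key derivation, the predicate returns True under these assumed defaults; with only 5 failing SAs it returns False.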
7.2.1.  Presenting Puzzle

The Responder requests the Initiator to solve a puzzle by including the PUZZLE notification in the IKE_SA_INIT response message. The Responder MUST NOT use puzzles in the IKE_AUTH exchange unless the puzzle has been previously presented and solved in the preceding IKE_SA_INIT exchange.

   <-- HDR, SA, KE, Nr, N(PUZZLE), [V+][N+]

7.2.1.1.  Selecting Puzzle Difficulty Level

The difficulty level of the puzzle in the IKE_AUTH exchange should be chosen so that the Initiator would spend more time to solve the puzzle than the Responder to compute the D-H shared secret and the keys needed to decrypt and verify the IKE_AUTH request message. On the other hand, the difficulty level should not be too high, otherwise legitimate clients would experience an additional delay while establishing the IKE SA.

Note that since puzzles in the IKE_AUTH exchange are only allowed to be used if they were used in the preceding IKE_SA_INIT exchange, the Responder would be able to estimate the computational power of the Initiator and to select the difficulty level accordingly. Unlike puzzles in IKE_SA_INIT, the requested difficulty level for IKE_AUTH puzzles MUST NOT be zero. In other words, the Responder must always set a specific difficulty level and must not let the Initiator choose it on its own.

7.2.1.2.  Selecting Puzzle Algorithm

The algorithm for the puzzle is selected as described in Section 7.1.1.2. There is no requirement that the algorithm for the puzzle in the IKE_SA_INIT exchange be the same as the algorithm for the puzzle in the IKE_AUTH exchange; however, it is expected that in most cases they will be the same.

7.2.2.  Solving Puzzle and Returning the Solution

If the IKE_SA_INIT response message contains the PUZZLE notification and the Initiator supports puzzles, it MUST solve the puzzle.
Note that puzzle construction in the IKE_AUTH exchange differs from the puzzle construction in the IKE_SA_INIT exchange and is described in Section 7.2.3. Once the puzzle is solved, the Initiator sends the IKE_AUTH request message containing the Puzzle Solution payload.

   HDR, PS, SK {IDi, [CERT,] [CERTREQ,]
               [IDr,] AUTH, SA, TSi, TSr}   -->

The Puzzle Solution payload MUST be placed outside the Encrypted payload, so that the Responder would be able to verify the puzzle before calculating the D-H shared secret and the SK_* keys.

If IKE Fragmentation [RFC7383] is used in the IKE_AUTH exchange, then the PS payload MUST be present only in the first IKE Fragment message, in accordance with Section 2.5.3 of [RFC7383]. Note that calculation of the puzzle in the IKE_AUTH exchange doesn't depend on the content of the IKE_AUTH message (see Section 7.2.3). Thus the Initiator has to solve the puzzle only once and the solution is valid for both unfragmented and fragmented IKE messages.

7.2.3.  Computing Puzzle

The puzzles in the IKE_AUTH exchange are computed differently than in the IKE_SA_INIT exchange (see Section 7.1.3). The general principle is the same; the difference is in the construction of the string S. Unlike the IKE_SA_INIT exchange, where S is the cookie, in the IKE_AUTH exchange S is a concatenation of Nr and SPIr. In other words, the task for the IKE Initiator is to find four different keys Ki for the agreed-upon PRF such that each result of PRF(Ki,Nr | SPIr), where i=[1..4], has a sufficient number of trailing zero bits. Nr is the nonce used by the Responder in the IKE_SA_INIT exchange, stripped of any headers. SPIr is the IKE Responder's SPI from the IKE header of the SA being established.

7.2.4.  Receiving Puzzle Solution

If the Responder requested the Initiator to solve a puzzle in the IKE_AUTH exchange, then it MUST silently discard all IKE_AUTH request messages without the Puzzle Solution payload.

Once the message containing a solution to the puzzle is received, the Responder MUST verify the solution before performing computationally intensive operations, i.e. computing the D-H shared secret and the SK_* keys. The Responder MUST verify all four returned keys.

The Responder MUST silently discard the received message if any checked verification result is not correct (contains an insufficient number of trailing zero bits). If the Responder successfully verifies the puzzle and calculates the SK_* keys, but the message authenticity check fails, then it SHOULD save the calculated keys in the IKE SA state while waiting for retransmissions from the Initiator. In this case the Responder may skip verification of the puzzle solution and ignore the Puzzle Solution payload in the retransmitted messages.

If the Initiator uses IKE Fragmentation, then it is possible that, due to packet loss and/or reordering, the Responder could receive non-first IKE Fragment messages before receiving the first one, which contains the PS payload. In this case the Responder MAY choose to keep the received fragments until the first fragment containing the solution to the puzzle is received. However, in this case the Responder SHOULD NOT try to verify the authenticity of the kept fragments until the first fragment with the PS payload is received and the solution to the puzzle is verified. After successful verification of the puzzle, the Responder can calculate the SK_* keys and verify the authenticity of the collected fragments.

8.  Payload Formats

8.1.  PUZZLE Notification

The PUZZLE notification is used by the IKE Responder to inform the Initiator about the necessity to solve the puzzle. It contains the difficulty level of the puzzle and the PRF the Initiator should use.

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Next Payload  |C|  RESERVED   |         Payload Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Protocol ID(=0)| SPI Size (=0) |      Notify Message Type      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              PRF              |  Difficulty   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

o  Protocol ID (1 octet) -- MUST be 0.

o  SPI Size (1 octet) -- MUST be 0, meaning no Security Parameter Index (SPI) is present.

o  Notify Message Type (2 octets) -- MUST be <TBA by IANA>, the value assigned for the PUZZLE notification.

o  PRF (2 octets) -- Transform ID of the PRF algorithm that must be used to solve the puzzle. Readers should refer to the section "Transform Type 2 - Pseudo-Random Function Transform IDs" in [IKEV2-IANA] for the list of possible values.

o  Difficulty (1 octet) -- Difficulty Level of the puzzle. Specifies the minimum number of trailing zero bits (ZBC) that each of the results of the PRF must contain. Value 0 means that the Responder doesn't request any specific difficulty level and the Initiator is free to select an appropriate difficulty level on its own (see Section 7.1.1.1 for details).

This notification contains no data.

8.2.  Puzzle Solution Payload

The solution to the puzzle is returned to the Responder in a dedicated payload, called the Puzzle Solution payload and denoted as PS in this document.
                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Next Payload  |C|  RESERVED   |         Payload Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                     Puzzle Solution Data                      ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

o  Puzzle Solution Data (variable length) -- Contains the solution to the puzzle: four different keys for the selected PRF. This field MUST NOT be empty. All the keys MUST have the same size, therefore the size of this field is always a multiple of 4 bytes. If the selected PRF accepts only fixed-size keys, then the size of each key MUST be of that fixed size. If the agreed-upon PRF accepts keys of any size, then the size of each key MUST be between 1 octet and the preferred key length of the PRF (inclusive). It is expected that in most cases the keys will be 4 (or even fewer) octets in length; however, this depends on the puzzle difficulty and on the Initiator's strategy to find solutions, and thus the size is not mandated by this specification. The Responder determines the size of each key by dividing the size of the Puzzle Solution Data by 4 (the number of keys). Note that the size of the Puzzle Solution Data is the size of the Payload (as indicated in the Payload Length field) minus 4, the size of the Payload Header.

The payload type for the Puzzle Solution payload is <TBA by IANA>.

9.  Operational Considerations

The difficulty level should be set by balancing the requirement to minimize the latency for legitimate Initiators against making things difficult for attackers. A good rule of thumb is for the puzzle to take about 1 second to solve.
A typical Initiator or bot-net member in 2014 can perform slightly less than a million hashes per second per core, so setting the difficulty level to n=20 is a good compromise. It should be noted that mobile Initiators, especially phones, are considerably weaker than that. Implementations should allow administrators to set the difficulty level, and/or be able to set the difficulty level dynamically in response to load.

Initiators should set a maximum difficulty level beyond which they won't try to solve the puzzle and log or display a failure message to the administrator or user.

10.  Security Considerations

When selecting parameters for the puzzles, in particular the puzzle difficulty, care must be taken. If the puzzles are too easy for the majority of the attackers, then the puzzle mechanism won't be able to prevent a DoS attack and will only impose an additional burden on legitimate Initiators. On the other hand, if the puzzles are too hard for the majority of the Initiators, then many legitimate users will experience an unacceptable delay in IKE SA setup (or unacceptable power consumption on mobile devices), which might cause them to cancel the connection attempt. In this case the resources of the Responder are preserved; however, the DoS attack can be considered successful. Thus a sensible balance should be kept by the Responder while choosing the puzzle difficulty: to defend itself, and to not over-defend itself. It is RECOMMENDED that the puzzle difficulty be chosen so that the Responder's load remains close to the maximum it can tolerate. It is also RECOMMENDED to dynamically adjust the puzzle difficulty in accordance with the Responder's current load.

Solving puzzles requires a lot of CPU power, which increases power consumption. This affects battery-powered Initiators, e.g.
mobile phones or some IoT devices. If puzzles are hard, then the required additional power consumption may be unacceptable for some Initiators. The Responder SHOULD take this possibility into consideration while choosing the puzzle difficulty and while selecting what percentage of Initiators are allowed to decline solving puzzles. See Section 7.1.4 for details.

If the Initiator uses NULL Authentication [RFC7619], then its identity is never verified, which may be used by attackers to perform a DoS attack after the IKE SA is established. Responders that allow unauthenticated Initiators to connect must be prepared to deal with various kinds of DoS attacks even after the IKE SA is created. See Section 5 for details.

To prevent amplification attacks, implementations must strictly follow the retransmission rules described in Section 2.1 of [RFC7296].

11.  IANA Considerations

This document defines a new payload in the "IKEv2 Payload Types" registry:

   <TBA by IANA>       Puzzle Solution                 PS

This document also defines a new Notify Message Type in the "IKEv2 Notify Message Types - Status Types" registry:

   <TBA by IANA>       PUZZLE

12.  Acknowledgements

The authors thank Tero Kivinen, Yaron Sheffer and Scott Fluhrer for their contributions to the design of the protocol. In particular, Tero Kivinen suggested the kind of puzzle where the task is to find a solution with a requested number of trailing zero bits. Yaron Sheffer and Scott Fluhrer suggested a way to make puzzle difficulty less erratic by solving several weaker puzzles. The authors also thank David Waltermire for his careful review of the document, Graham Bartlett for pointing out the possibility of a "Hash & URL" related attack, and all others who commented on the document.

13.  References

13.1.  Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC7296]  Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T. Kivinen, "Internet Key Exchange Protocol Version 2 (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October 2014.

[RFC7383]  Smyslov, V., "Internet Key Exchange Protocol Version 2 (IKEv2) Message Fragmentation", RFC 7383, DOI 10.17487/RFC7383, November 2014.

[IKEV2-IANA]  "Internet Key Exchange Version 2 (IKEv2) Parameters".

13.2.  Informative References

[bitcoins]  Nakamoto, S., "Bitcoin: A Peer-to-Peer Electronic Cash System", October 2008.

[RFC5723]  Sheffer, Y. and H. Tschofenig, "Internet Key Exchange Protocol Version 2 (IKEv2) Session Resumption", RFC 5723, DOI 10.17487/RFC5723, January 2010.

[RFC7619]  Smyslov, V. and P. Wouters, "The NULL Authentication Method in the Internet Key Exchange Protocol Version 2 (IKEv2)", RFC 7619, DOI 10.17487/RFC7619, August 2015.

[RFC7696]  Housley, R., "Guidelines for Cryptographic Algorithm Agility and Selecting Mandatory-to-Implement Algorithms", BCP 201, RFC 7696, DOI 10.17487/RFC7696, November 2015.

Authors' Addresses

Yoav Nir
Check Point Software Technologies Ltd.
5 Hasolelim st.
Tel Aviv 6789735
Israel

EMail: ynir.ietf@gmail.com

Valery Smyslov
ELVIS-PLUS
PO Box 81
Moscow (Zelenograd) 124460
Russian Federation

Phone: +7 495 276 0211
EMail: svan@elvis.ru