IPSecME Working Group                                             Y. Nir
Internet-Draft                                               Check Point
Intended status: Standards Track                                      V.
Smyslov
Expires: April 4, 2017                                        ELVIS-PLUS
                                                         October 1, 2016

      Protecting Internet Key Exchange Protocol version 2 (IKEv2)
       Implementations from Distributed Denial of Service Attacks
                 draft-ietf-ipsecme-ddos-protection-10

Abstract

   This document recommends implementation and configuration best
   practices for Internet Key Exchange Protocol version 2 (IKEv2)
   Responders, to allow them to resist Denial of Service and Distributed
   Denial of Service attacks. Additionally, the document introduces a
   new mechanism called "Client Puzzles" that helps accomplish this
   task.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 4, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions Used in This Document
   3.  The Vulnerability
   4.  Defense Measures while the IKE SA is being created
     4.1.  Retention Periods for Half-Open SAs
     4.2.  Rate Limiting
     4.3.  The Stateless Cookie
     4.4.  Puzzles
     4.5.  Session Resumption
     4.6.  Keeping computed Shared Keys
     4.7.  Preventing "Hash and URL" Certificate Encoding Attacks
     4.8.  IKE Fragmentation
   5.  Defense Measures after an IKE SA is created
   6.  Plan for Defending a Responder
   7.  Using Puzzles in the Protocol
     7.1.  Puzzles in IKE_SA_INIT Exchange
       7.1.1.  Presenting a Puzzle
       7.1.2.  Solving a Puzzle and Returning the Solution
       7.1.3.  Computing a Puzzle
       7.1.4.  Analyzing Repeated Request
       7.1.5.  Deciding if to Serve the Request
     7.2.  Puzzles in an IKE_AUTH Exchange
       7.2.1.  Presenting Puzzle
       7.2.2.  Solving Puzzle and Returning the Solution
       7.2.3.  Computing the Puzzle
       7.2.4.  Receiving the Puzzle Solution
   8.  Payload Formats
     8.1.  PUZZLE Notification
     8.2.  Puzzle Solution Payload
   9.  Operational Considerations
   10. Security Considerations
   11. IANA Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Authors' Addresses

1.  Introduction

   Denial of Service (DoS) attacks have always been considered a serious
   threat. These attacks are usually difficult to defend against since
   the amount of resources the victim has is always bounded (regardless
   of how high it is) and because some resources are required for
   distinguishing a legitimate session from an attack.

   The Internet Key Exchange protocol version 2 (IKEv2) described in
   [RFC7296] includes defense against DoS attacks. In particular, there
   is a cookie mechanism that allows the IKE Responder to defend itself
   against DoS attacks from spoofed IP addresses. However, botnets have
   become widespread, allowing attackers to perform Distributed Denial
   of Service (DDoS) attacks, which are more difficult to defend
   against. This document presents recommendations to help the
   Responder counter (D)DoS attacks. It also introduces a new mechanism
   -- "puzzles" -- that can help accomplish this task.

2.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

3.  The Vulnerability

   The IKE_SA_INIT Exchange described in Section 1.2 of [RFC7296]
   involves the Initiator sending a single message.
   The Responder replies with a single message and also allocates
   memory for a structure called a half-open IKE Security Association
   (SA). This half-open SA is later authenticated in the IKE_AUTH
   Exchange. If that IKE_AUTH request never comes, the half-open SA is
   kept for an unspecified amount of time. Depending on the algorithms
   used and the implementation, such a half-open SA will use from around
   100 bytes to several thousand bytes of memory.

   This creates an easy attack vector against an IKE Responder.
   Generating the IKE_SA_INIT request is cheap. Sending large numbers
   of IKE_SA_INIT requests can cause a Responder to use up all its
   resources. If the Responder tries to defend against this by
   throttling new requests, this will also prevent legitimate Initiators
   from setting up IKE SAs.

   An obvious defense, which is described in Section 4.2, is limiting
   the number of half-open SAs opened by a single peer. However, since
   all that is required is a single packet, an attacker can use multiple
   spoofed source IP addresses.

   If we break down what a Responder has to do during an initial
   exchange, there are three stages:

   1.  When the IKE_SA_INIT request arrives, the Responder:

       *  Generates or re-uses a Diffie-Hellman (D-H) private part.

       *  Generates a Responder Security Parameter Index (SPI).

       *  Stores the private part and peer public part in a half-open SA
          database.

   2.  When the IKE_AUTH request arrives, the Responder:

       *  Derives the keys from the half-open SA.

       *  Decrypts the request.

   3.  If the IKE_AUTH request decrypts properly, the Responder:

       *  Validates the certificate chain (if present) in the IKE_AUTH
          request.

   A fourth stage, in which the Responder creates the Child SA, is not
   reached by attackers who cannot pass the authentication step.

   Stage #1 is pretty light on CPU power, but requires some storage, and
   it's very light for the Initiator as well.
   Stage #2 includes private-key operations, so it is much heavier
   CPU-wise. Stage #3 may include public key operations if certificates
   are involved. These operations are often more computationally
   expensive than those performed at stage #2.

   To attack such a Responder, an attacker can attempt either to exhaust
   memory or to exhaust CPU. Without any protection, the most efficient
   attack is to send multiple IKE_SA_INIT requests and exhaust memory.
   This is easy because IKE_SA_INIT requests are cheap.

   There are obvious ways for the Responder to protect itself without
   changes to the protocol. It can reduce the time that an entry
   remains in the half-open SA database, and it can limit the number of
   concurrent half-open SAs from a particular address or prefix. The
   attacker can overcome this by using spoofed source addresses.

   The stateless cookie mechanism from Section 2.6 of [RFC7296] prevents
   an attack with spoofed source addresses. This doesn't completely
   solve the issue, but it makes the limiting of half-open SAs by
   address or prefix work. Puzzles, introduced in Section 4.4, serve
   the same purpose to a greater degree: they make it harder for an
   attacker to reach the goal of getting a half-open SA. Puzzles do
   not have to be so hard that an attacker cannot afford to solve a
   single puzzle; it is enough that puzzles increase the cost of
   creating a half-open SA, so the attacker is limited in the number
   they can create.

   Reducing the lifetime of an abandoned half-open SA also reduces the
   impact of such attacks. For example, if a half-open SA is kept for 1
   minute and the capacity is 60 thousand half-open SAs, an attacker
   would need to create one thousand half-open SAs per second. If the
   retention time is reduced to 3 seconds, the attacker would need to
   create 20 thousand half-open SAs per second to get the same result.
   By introducing a puzzle, each half-open SA becomes more expensive for
   an attacker, making an exhaustion attack against Responder memory
   less likely to succeed.

   At this point, filling up the half-open SA database is no longer the
   most efficient DoS attack. The attacker has two alternative attacks
   to do better:

   1.  Go back to spoofed addresses and try to overwhelm the CPU that
       deals with generating cookies, or

   2.  Take the attack to the next level by also sending an IKE_AUTH
       request.

   If an attacker is so powerful that it is able to overwhelm the
   Responder's CPU that deals with generating cookies, then the attack
   cannot be dealt with at the IKE level and must be handled by means of
   Intrusion Prevention System (IPS) technology.

   On the other hand, the second alternative of sending an IKE_AUTH
   request is very cheap. It requires generating a proper IKE header
   with the correct IKE SPIs and a single Encrypted payload. The
   content of the payload is irrelevant and might be junk. The
   Responder has to perform the relatively expensive key derivation,
   only to find that the MAC on the Encrypted payload of the IKE_AUTH
   request fails the integrity check. If a Responder does not hold on
   to the calculated SKEYSEED and SK_* keys (which it should in case a
   valid IKE_AUTH comes in later), this attack might be repeated on the
   same half-open SA. Puzzles make attacks of this sort more costly for
   an attacker. See Section 7.2 for details.

   Here too, the number of half-open SAs that the attacker can achieve
   is crucial, because each one allows the attacker to waste some CPU
   time. So making it hard to create many half-open SAs is important.

   A strategy against DDoS has to rely on at least 4 components:

   1.  Hardening the half-open SA database by reducing retention time.

   2.  Hardening the half-open SA database by rate-limiting single IPs/
       prefixes.

   3.
       Guidance on what to do when an IKE_AUTH request fails to decrypt.

   4.  Increasing the cost of half-open SAs up to what is tolerable for
       legitimate clients.

   Puzzles are used as a solution for strategy #4.

4.  Defense Measures while the IKE SA is being created

4.1.  Retention Periods for Half-Open SAs

   As a UDP-based protocol, IKEv2 has to deal with packet loss through
   retransmissions. Section 2.4 of [RFC7296] recommends "that messages
   be retransmitted at least a dozen times over a period of at least
   several minutes before giving up". Many retransmission policies in
   practice wait one or two seconds before retransmitting for the first
   time.

   Because of this, setting the timeout on a half-open SA too low will
   cause it to expire whenever even one IKE_AUTH request packet is lost.
   When not under attack, the half-open SA timeout SHOULD be set high
   enough that the Initiator will have enough time to send multiple
   retransmissions, minimizing the chance of transient network
   congestion causing an IKE failure.

   When the system is under attack, as measured by the number of half-
   open SAs, it makes sense to reduce this lifetime. The Responder
   should still allow enough time for the round-trip, enough time for
   the Initiator to derive the D-H shared value, and enough time to
   derive the IKE SA keys and create the IKE_AUTH request. Two
   seconds is probably as low a value as can realistically be used.

   It could make sense to assign a shorter value to half-open SAs
   originating from IP addresses or prefixes that are considered suspect
   because of multiple concurrent half-open SAs.

4.2.  Rate Limiting

   Even with DDoS, the attacker has only a limited number of nodes
   participating in the attack.
   By limiting the number of half-open SAs that are allowed to exist
   concurrently with each such node, the total number of half-open SAs
   is capped, as is the total number of key derivations that the
   Responder is forced to complete.

   In IPv4 it makes sense to limit the number of half-open SAs based on
   IP address. Most IPv4 nodes are either directly attached to the
   Internet using a routable address or are hidden behind a NAT device
   with a single IPv4 external address. For IPv6, ISPs assign between a
   /48 and a /64, so it does not make sense for rate-limiting to work on
   single IPv6 addresses. Instead, rate limits should be applied based
   on either the /48 or the /64 of the misbehaving IPv6 address
   observed.

   The number of half-open SAs is easy to measure, but it is also
   worthwhile to measure the number of failed IKE_AUTH exchanges. If
   possible, both factors should be taken into account when deciding
   which IP address or prefix is considered suspicious.

   There are two ways to rate-limit a peer address or prefix:

   1.  Hard Limit - where the number of half-open SAs is capped, and any
       further IKE_SA_INIT requests are rejected.

   2.  Soft Limit - where if a set number of half-open SAs exist for a
       particular address or prefix, any IKE_SA_INIT request will be
       required to solve a puzzle.

   The advantage of the hard limit method is that it provides a hard cap
   on the number of half-open SAs that the attacker is able to create.
   The disadvantage is that it allows the attacker to block IKE
   initiation from small parts of the Internet. For example, if a
   network service provider or some establishment offers Internet
   connectivity to its customers or employees through an IPv4 NAT
   device, a single malicious customer can create enough half-open SAs
   to fill the quota for the NAT device external IP address. Legitimate
   Initiators on the same network will not be able to initiate IKE.
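
   The per-address and per-prefix accounting described in this section
   can be illustrated with a minimal sketch. It is not part of the
   protocol. The names rate_limit_key and decide, the choice of a /64
   for IPv6, and the value of HARD_LIMIT are assumptions made for the
   example; the soft limit of 3 follows the range suggested in
   Section 6.

```python
import ipaddress
from collections import Counter

# Illustrative tunables (assumptions, not normative values).
SOFT_LIMIT = 3         # half-open SAs per key before a puzzle is required
HARD_LIMIT = 20        # half-open SAs per key before requests are rejected
IPV6_PREFIX_LEN = 64   # the text allows either /48 or /64


def rate_limit_key(addr: str) -> str:
    """Map a peer address to the key under which half-open SAs are counted."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        # IPv4: count per single address.
        return str(ip)
    # IPv6: count per prefix, since a single node may own a whole /64.
    net = ipaddress.ip_network(f"{ip}/{IPV6_PREFIX_LEN}", strict=False)
    return str(net)


def decide(half_open: Counter, addr: str) -> str:
    """Return 'accept', 'puzzle' (soft limit) or 'reject' (hard limit)."""
    count = half_open[rate_limit_key(addr)]
    if count >= HARD_LIMIT:
        return "reject"
    if count >= SOFT_LIMIT:
        return "puzzle"
    return "accept"
```

   A Responder would increment the counter for rate_limit_key(addr)
   when it creates a half-open SA and decrement it when the SA is
   authenticated or expires.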

   The advantage of a soft limit is that legitimate clients can always
   connect. The disadvantage is that an adversary with sufficient CPU
   resources can still effectively DoS the Responder.

   Regardless of the type of rate-limiting used, legitimate Initiators
   that are not on the same network segments as the attackers will not
   be affected. This is very important as it reduces the adverse impact
   caused by the measures used to counteract the attack, and allows most
   Initiators to keep working even if they do not support puzzles.

4.3.  The Stateless Cookie

   Section 2.6 of [RFC7296] offers a mechanism to mitigate DoS attacks:
   the stateless cookie. When the server is under load, the Responder
   responds to the IKE_SA_INIT request with a calculated "stateless
   cookie" - a value that can be re-calculated based on values in the
   IKE_SA_INIT request without storing Responder-side state. The
   Initiator is expected to repeat the IKE_SA_INIT request, this time
   including the stateless cookie. This mechanism prevents DoS attacks
   from spoofed IP addresses, since an attacker needs to have a routable
   IP address to return the cookie.

   Attackers that have multiple source IP addresses with return
   routability, such as in the case of botnets, can fill up a half-open
   SA table anyway. The cookie mechanism limits the amount of allocated
   state to the number of attackers, multiplied by the number of half-
   open SAs allowed per peer address, multiplied by the amount of state
   allocated for each half-open SA. With typical values this can easily
   reach hundreds of megabytes.

4.4.  Puzzles

   The puzzle introduced here extends the cookie mechanism of [RFC7296].
   It is loosely based on the proof-of-work technique used in Bitcoin
   [bitcoins]. Puzzles set an upper bound, determined by the attacker's
   CPU, on the number of negotiations the attacker can initiate in a
   unit of time.
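
   Because the puzzle builds on the stateless cookie of Section 4.3, a
   sketch of one possible cookie computation may help. Section 2.6 of
   [RFC7296] suggests, as one good approach, Cookie =
   <VersionIDofSecret> | Hash(Ni | IPi | SPIi | <secret>). The code
   below follows that suggestion. The use of SHA-256, the one-octet
   version field, and the function names are assumptions made for the
   example; actual cookie generation is a local matter.

```python
import hashlib
import hmac
import os

# Responder-local secret and its version identifier (illustrative values).
SECRET_VERSION = 1
SECRET = os.urandom(32)


def make_cookie(ni: bytes, ip_i: bytes, spi_i: bytes,
                secret: bytes = SECRET,
                version: int = SECRET_VERSION) -> bytes:
    """Cookie = version octet | SHA-256(Ni | IPi | SPIi | secret)."""
    digest = hashlib.sha256(ni + ip_i + spi_i + secret).digest()
    return bytes([version]) + digest


def check_cookie(cookie: bytes, ni: bytes, ip_i: bytes, spi_i: bytes,
                 secret: bytes = SECRET,
                 version: int = SECRET_VERSION) -> bool:
    """Recompute the cookie from the request; no per-peer state is stored."""
    expected = make_cookie(ni, ip_i, spi_i, secret, version)
    return hmac.compare_digest(cookie, expected)
```

   Since the cookie is recomputed from values carried in the repeated
   request, the Responder keeps nothing per Initiator until the cookie
   has been returned from the claimed source address.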

   A puzzle is sent to the Initiator in two cases:

   o  The Responder is so overloaded that no half-open SAs may be
      created without solving a puzzle, or

   o  The Responder is not too loaded, but the rate-limiting method
      described in Section 4.2 prevents half-open SAs from being created
      with this particular peer address or prefix without first solving
      a puzzle.

   When the Responder decides to send the challenge to solve a puzzle in
   response to an IKE_SA_INIT request, the message includes at least
   three components:

   1.  Cookie. This is calculated the same as in [RFC7296], i.e., the
       process of generating the cookie is not specified.

   2.  Algorithm. This is the identifier of a Pseudo-Random Function
       (PRF) algorithm, one of those proposed by the Initiator in the SA
       payload.

   3.  Zero Bit Count (ZBC). This is a number between 8 and 255 (or a
       special value - 0, see Section 7.1.1.1) that represents the
       length of the zero-bit run at the end of the output of the PRF
       function calculated over the cookie that the Initiator is to
       send. The values 1-7 are explicitly excluded, because they
       create a puzzle that is too easy to solve. Since the mechanism
       is supposed to be stateless for the Responder, either the same
       ZBC is used for all Initiators, or the ZBC is somehow encoded in
       the cookie. If it is global, then this value is the same for all
       the Initiators who are receiving puzzles at any given point in
       time. The Responder, however, may change this value over time
       depending on its load.

   Upon receiving this challenge, the Initiator attempts to calculate
   the PRF output using different keys. When enough keys are found such
   that the PRF output calculated using each of them has a sufficient
   number of trailing zero bits, those keys are sent to the Responder.
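
   The Initiator's search can be sketched as follows. This is a minimal
   illustration, assuming PRF-HMAC-SHA256 with the candidate key as the
   HMAC key and the cookie as the data, and assuming that candidate
   keys are enumerated as fixed-length big-endian counters. The
   function names, the default key length, and the enumeration order
   are choices made for the example.

```python
import hashlib
import hmac


def trailing_zero_bits(data: bytes) -> int:
    """Count the consecutive zero bits at the end of a byte string."""
    value = int.from_bytes(data, "big")
    if value == 0:
        return 8 * len(data)
    # value & -value isolates the lowest set bit.
    return (value & -value).bit_length() - 1


def solve_puzzle(cookie: bytes, zbc: int,
                 key_len: int = 4, count: int = 4) -> list:
    """Find `count` keys whose PRF output over the cookie ends in >= zbc zero bits."""
    keys = []
    for candidate in range(256 ** key_len):
        key = candidate.to_bytes(key_len, "big")
        output = hmac.new(key, cookie, hashlib.sha256).digest()
        if trailing_zero_bits(output) >= zbc:
            keys.append(key)
            if len(keys) == count:
                return keys
    raise ValueError("key space exhausted before enough keys were found")
```

   Each PRF invocation is independent of the others, so the search
   parallelizes trivially across cores by splitting the counter range.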

   The reason for using several keys in the results, rather than just
   one key, is to reduce the variance in the time it takes the Initiator
   to solve the puzzle. We have chosen the number of keys to be four
   (4) as a compromise between the conflicting goals of reducing
   variance and reducing the work the Responder needs to perform to
   verify the puzzle solution.

   When receiving a request with a solved puzzle, the Responder verifies
   two things:

   o  That the cookie is indeed valid.

   o  That the results of the PRF of the transmitted cookie calculated
      with the transmitted keys have a sufficient number of trailing
      zero bits.

   Example 1: Suppose the calculated cookie is
   739ae7492d8a810cf5e8dc0f9626c9dda773c5a3 (20 octets), the algorithm
   is PRF-HMAC-SHA256, and the required number of zero bits is 18.
   After successively trying a number of keys, the Initiator finds the
   following four 3-octet keys that work:

   +--------+----------------------------------+----------+
   |  Key   |      Last 32 Hex PRF Digits      | # 0-bits |
   +--------+----------------------------------+----------+
   | 061840 | e4f957b859d7fb1343b7b94a816c0000 |    18    |
   | 073324 | 0d4233d6278c96e3369227a075800000 |    23    |
   | 0c8a2a | 952a35d39d5ba06709da43af40700000 |    20    |
   | 0d94c8 | 5a0452b21571e401a3d00803679c0000 |    18    |
   +--------+----------------------------------+----------+

             Table 1: Four solutions for the 18-bit puzzle

   Example 2: Same cookie, but modify the required number of zero bits
   to 22. The first 4-octet keys that work to satisfy that requirement
   are 005d9e57, 010d8959, 0110778d, and 01187e37. Finding these
   requires 18,382,392 invocations of the PRF.
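
   The Responder's check of the PRF results can be sketched in a few
   lines. The sketch assumes PRF-HMAC-SHA256 with each transmitted key
   used as the HMAC key over the cookie. The function name and the
   rejection of repeated keys are assumptions made for the example, and
   validation of the cookie itself is a separate step that is not shown.

```python
import hashlib
import hmac


def verify_solution(cookie: bytes, keys: list, zbc: int,
                    required: int = 4) -> bool:
    """Check that every key yields a PRF output ending in >= zbc zero bits."""
    # Assumption: the keys must be distinct, otherwise a single
    # solution could simply be repeated four times.
    if len(keys) != required or len(set(keys)) != required:
        return False
    mask = (1 << zbc) - 1  # selects the lowest zbc bits of the output
    for key in keys:
        output = hmac.new(key, cookie, hashlib.sha256).digest()
        if int.from_bytes(output, "big") & mask:
            return False
    return True
```

   Verification costs the Responder four PRF invocations regardless of
   difficulty. For the Initiator, if the PRF output is treated as
   uniformly random, each invocation succeeds with probability 2^-ZBC,
   so finding four keys takes about 4 * 2^ZBC invocations on average.
   For ZBC = 22 that is roughly 16.8 million, which is consistent with
   the 18,382,392 invocations reported in Example 2.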

   +----------+-------------------------------+
   | # 0-bits | Time to Find 4 keys (seconds) |
   +----------+-------------------------------+
   |    8     |            0.0025             |
   |    10    |            0.0078             |
   |    12    |            0.0530             |
   |    14    |            0.2521             |
   |    16    |            0.8504             |
   |    17    |            1.5938             |
   |    18    |            3.3842             |
   |    19    |            3.8592             |
   |    20    |            10.8876            |
   +----------+-------------------------------+

   Table 2: The time needed to solve a puzzle of various difficulty for
          the cookie = 739ae7492d8a810cf5e8dc0f9626c9dda773c5a3

   The figures above were obtained on a 2.4 GHz single core i5. Run
   times can be halved or quartered with multi-core code, but would be
   longer on mobile phone processors, even if those are multi-core as
   well. With these figures, 18 bits is believed to be a reasonable
   choice for puzzle level difficulty for all Initiators, and 20 bits is
   acceptable for specific hosts/prefixes.

   The use of the puzzle mechanism in the IKE_SA_INIT exchange is
   described in Section 7.1.

4.5.  Session Resumption

   When the Responder is under attack, it SHOULD prefer previously
   authenticated peers who present a Session Resumption ticket
   [RFC5723]. However, the Responder SHOULD NOT serve resumed
   Initiators exclusively because dropping all IKE_SA_INIT requests
   would lock out legitimate Initiators that have no resumption ticket.
   When under attack the Responder SHOULD require Initiators presenting
   Session Resumption Tickets to pass a return routability check by
   including the COOKIE notification in the IKE_SESSION_RESUME response
   message, as described in Section 4.3.2 of [RFC5723]. Note that the
   Responder SHOULD cache tickets for a short time to reject reused
   tickets (Section 4.3.1 of [RFC5723]), and therefore there should be
   no issue of half-open SAs resulting from replayed IKE_SESSION_RESUME
   messages.

   Several kinds of DoS attacks are possible on servers supporting IKE
   Session Resumption. See Section 9.3 of [RFC5723] for details.
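
   The short-lived ticket cache mentioned in this section could look
   like the following sketch. The class name, the 30-second lifetime,
   and the choice to store a hash of the ticket instead of the ticket
   itself are assumptions made for the example; [RFC5723] does not
   prescribe a data structure.

```python
import hashlib
import time


class TicketReplayCache:
    """Remember recently presented tickets so that reuse can be rejected."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen = {}  # ticket digest -> expiry time

    def first_use(self, ticket: bytes) -> bool:
        """Return True if the ticket was not seen within the last ttl seconds."""
        now = self._clock()
        # Drop entries whose retention period has passed.
        self._seen = {d: exp for d, exp in self._seen.items() if exp > now}
        digest = hashlib.sha256(ticket).digest()
        if digest in self._seen:
            return False
        self._seen[digest] = now + self._ttl
        return True
```

   A Responder would call first_use() for each IKE_SESSION_RESUME
   request and discard the request when it returns False.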

4.6.  Keeping computed Shared Keys

   Once the IKE_SA_INIT exchange is finished, the Responder is waiting
   for the first message of the IKE_AUTH exchange from the Initiator.
   At this point the Initiator is not yet authenticated, and this fact
   allows an attacker to perform the attack described in Section 3.
   Instead of sending a properly formed and encrypted IKE_AUTH message,
   the attacker can just send arbitrary data, forcing the Responder to
   perform costly CPU operations to compute SK_* keys.

   If the received IKE_AUTH message fails to decrypt correctly (or
   fails to pass the ICV check), then the Responder SHOULD still keep
   the computed SK_* keys, so that if this is an attack, the attacker
   gains no advantage from repeating it multiple times on a single IKE
   SA. The Responder can also use puzzles in the IKE_AUTH exchange as
   described in Section 7.2.

4.7.  Preventing "Hash and URL" Certificate Encoding Attacks

   In IKEv2 each side may use the "Hash and URL" Certificate Encoding to
   instruct the peer to retrieve certificates from the specified
   location (see Section 3.6 of [RFC7296] for details). Malicious
   Initiators can use this feature to mount a DoS attack on the
   Responder by providing a URL pointing to a large file possibly
   containing meaningless bits. While downloading the file the
   Responder consumes CPU, memory and network bandwidth.

   To prevent this kind of attack, the Responder should not blindly
   download the whole file. Instead, it SHOULD first read the initial
   few bytes, decode the length of the ASN.1 structure from these bytes,
   and then download no more than the decoded number of bytes. Note
   that it is always possible to determine the length of ASN.1
   structures used in IKEv2, if they are DER-encoded, by analyzing the
   first few bytes.
   However, since the content of the file being downloaded can be under
   the attacker's control, implementations should not blindly trust the
   decoded length and SHOULD check whether it makes sense before
   continuing to download the file. Implementations SHOULD also apply a
   configurable hard limit to the number of pulled bytes and SHOULD
   provide an ability for an administrator to either completely disable
   this feature or to limit its use to a configurable list of trusted
   URLs.

4.8.  IKE Fragmentation

   IKE Fragmentation described in [RFC7383] allows IKE peers to avoid IP
   fragmentation of large IKE messages. Attackers can mount several
   kinds of DoS attacks using IKE Fragmentation. See Section 5 of
   [RFC7383] for details on how to mitigate these attacks.

5.  Defense Measures after an IKE SA is created

   Once an IKE SA is created, there is usually only a limited number of
   IKE messages exchanged. This IKE traffic consists of exchanges aimed
   to create additional Child SAs, IKE rekeys, IKE deletions and IKE
   liveness tests. Some of these exchanges require relatively few
   resources (like a liveness check), while others may be resource
   consuming (like creating or rekeying a Child SA with a D-H exchange).

   Since any endpoint can initiate a new exchange, there is a
   possibility that a peer would initiate too many exchanges that could
   exhaust host resources. For example, the peer can perform endless
   continuous Child SA rekeying or create an overwhelming number of
   Child SAs with the same Traffic Selectors etc. Such behavior can be
   caused by broken implementations, misconfiguration, or as an
   intentional attack. The latter becomes more of a real threat if the
   peer uses NULL Authentication, as described in [RFC7619]. In this
   case the peer remains anonymous, allowing it to escape any
   responsibility for its behaviour.
   See Section 3 of [RFC7619] for details on how to mitigate attacks
   when using NULL Authentication.

   The following recommendations apply especially for NULL Authenticated
   IKE sessions, but also apply to authenticated IKE sessions, with the
   difference that in the latter case, the identified peer can be locked
   out.

   o  If the IKEv2 window size is greater than one, peers are able to
      initiate multiple simultaneous exchanges that increase host
      resource consumption. Since there is no way in IKEv2 to decrease
      window size once it has been increased (see Section 2.3 of
      [RFC7296]), the window size cannot be dynamically adjusted
      depending on the load. It is NOT RECOMMENDED to allow an IKEv2
      window size greater than one when NULL Authentication has been
      used.

   o  If a peer initiates an abusive number of CREATE_CHILD_SA exchanges
      to rekey IKE SAs or Child SAs, the Responder SHOULD reply with
      TEMPORARY_FAILURE notifications indicating the peer must slow down
      its requests.

   o  If a peer creates many Child SAs with the same or overlapping
      Traffic Selectors, implementations MAY respond with the
      NO_ADDITIONAL_SAS notification.

   o  If a peer initiates many exchanges of any kind, the Responder MAY
      introduce an artificial delay before responding to each request
      message. This delay would decrease the rate at which the
      Responder needs to process requests from any particular peer, and
      frees up resources on the Responder that can be used for answering
      legitimate clients. If the Responder receives retransmissions of
      the request message during the delay period, the retransmitted
      messages MUST be silently discarded. The delay must be short
      enough to avoid legitimate peers deleting the IKE SA due to a
      timeout. It is believed that a few seconds is enough. Note
      however, that even a few seconds may be too long when settings
      rely on an immediate response to the request message, e.g.
      for the purposes of quick detection of a dead peer.

   o  If these counter-measures are ineffective, implementations MAY
      delete the IKE SA with an offending peer by sending a Delete
      payload.

   In IKE, a client can request various configuration attributes from
   the server. Most often these attributes include internal IP
   addresses. Malicious clients can try to exhaust a server's IP
   address pool by continuously requesting a large number of internal
   addresses. Server implementations SHOULD limit the number of IP
   addresses allocated to any particular client. Note that this is not
   possible with clients using NULL Authentication, since their identity
   cannot be verified.

6.  Plan for Defending a Responder

   This section outlines a plan for defending a Responder from a DDoS
   attack based on the techniques described earlier. The numbers given
   here are not normative, and their purpose is to illustrate the
   configurable parameters needed for surviving DDoS attacks.

   Implementations are deployed in different environments, so it is
   RECOMMENDED that the parameters be settable. For example, most
   commercial products are required to undergo benchmarking where the
   IKE SA establishment rate is measured. Benchmarking is
   indistinguishable from a DoS attack and the defenses described in
   this document may defeat the benchmark by causing exchanges to fail
   or take a long time to complete. Parameters SHOULD be tunable to
   allow for benchmarking (if only by turning DDoS protection off).

   Since all countermeasures may cause delays and additional work for
   the Initiators, they SHOULD NOT be deployed unless an attack is
   likely to be in progress. To minimize the burden imposed on
   Initiators, the Responder should monitor incoming IKE requests for
   two scenarios:

   1.  A general DDoS attack.
Such an attack is indicated by a high 623 number of concurrent half-open SAs, a high rate of failed 624 IKE_AUTH exchanges, or a combination of both. For example, 625 consider a Responder that has 10,000 distinct peers of which at 626 peak 7,500 concurrently have VPN tunnels. At the start of peak 627 time, 600 peers might establish tunnels within any given minute, 628 and tunnel establishment (both IKE_SA_INIT and IKE_AUTH) takes 629 anywhere from 0.5 to 2 seconds. For this Responder, we expect 630 there to be less than 20 concurrent half-open SAs, so having 100 631 concurrent half-open SAs can be interpreted as an indication of 632 an attack. Similarly, IKE_AUTH request decryption failures 633 should never happen. Supposing that the tunnels are established 634 using EAP (see Section 2.16 of [RFC7296]), users may be expected 635 to enter a wrong password about 20% of the time. So we'd expect 636 125 wrong password failures a minute. If we get IKE_AUTH 637 decryption failures from multiple sources more than once per 638 second, or EAP failures more than 300 times per minute, this can 639 also be an indication of a DDoS attack. 641 2. An attack from a particular IP address or prefix. Such an attack 642 is indicated by an inordinate amount of half-open SAs from a 643 specific IP address or prefix, or an inordinate amount of 644 IKE_AUTH failures. A DDoS attack may be viewed as multiple such 645 attacks. If these are mitigated successfully, there will not be 646 a need to enact countermeasures on all Initiators. For example, 647 measures might be 5 concurrent half-open SAs, 1 decrypt failure, 648 or 10 EAP failures within a minute. 650 Note that using counter-measures against an attack from a particular 651 IP address may be enough to avoid the overload on the half-open SA 652 database. In this case the number of failed IKE_AUTH exchanges will 653 never exceed the threshold of attack detection. 
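The detection thresholds in the example above can be sketched in a few lines of Python. This is only an illustration: the names Thresholds, expected_half_open and attack_suspected are hypothetical, the numbers are the non-normative example values from this section, and "more than once per second" is approximated as 60 events per minute.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    half_open: int                 # concurrent half-open SAs
    decrypt_failures_per_min: int  # IKE_AUTH decryption failures per minute
    eap_failures_per_min: int      # EAP failures per minute


# Example values from the text; real deployments would make these settable.
GENERAL = Thresholds(half_open=100, decrypt_failures_per_min=60,
                     eap_failures_per_min=300)
PER_SOURCE = Thresholds(half_open=5, decrypt_failures_per_min=1,
                        eap_failures_per_min=10)


def expected_half_open(setups_per_minute: float, max_setup_seconds: float) -> float:
    """Upper estimate of concurrent half-open SAs under normal load."""
    return setups_per_minute / 60 * max_setup_seconds


def attack_suspected(half_open: int, decrypt_failures: int,
                     eap_failures: int, limits: Thresholds) -> bool:
    """True if any counter (measured over one minute) reaches its limit."""
    return (half_open >= limits.half_open
            or decrypt_failures >= limits.decrypt_failures_per_min
            or eap_failures >= limits.eap_failures_per_min)
```

With 600 tunnel setups per minute that each take at most 2 seconds, expected_half_open(600, 2.0) gives the figure of 20 concurrent half-open SAs used in the example.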
655 When there is no general DDoS attack, it is suggested that no cookie 656 or puzzles be used. At this point the only defensive measure is to 657 monitor the number of half-open SAs, and set a soft limit per peer IP 658 or prefix. The soft limit can be set to 3-5. If the puzzles are 659 used, the puzzle difficulty SHOULD be set to such a level (number of 660 zero-bits) that all legitimate clients can handle it without degraded 661 user experience. 663 As soon as any kind of attack is detected, either a lot of 664 initiations from multiple sources or a lot of initiations from a few 665 sources, it is best to begin by requiring stateless cookies from all 666 Initiators. This will mitigate attacks based on IP address spoofing, 667 and help avoid the need to impose a greater burden in the form of 668 puzzles on the general population of Initiators. This makes the per- 669 node or per-prefix soft limit more effective. 671 When cookies are activated for all requests and the attacker is still 672 managing to consume too many resources, the Responder MAY start to 673 use puzzles for these requests or increase the difficulty of puzzles 674 imposed on IKE_SA_INIT requests coming from suspicious nodes/ 675 prefixes. This should still be doable by all legitimate peers, but 676 the use of puzzles at a higher difficulty may degrade the user 677 experience, for example by taking up to 10 seconds to solve the 678 puzzle. 680 If the load on the Responder is still too great, and there are many 681 nodes causing multiple half-open SAs or IKE_AUTH failures, the 682 Responder MAY impose hard limits on those nodes. 684 If it turns out that the attack is very widespread and the hard caps 685 are not solving the issue, a puzzle MAY be imposed on all Initiators. 686 Note that this is the last step, and the Responder should avoid this 687 if possible. 689 7. Using Puzzles in the Protocol 691 This section describes how the puzzle mechanism is used in IKEv2. It 692 is organized as follows. 
The Section 7.1 describes using puzzles in 693 the IKE_SA_INIT exchange and the Section 7.2 describes using puzzles 694 in the IKE_AUTH exchange. Both sections are divided into subsections 695 describing how puzzles should be presented, solved and processed by 696 the Initiator and the Responder. 698 7.1. Puzzles in IKE_SA_INIT Exchange 700 IKE Initiator indicates the desire to create a new IKE SA by sending 701 an IKE_SA_INIT request message. The message may optionally contain a 702 COOKIE notification if this is a repeated request performed after the 703 Responder's demand to return a cookie. 705 HDR, [N(COOKIE),] SA, KE, Ni, [V+][N+] --> 707 According to the plan, described in Section 6, the IKE Responder 708 monitors incoming requests to detect whether it is under attack. If 709 the Responder learns that a (D)DoS attack is likely to be in 710 progress, then its actions depend on the volume of the attack. If 711 the volume is moderate, then the Responder requests the Initiator to 712 return a cookie. If the volume is high to such an extent that 713 puzzles need to be used for defense, then the Responder requests the 714 Initiator to solve a puzzle. 716 The Responder MAY choose to process some fraction of IKE_SA_INIT 717 requests without presenting a puzzle while being under attack to 718 allow legacy clients, that don't support puzzles, to have a chance to 719 be served. The decision whether to process any particular request 720 must be probabilistic, with the probability depending on the 721 Responder's load (i.e. on the volume of attack). The requests that 722 don't contain the COOKIE notification MUST NOT participate in this 723 lottery. In other words, the Responder must first perform a return 724 routability check before allowing any legacy client to be served if 725 it is under attack. See Section 7.1.4 for details. 727 7.1.1. 
Presenting a Puzzle 729 If the Responder makes a decision to use puzzles, then it includes 730 two notifications in its response message - the COOKIE notification 731 and the PUZZLE notification. Note that the PUZZLE notification MUST 732 always be accompanied with the COOKIE notification, since the content 733 of the COOKIE notification is used as an input data when solving 734 puzzle. The format of the PUZZLE notification is described in 735 Section 8.1. 737 <-- HDR, N(COOKIE), N(PUZZLE), [V+][N+] 739 The presence of these notifications in an IKE_SA_INIT response 740 message indicates to the Initiator that it should solve the puzzle to 741 have a better chance to be served. 743 7.1.1.1. Selecting the Puzzle Difficulty Level 745 The PUZZLE notification contains the difficulty level of the puzzle - 746 the minimum number of trailing zero bits that the result of PRF must 747 contain. In diverse environments it is nearly impossible for the 748 Responder to set any specific difficulty level that will result in 749 roughly the same amount of work for all Initiators, because 750 computation power of different Initiators may vary by an order of 751 magnitude, or even more. The Responder may set the difficulty level 752 to 0, meaning that the Initiator is requested to spend as much power 753 to solve a puzzle as it can afford. In this case no specific value 754 of ZBC is required from the Initiator, however the larger the ZBC 755 that Initiator is able to get, the better the chance is that it will 756 be served by the Responder. In diverse environments it is 757 RECOMMENDED that the Initiator set the difficulty level to 0, unless 758 the attack volume is very high. 760 If the Responder sets a non-zero difficulty level, then the level 761 SHOULD be determined by analyzing the volume of the attack. The 762 Responder MAY set different difficulty levels to different requests 763 depending on the IP address the request has come from. 765 7.1.1.2. 
Selecting the Puzzle Algorithm 767 The PUZZLE notification also contains an identifier of the algorithm 768 that is used by the Initiator to compute the puzzle. 770 Cryptographic algorithm agility is considered an important feature 771 for modern protocols [RFC7696]. Algorithm agility ensures that a 772 protocol doesn't rely on a single built-in set of cryptographic 773 algorithms, but has a means to replace one set with another and 774 negotiate new algorithms with the peer. IKEv2 fully supports 775 cryptographic algorithm agility for its core operations. 777 To support crypto agility in case of puzzles, the algorithm that is 778 used to compute a puzzle needs to be negotiated during the 779 IKE_SA_INIT exchange. The negotiation is performed as follows. The 780 initial request message from the Initiator contains an SA payload 781 containing a list of transforms of different types. Thereby the 782 Initiator asserts that it supports all transforms from this list and 783 can use any of them in the IKE SA being established. The Responder 784 parses the received SA payload and finds the mutually supported transforms of type 785 PRF. The Responder selects the preferred PRF from the list of 786 mutually supported ones and includes it in the PUZZLE notification. 787 There is no requirement that the PRF selected for puzzles be the same 788 as the PRF that is negotiated later for use in core IKE SA crypto 789 operations. If there are no mutually supported PRFs, then IKE SA 790 negotiation will fail anyway and there is no reason to return a 791 puzzle. In this case the Responder returns a NO_PROPOSAL_CHOSEN 792 notification. Note that PRF is a mandatory transform type for IKE SA 793 (see Sections 3.3.2 and 3.3.3 of [RFC7296]) and at least one 794 transform of this type is always present in the SA payload in an 795 IKE_SA_INIT request message. 797 7.1.1.3.
Generating a Cookie 799 If the Responder supports puzzles then a cookie should be computed in 800 such a manner that the Responder is able to learn some important 801 information from the cookie alone, when it is later returned by the 802 Initiator. In particular, the Responder SHOULD be able to learn the 803 following information: 805 o Whether the puzzle was given to the Initiator or only the cookie 806 was requested. 808 o The difficulty level of the puzzle given to the Initiator. 810 o The number of consecutive puzzles given to the Initiator. 812 o The amount of time the Initiator spent to solve the puzzles. This 813 can be calculated if the cookie is timestamped. 815 This information helps the Responder to make a decision whether to 816 serve this request or demand more work from the Initiator. 818 One possible approach to get this information is to encode it in the 819 cookie. The format of such encoding is an implementation detail of the 820 Responder, as the cookie would remain an opaque block of data to the 821 Initiator. If this information is encoded in the cookie, then the 822 Responder MUST make it integrity protected, so that any intended or 823 accidental alteration of this information in the returned cookie is 824 detectable. So, the cookie would be generated as: 826 Cookie = <VersionIDofSecret> | <AdditionalInfo> | 827 Hash(Ni | IPi | SPIi | <AdditionalInfo> | <secret>) 829 Note that, according to Section 2.6 of [RFC7296], the size of the 830 cookie cannot exceed 64 bytes. 832 Alternatively, the Responder may generate a cookie as suggested in 833 Section 2.6 of [RFC7296], but associate the additional information, 834 using local storage identified with the particular version of the 835 secret. In this case the Responder should have different secrets for 836 every combination of difficulty level and number of consecutive 837 puzzles, and should change the secrets periodically, keeping a few 838 previous versions, to be able to calculate how long ago a cookie was 839 generated.
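As an illustration of the first approach (encoding the information in the cookie and protecting its integrity), the following Python sketch packs the four items listed above into a fixed 7-byte field. Everything specific here is an assumption made for the example and is not taken from this document: the field layout INFO_FMT, the one-byte secret version, the use of SHA-256 as the hash, and the function names make_cookie and parse_cookie.

```python
import hashlib
import hmac
import struct
import time
from typing import Dict, Optional, Tuple

# Hypothetical layout: puzzle-given flag, difficulty, consecutive puzzles,
# 32-bit timestamp (network byte order, no padding: 7 bytes in total).
INFO_FMT = "!BBBI"
DIGEST_LEN = hashlib.sha256().digest_size


def make_cookie(secret_version: int, secret: bytes, ni: bytes, ipi: bytes,
                spii: bytes, puzzle_given: bool, difficulty: int,
                consecutive: int, now: Optional[int] = None) -> bytes:
    ts = int(time.time()) if now is None else now
    info = struct.pack(INFO_FMT, int(puzzle_given), difficulty, consecutive, ts)
    digest = hashlib.sha256(ni + ipi + spii + info + secret).digest()
    cookie = bytes([secret_version]) + info + digest
    assert len(cookie) <= 64  # size limit from Section 2.6 of RFC 7296
    return cookie


def parse_cookie(cookie: bytes, secrets: Dict[int, bytes], ni: bytes,
                 ipi: bytes, spii: bytes) -> Optional[Tuple[bool, int, int, int]]:
    """Return (puzzle_given, difficulty, consecutive, timestamp) or None."""
    info_len = struct.calcsize(INFO_FMT)
    if len(cookie) != 1 + info_len + DIGEST_LEN:
        return None
    secret = secrets.get(cookie[0])
    if secret is None:
        return None  # unknown or expired secret version
    info = cookie[1:1 + info_len]
    expected = hashlib.sha256(ni + ipi + spii + info + secret).digest()
    if not hmac.compare_digest(expected, cookie[1 + info_len:]):
        return None  # the information or the hash was altered
    flag, difficulty, consecutive, ts = struct.unpack(INFO_FMT, info)
    return bool(flag), difficulty, consecutive, ts
```

With these choices the cookie is 40 bytes long, which stays within the 64-byte limit.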
841 The Responder may also combine these approaches. This document 842 doesn't mandate how the Responder learns this information from a 843 cookie. 845 When selecting cookie generation algorithm implementations MUST 846 ensure that an attacker gains no or insignificant benefit from re- 847 using puzzle solutions in several requests. See Section 10 for 848 details. 850 7.1.2. Solving a Puzzle and Returning the Solution 852 If the Initiator receives a puzzle but it doesn't support puzzles, 853 then it will ignore the PUZZLE notification as an unrecognized status 854 notification (in accordance to Section 3.10.1 of [RFC7296]). The 855 Initiator MAY ignore the PUZZLE notification if it is not willing to 856 spend resources to solve the puzzle of the requested difficulty, even 857 if it supports puzzles. In both cases the Initiator acts as 858 described in Section 2.6 of [RFC7296] - it restarts the request and 859 includes the received COOKIE notification into it. The Responder 860 should be able to distinguish the situation when it just requested a 861 cookie from the situation where the puzzle was given to the 862 Initiator, but the Initiator for some reason ignored it. 864 If the received message contains a PUZZLE notification and doesn't 865 contain a COOKIE notification, then this message is malformed because 866 it requests to solve the puzzle, but doesn't provide enough 867 information to allow the puzzle to be solved. In this case the 868 Initiator MUST ignore the received message and continue to wait until 869 either a valid PUZZLE notification is received or the retransmission 870 timer fires. If it fails to receive a valid message after several 871 retransmissions of IKE_SA_INIT requests, then it means that something 872 is wrong and the IKE SA cannot be established. 874 If the Initiator supports puzzles and is ready to solve them, then it 875 tries to solve the given puzzle. 
After the puzzle is solved the 876 Initiator restarts the request and returns to the Responder the 877 puzzle solution in a new payload called a Puzzle Solution payload 878 (denoted as PS, see Section 8.2) along with the received COOKIE 879 notification. 881 HDR, N(COOKIE), [PS,] SA, KE, Ni, [V+][N+] --> 883 7.1.3. Computing a Puzzle 885 General principles of constructing puzzles in IKEv2 are described in 886 Section 4.4. They can be summarized as follows: given an unpredictable 887 string S and a pseudo-random function PRF, find N different keys Ki 888 (where i=[1..N]) for that PRF so that the result of PRF(Ki,S) has at 889 least the specified number of trailing zero bits. This specification 890 requires that the puzzle solution contains 4 different keys (i.e., 891 N=4). 893 In the IKE_SA_INIT exchange it is the cookie that plays the role of the 894 unpredictable string S. In other words, in the IKE_SA_INIT exchange the task 895 for the IKE Initiator is to find the four different, equal-sized keys 896 Ki for the agreed upon PRF such that each result of PRF(Ki,cookie) 897 where i = [1..4] has a sufficient number of trailing zero bits. Only 898 the content of the COOKIE notification is used in puzzle calculation, 899 i.e., the header of the Notify payload is not included. 901 Note that puzzles in the IKE_AUTH exchange are computed differently 902 than in the IKE_SA_INIT exchange. See Section 7.2.3 for details. 904 7.1.4. Analyzing Repeated Request 906 The received request must at least contain a COOKIE notification. 907 Otherwise it is an initial request and in this case it MUST be 908 processed according to Section 7.1. First, the cookie MUST be 909 checked for validity. If the cookie is invalid, then the request is 910 treated as initial and is processed according to Section 7.1. It is 911 RECOMMENDED that a new cookie is requested in this case.
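The puzzle construction from Section 7.1.3 can be sketched as a brute-force search on the Initiator side. The sketch below makes several assumptions that this document does not fix: HMAC-SHA-256 stands in for the PRF selected in the PUZZLE notification, "trailing zero bits" is read as the least significant bits of the PRF output taken as a big-endian integer, and candidate keys are simply 4-octet counters. The names prf, trailing_zero_bits and solve_puzzle are hypothetical.

```python
import hashlib
import hmac
from typing import List


def prf(key: bytes, s: bytes) -> bytes:
    # Stand-in PRF (an assumption); a real Initiator uses the PRF named
    # in the PUZZLE notification.
    return hmac.new(key, s, hashlib.sha256).digest()


def trailing_zero_bits(data: bytes) -> int:
    """Number of zero bits at the least significant end of data."""
    value = int.from_bytes(data, "big")
    if value == 0:
        return len(data) * 8
    return (value & -value).bit_length() - 1


def solve_puzzle(s: bytes, difficulty: int, key_len: int = 4,
                 n_keys: int = 4) -> List[bytes]:
    """Find n_keys different equal-sized keys Ki such that PRF(Ki, s)
    ends in at least `difficulty` zero bits."""
    keys: List[bytes] = []
    counter = 0
    while len(keys) < n_keys:
        if counter >= 1 << (8 * key_len):
            raise RuntimeError("key space exhausted; use longer keys")
        key = counter.to_bytes(key_len, "big")
        if trailing_zero_bits(prf(key, s)) >= difficulty:
            keys.append(key)
        counter += 1
    return keys
```

In the IKE_SA_INIT exchange s would be the content of the COOKIE notification, and the concatenation of the returned keys would form the Puzzle Solution Data.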
913 If the cookie is valid, then some important information is learned 914 from it, or from local state based on identifier of the cookie's 915 secret (see Section 7.1.1.3 for details). This information helps the 916 Responder to sort out incoming requests, giving more priority to 917 those which were created by spending more of the Initiator's 918 resources. 920 First, the Responder determines if it requested only a cookie, or 921 presented a puzzle to the Initiator. If no puzzle was given, this 922 means that at the time the Responder requested a cookie it didn't 923 detect the (D)DoS attack or the attack volume was low. In this case 924 the received request message must not contain the PS payload, and 925 this payload MUST be ignored if the message contains a PS payload for 926 any reason. Since no puzzle was given, the Responder marks the 927 request with the lowest priority since the Initiator spent little 928 resources creating it. 930 If the Responder learns from the cookie that the puzzle was given to 931 the Initiator, then it looks for the PS payload to determine whether 932 its request to solve the puzzle was honored or not. If the incoming 933 message doesn't contain a PS payload, this means that the Initiator 934 either doesn't support puzzles or doesn't want to deal with them. In 935 either case the request is marked with the lowest priority since the 936 Initiator spent little resources creating it. 938 If a PS payload is found in the message, then the Responder MUST 939 verify the puzzle solution that it contains. The solution is 940 interpreted as four different keys. The result of using each of them 941 in the PRF (as described in Section 7.1.3) must contain at least the 942 requested number of trailing zero bits. The Responder MUST check all 943 of the four returned keys. 945 If any checked result contains fewer bits than were requested, this 946 means that the Initiator spent less resources than expected by the 947 Responder. 
This request is marked with the lowest priority. 949 If the Initiator provided the solution to the puzzle satisfying the 950 requested difficulty level, or if the Responder didn't indicate any 951 particular difficulty level (by setting ZBC to zero) and the 952 Initiator was free to select any difficulty level it can afford, then 953 the priority of the request is calculated based on the following 954 considerations: 956 o The Responder MUST take the smallest number of trailing zero bits 957 among the checked results and count it as the number of zero bits 958 the Initiator solved for. 960 o The higher number of zero bits the Initiator provides, the higher 961 priority its request should receive. 963 o The more consecutive puzzles the Initiator solved, the higher 964 priority it should receive. 966 o The more time the Initiator spent solving the puzzles, the higher 967 priority it should receive. 969 After the priority of the request is determined the final decision 970 whether to serve it or not is made. 972 7.1.5. Deciding if to Serve the Request 974 The Responder decides what to do with the request based on the 975 request's priority and the Responder's current load. There are three 976 possible actions: 978 o Accept request. 980 o Reject request. 982 o Demand more work from the Initiator by giving it a new puzzle. 984 The Responder SHOULD accept an incoming request if its priority is 985 high - this means that the Initiator spent quite a lot of resources. 986 The Responder MAY also accept some low-priority requests where the 987 Initiators don't support puzzles. The percentage of accepted legacy 988 requests depends on the Responder's current load. 990 If the Initiator solved the puzzle, but didn't spend much resources 991 for it (the selected puzzle difficulty level appeared to be low and 992 the Initiator solved it quickly), then the Responder SHOULD give it 993 another puzzle. The more puzzles the Initiator solves the higher its 994 chances are to be served. 
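The verification and prioritization steps of Section 7.1.4 can be sketched as follows. This is a hedged illustration only: HMAC-SHA-256 stands in for the negotiated PRF, the bit-counting convention is an assumption, and the scoring formula in request_priority (including its weights and the 10-second cap) is entirely hypothetical, since this document leaves the exact priority calculation to implementations.

```python
import hashlib
import hmac
from typing import Optional

N_KEYS = 4  # this specification requires four keys per solution


def prf(key: bytes, s: bytes) -> bytes:
    return hmac.new(key, s, hashlib.sha256).digest()  # stand-in PRF


def trailing_zero_bits(data: bytes) -> int:
    value = int.from_bytes(data, "big")
    return len(data) * 8 if value == 0 else (value & -value).bit_length() - 1


def solved_bits(s: bytes, solution: bytes) -> Optional[int]:
    """Smallest trailing-zero-bit count over all four keys, or None if the
    Puzzle Solution Data is malformed (empty, uneven, or repeated keys)."""
    if not solution or len(solution) % N_KEYS:
        return None
    size = len(solution) // N_KEYS
    keys = [solution[i:i + size] for i in range(0, len(solution), size)]
    if len(set(keys)) != N_KEYS:
        return None
    return min(trailing_zero_bits(prf(k, s)) for k in keys)


def request_priority(puzzle_given: bool, requested_bits: int,
                     solution: Optional[bytes], s: bytes,
                     consecutive: int, seconds_spent: float) -> float:
    """0.0 is the lowest priority; larger values mean more work was done."""
    if not puzzle_given or solution is None:
        return 0.0  # cookie only, or puzzle ignored (any PS is disregarded)
    bits = solved_bits(s, solution)
    if bits is None or bits < requested_bits:
        return 0.0  # less work than requested
    # Hypothetical weighting: more bits, puzzles and time all raise priority.
    return bits + 2.0 * consecutive + min(seconds_spent, 10.0)
```

A requested_bits value of zero models the case where the Responder did not indicate a difficulty level, so any valid solution is scored by the bits it actually achieves.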
996 The details of how the Responder makes a decision for any particular 997 request are implementation dependent. The Responder can collect all 998 of the incoming requests for some short period of time, sort them out 999 based on their priority, calculate the number of available memory 1000 slots for half-open IKE SAs and then serve that number of requests 1001 from the head of the sorted list. The remainder of requests can be 1002 either discarded or responded to with new puzzle requests. 1004 Alternatively, the Responder may decide whether to accept every 1005 incoming request with some kind of lottery, taking into account its 1006 priority and the available resources. 1008 7.2. Puzzles in an IKE_AUTH Exchange 1010 Once the IKE_SA_INIT exchange is completed, the Responder has created 1011 a state and is waiting for the first message of the IKE_AUTH exchange 1012 from the Initiator. At this point the Initiator has already passed 1013 the return routability check and has proved that it has performed 1014 some work to complete IKE_SA_INIT exchange. However, the Initiator 1015 is not yet authenticated and this allows a malicious Initiator to 1016 perform an attack, described in Section 3. Unlike a DoS attack in 1017 the IKE_SA_INIT exchange, which is targeted on the Responder's memory 1018 resources, the goal of this attack is to exhaust a Responder's CPU 1019 power. The attack is performed by sending the first IKE_AUTH message 1020 containing arbitrary data. This costs nothing to the Initiator, but 1021 the Responder has to perform relatively costly operations when 1022 computing the D-H shared secret and deriving SK_* keys to be able to 1023 verify authenticity of the message. If the Responder doesn't keep 1024 the computed keys after an unsuccessful verification of the IKE_AUTH 1025 message, then the attack can be repeated several times on the same 1026 IKE SA. 1028 The Responder can use puzzles to make this attack more costly for the 1029 Initiator. 
The idea is that the Responder includes a puzzle in the 1030 IKE_SA_INIT response message and the Initiator includes a puzzle 1031 solution in the first IKE_AUTH request message outside the Encrypted 1032 payload, so that the Responder is able to verify puzzle solution 1033 before computing the D-H shared secret. 1035 The Responder constantly monitors the amount of the half-open IKE SA 1036 states that receive IKE_AUTH messages that cannot be decrypted due to 1037 integrity check failures. If the percentage of such states is high 1038 and it takes an essential fraction of Responder's computing power to 1039 calculate keys for them, then the Responder may assume that it is 1040 under attack and SHOULD use puzzles to make it harder for attackers. 1042 7.2.1. Presenting Puzzle 1044 The Responder requests the Initiator to solve a puzzle by including 1045 the PUZZLE notification in the IKE_SA_INIT response message. The 1046 Responder MUST NOT use puzzles in the IKE_AUTH exchange unless a 1047 puzzle has been previously presented and solved in the preceding 1048 IKE_SA_INIT exchange. 1050 <-- HDR, SA, KE, Nr, N(PUZZLE), [V+][N+] 1052 7.2.1.1. Selecting Puzzle Difficulty Level 1054 The difficulty level of the puzzle in the IKE_AUTH exchange should be 1055 chosen so that the Initiator would spend more time to solve the 1056 puzzle than the Responder to compute the D-H shared secret and the 1057 keys needed to decrypt and verify the IKE_AUTH request message. On 1058 the other hand, the difficulty level should not be too high, 1059 otherwise legitimate clients will experience an additional delay 1060 while establishing the IKE SA. 1062 Note, that since puzzles in the IKE_AUTH exchange are only allowed to 1063 be used if they were used in the preceding IKE_SA_INIT exchange, the 1064 Responder would be able to roughly estimate the computational power 1065 of the Initiator and select the difficulty level accordingly. 
Unlike 1066 puzzles in the IKE_SA_INIT exchange, the requested difficulty level for 1067 IKE_AUTH puzzles MUST NOT be zero. In other words, the Responder 1068 must always set a specific difficulty level and must not let the 1069 Initiator choose it on its own. 1071 7.2.1.2. Selecting the Puzzle Algorithm 1073 The algorithm for the puzzle is selected as described in 1074 Section 7.1.1.2. There is no requirement that the algorithm for the 1075 puzzle in the IKE_SA_INIT exchange be the same as the algorithm for 1076 the puzzle in the IKE_AUTH exchange; however, it is expected that in most 1077 cases they will be the same. 1079 7.2.2. Solving Puzzle and Returning the Solution 1081 If the IKE_SA_INIT regular response message (i.e. the message 1082 containing SA, KE, NONCE payloads) contains the PUZZLE notification 1083 and the Initiator supports puzzles, it MUST solve the puzzle. Note 1084 that puzzle construction in the IKE_AUTH exchange differs from the 1085 puzzle construction in the IKE_SA_INIT exchange and is described in 1086 Section 7.2.3. Once the puzzle is solved the Initiator sends the 1087 IKE_AUTH request message containing the Puzzle Solution payload. 1089 HDR, PS, SK {IDi, [CERT,] [CERTREQ,] 1090 [IDr,] AUTH, SA, TSi, TSr} --> 1092 The Puzzle Solution (PS) payload MUST be placed outside the Encrypted 1093 payload, so that the Responder is able to verify the puzzle before 1094 calculating the D-H shared secret and the SK_* keys. 1096 If IKE Fragmentation [RFC7383] is used in the IKE_AUTH exchange, then the 1097 PS payload MUST be present only in the first IKE Fragment message, in 1098 accordance with Section 2.5.3 of [RFC7383]. Note that 1099 calculation of the puzzle in the IKE_AUTH exchange doesn't depend on 1100 the content of the IKE_AUTH message (see Section 7.2.3). Thus the 1101 Initiator has to solve the puzzle only once and the solution is valid 1102 for both unfragmented and fragmented IKE messages. 1104 7.2.3.
Computing the Puzzle 1106 A puzzle in the IKE_AUTH exchange is computed differently than in the 1107 IKE_SA_INIT exchange (see Section 7.1.3). The general principle is 1108 the same; the difference is in the construction of the string S. 1109 Unlike the IKE_SA_INIT exchange, where S is the cookie, in the 1110 IKE_AUTH exchange S is a concatenation of Nr and SPIr. In other 1111 words, the task for the IKE Initiator is to find the four different keys 1112 Ki for the agreed upon PRF such that each result of PRF(Ki,Nr | SPIr) 1113 where i=[1..4] has a sufficient number of trailing zero bits. Nr is 1114 a nonce used by the Responder in the IKE_SA_INIT exchange, stripped 1115 of any headers. SPIr is the IKE Responder's SPI from the IKE header 1116 of the SA being established. 1118 7.2.4. Receiving the Puzzle Solution 1120 If the Responder requested the Initiator to solve a puzzle in the 1121 IKE_AUTH exchange, then it MUST silently discard all the IKE_AUTH 1122 request messages without the Puzzle Solution payload. 1124 Once the message containing a solution to the puzzle is received, the 1125 Responder MUST verify the solution before performing computationally 1126 intensive operations, i.e., computing the D-H shared secret and the 1127 SK_* keys. The Responder MUST verify all four of the returned keys. 1129 The Responder MUST silently discard the received message if any 1130 checked verification result is not correct (contains an insufficient 1131 number of trailing zero bits). If the Responder successfully 1132 verifies the puzzle and calculates the SK_* keys, but the message 1133 authenticity check fails, then it SHOULD save the calculated keys in 1134 the IKE SA state while waiting for the retransmissions from the 1135 Initiator. In this case the Responder may skip verification of the 1136 puzzle solution and ignore the Puzzle Solution payload in the 1137 retransmitted messages.
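The order of operations described in Sections 7.2.3 and 7.2.4 can be sketched as follows: the Responder checks the solution over S = Nr | SPIr first and only then spends CPU on the D-H computation. As before, HMAC-SHA-256 is a stand-in for the PRF from the PUZZLE notification, the bit-counting convention is an assumption, and the function names and returned strings are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def prf(key: bytes, s: bytes) -> bytes:
    return hmac.new(key, s, hashlib.sha256).digest()  # stand-in PRF


def trailing_zero_bits(data: bytes) -> int:
    value = int.from_bytes(data, "big")
    return len(data) * 8 if value == 0 else (value & -value).bit_length() - 1


def ike_auth_solution_ok(nr: bytes, spir: bytes, solution: bytes,
                         difficulty: int, n_keys: int = 4) -> bool:
    if difficulty <= 0:
        raise ValueError("IKE_AUTH puzzles require a non-zero difficulty")
    if not solution or len(solution) % n_keys:
        return False
    size = len(solution) // n_keys
    keys = {solution[i:i + size] for i in range(0, len(solution), size)}
    if len(keys) != n_keys:
        return False  # the four keys must be different
    s = nr + spir  # S is the concatenation of Nr and SPIr
    return all(trailing_zero_bits(prf(k, s)) >= difficulty for k in keys)


def next_step(nr: bytes, spir: bytes, difficulty: int,
              ps_payload: Optional[bytes], keys_cached: bool = False) -> str:
    """Decide what to do with an IKE_AUTH request when a puzzle was given."""
    if keys_cached:
        # SK_* keys were saved after an earlier verified solution, so the
        # PS payload in a retransmission may be ignored.
        return "use cached SK_* keys"
    if ps_payload is None:
        return "discard"  # no Puzzle Solution payload
    if not ike_auth_solution_ok(nr, spir, ps_payload, difficulty):
        return "discard"  # insufficient trailing zero bits
    return "compute D-H and SK_* keys"
```

The design point illustrated here is that every path that ends in "discard" costs the Responder only a few PRF evaluations.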
1139 If the Initiator uses IKE Fragmentation, then it sends all fragments 1140 of a message simultaneously. Due to packets loss and/or reordering 1141 it is possible that the Responder receives subsequent fragments 1142 before receiving the first one, that contains the PS payload. In 1143 this case the Responder MAY choose to keep the received fragments 1144 until the first fragment containing the solution to the puzzle is 1145 received. In this case the Responder SHOULD NOT try to verify 1146 authenticity of the kept fragments until the first fragment with the 1147 PS payload is received and the solution to the puzzle is verified. 1148 After successful verification of the puzzle, the Responder can then 1149 calculate the SK_* key and verify authenticity of the collected 1150 fragments. 1152 8. Payload Formats 1154 8.1. PUZZLE Notification 1156 The PUZZLE notification is used by the IKE Responder to inform the 1157 Initiator about the need to solve the puzzle. It contains the 1158 difficulty level of the puzzle and the PRF the Initiator should use. 1160 1 2 3 1161 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1162 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1163 | Next Payload |C| RESERVED | Payload Length | 1164 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1165 |Protocol ID(=0)| SPI Size (=0) | Notify Message Type | 1166 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1167 | PRF | Difficulty | 1168 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1170 o Protocol ID (1 octet) -- MUST be 0. 1172 o SPI Size (1 octet) - MUST be 0, meaning no Security Parameter 1173 Index (SPI) is present. 1175 o Notify Message Type (2 octets) -- MUST be , the value 1176 assigned for the PUZZLE notification. 1178 o PRF (2 octets) -- Transform ID of the PRF algorithm that MUST be 1179 used to solve the puzzle. 
Readers should refer to the section 1180 "Transform Type 2 - Pseudo-Random Function Transform IDs" in 1181 [IKEV2-IANA] for the list of possible values. 1183 o Difficulty (1 octet) -- Difficulty Level of the puzzle. Specifies 1184 the minimum number of trailing zero bits (ZBC) that each of the 1185 results of PRF must contain. Value 0 means that the Responder 1186 doesn't request any specific difficulty level and the Initiator is 1187 free to select an appropriate difficulty level on its own (see 1188 Section 7.1.1.1 for details). 1190 This notification contains no data. 1192 8.2. Puzzle Solution Payload 1194 The solution to the puzzle is returned to the Responder in a 1195 dedicated payload, called the Puzzle Solution payload and denoted as 1196 PS in this document. 1198 1 2 3 1199 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1200 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1201 | Next Payload |C| RESERVED | Payload Length | 1202 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1203 | | 1204 ~ Puzzle Solution Data ~ 1205 | | 1206 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1208 o Puzzle Solution Data (variable length) -- Contains the solution to 1209 the puzzle - four different keys for the selected PRF. This field 1210 MUST NOT be empty. All of the keys MUST have the same size, 1211 therefore the size of this field is always a multiple of 4 bytes. 1212 If the selected PRF accepts only fixed-size keys, then the size of 1213 each key MUST be of that fixed size. If the agreed upon PRF 1214 accepts keys of any size, then the size of each key MUST be 1215 between 1 octet and the preferred key length of the PRF 1216 (inclusive). It is expected that in most cases the keys will be 4 1217 (or even fewer) octets in length; however, it depends on puzzle 1218 difficulty and on the Initiator's strategy to find solutions, and 1219 thus the size is not mandated by this specification.
The 1220 Responder determines the size of each key by dividing the size of 1221 the Puzzle Solution Data by 4 (the number of keys). Note that the 1222 size of Puzzle Solution Data is the size of Payload (as indicated 1223 in Payload Length field) minus 4 - the size of Payload Header. 1225 The payload type for the Puzzle Solution payload is . 1227 9. Operational Considerations 1229 The puzzle difficulty level should be set by balancing the 1230 requirement to minimize the latency for legitimate Initiators with 1231 making things difficult for attackers. A good rule of thumb is for 1232 the puzzle to take about 1 second to solve. A typical Initiator or 1233 botnet member at the time this document is written can perform 1234 slightly less than a million hashes per second per core, so setting 1235 the number of zero bits to 20 is a good compromise. It should be 1236 noted that mobile Initiators, especially phones, are considerably 1237 weaker than that. Implementations should allow administrators to set 1238 the difficulty level, and/or be able to set the difficulty level 1239 dynamically in response to load. 1241 Initiators SHOULD set a maximum difficulty level beyond which they 1242 won't try to solve the puzzle and log or display a failure message to 1243 the administrator or user. 1245 Until the widespread adoption of puzzles happens, most Initiators 1246 will ignore them, as will all attackers. For puzzles to become a 1247 really powerful defense measure against DDoS attacks they must be 1248 supported by the majority of legitimate clients. 1250 10. Security Considerations 1252 Care must be taken when selecting parameters for the puzzles, in 1253 particular the puzzle difficulty. If the puzzles are too easy for 1254 the majority of attackers, then the puzzle mechanism wouldn't be able 1255 to prevent (D)DoS attacks and would only impose an additional burden 1256 on legitimate Initiators.
On the other hand, if the puzzles are too hard for the majority of
Initiators, then many legitimate users would experience unacceptable
delays in IKE SA setup (and unacceptable power consumption on mobile
devices) that might cause them to cancel the connection attempt. In
this case the resources of the Responder are preserved; however, the
DoS attack can be considered successful. Thus the Responder should
keep a sensible balance while choosing the puzzle difficulty: to
defend itself and to not over-defend itself. It is RECOMMENDED that
the puzzle difficulty be chosen so that the Responder's load remains
close to the maximum it can tolerate. It is also RECOMMENDED to
dynamically adjust the puzzle difficulty in accordance with the
Responder's current load.

If the cookie is generated as suggested in Section 2.6 of [RFC7296],
then an attacker can use the same SPIi and the same Ni for several
requests from the same IPi. This will result in generating the same
cookies for these requests until the Responder changes the value of
its cookie generation secret. Since the cookies are used as input
data for puzzles in the IKE_SA_INIT exchange, generating the same
cookies allows the attacker to reuse a puzzle solution, thus
bypassing the proof-of-work requirement. Note that the attacker can
get only limited benefit from this situation: once the half-open SA
is created by the Responder, all subsequent initial requests with the
same IPi and SPIi will be treated as retransmissions and discarded by
the Responder. However, once this half-open SA has expired and been
deleted, the attacker can create a new one for free if the Responder
hasn't changed its cookie generation secret yet.

The Responder can use various countermeasures to completely eliminate
or mitigate this scenario.
First, the Responder can change its cookie generation secret
frequently, especially if under attack, as recommended in Section 2.6
of [RFC7296]. For example, if the Responder keeps two values of the
secret (current and previous) and the secret lifetime is no more than
half of the current half-open SA retention time (see Section 4.1),
then the attacker cannot benefit from reusing a puzzle solution.
However, a short cookie generation secret lifetime could have
negative consequences for weak legitimate Initiators, since it could
take too long for them to solve puzzles, and their solutions would be
discarded if the cookie generation secret has already been changed a
few times.

Another approach for the Responder is to modify the cookie generation
algorithm in such a way that the generated cookies are always
different or are repeated only within a short time period. If the
Responder includes a timestamp in the as suggested in
Section 7.1.1.3, then the cookies will repeat only within a short
time interval equal to the timestamp resolution. Another approach for
the Responder is to maintain a global counter that is incremented
every time a cookie is generated and include this counter in the
. This will make every cookie unique.

Implementations MUST use one of the above (or some other)
countermeasures to completely eliminate or make insignificant the
possible benefit an attacker can get from reusing puzzle solutions.
Note that this issue doesn't exist in IKE_AUTH puzzles (Section 7.2),
since the puzzles in IKE_AUTH are always unique if the Responder
generates SPIr and Nr randomly in accordance with [RFC7296].

Solving puzzles requires a lot of CPU power, which increases power
consumption. This additional power consumption can negatively affect
battery-powered Initiators, e.g. mobile phones or some IoT devices.
If puzzles are too hard, then the required additional power
consumption may be unacceptable for some Initiators. The Responder
SHOULD take this possibility into consideration while choosing the
puzzle difficulty and while selecting the percentage of Initiators
that are allowed to refuse to solve puzzles. See Section 7.1.4 for
details.

If the Initiator uses NULL Authentication [RFC7619], then its identity
is never verified. This condition may be used by attackers to
perform a DoS attack after the IKE SA is established. Responders
that allow unauthenticated Initiators to connect must be prepared to
deal with various kinds of DoS attacks even after the IKE SA is
created. See Section 5 for details.

To prevent amplification attacks, implementations must strictly follow
the retransmission rules described in Section 2.1 of [RFC7296].

11.  IANA Considerations

This document defines a new payload in the "IKEv2 Payload Types"
registry:

   Puzzle Solution    PS

This document also defines a new Notify Message Type in the "IKEv2
Notify Message Types - Status Types" registry:

   PUZZLE

12.  Acknowledgements

The authors thank Tero Kivinen, Yaron Sheffer, and Scott Fluhrer for
their contributions to the design of the protocol. In particular,
Tero Kivinen suggested the kind of puzzle where the task is to find a
solution with a requested number of trailing zero bits. Yaron
Sheffer and Scott Fluhrer suggested a way to make puzzle difficulty
less erratic by solving several weaker puzzles. The authors also
thank David Waltermire and Paul Wouters for their careful reviews of
the document, Graham Bartlett for pointing out the possibility of
the "Hash & URL" related attack, Stephen Farrell for catching the
repeated cookie issue, and all others who commented on the document.

13.  References

13.1.
Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119,
           DOI 10.17487/RFC2119, March 1997.

[RFC5723]  Sheffer, Y. and H. Tschofenig, "Internet Key Exchange
           Protocol Version 2 (IKEv2) Session Resumption", RFC 5723,
           DOI 10.17487/RFC5723, January 2010.

[RFC7296]  Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T.
           Kivinen, "Internet Key Exchange Protocol Version 2
           (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296,
           October 2014.

[RFC7383]  Smyslov, V., "Internet Key Exchange Protocol Version 2
           (IKEv2) Message Fragmentation", RFC 7383,
           DOI 10.17487/RFC7383, November 2014.

[IKEV2-IANA]
           "Internet Key Exchange Version 2 (IKEv2) Parameters".

13.2.  Informative References

[bitcoins]
           Nakamoto, S., "Bitcoin: A Peer-to-Peer Electronic Cash
           System", October 2008.

[RFC7619]  Smyslov, V. and P. Wouters, "The NULL Authentication
           Method in the Internet Key Exchange Protocol Version 2
           (IKEv2)", RFC 7619, DOI 10.17487/RFC7619, August 2015.

[RFC7696]  Housley, R., "Guidelines for Cryptographic Algorithm
           Agility and Selecting Mandatory-to-Implement Algorithms",
           BCP 201, RFC 7696, DOI 10.17487/RFC7696, November 2015.

Authors' Addresses

Yoav Nir
Check Point Software Technologies Ltd.
5 Hasolelim st.
Tel Aviv 6789735
Israel

EMail: ynir.ietf@gmail.com

Valery Smyslov
ELVIS-PLUS
PO Box 81
Moscow (Zelenograd) 124460
Russian Federation

Phone: +7 495 276 0211
EMail: svan@elvis.ru