Implementation Guidance for the PKCS #1 RSA Cryptography Specification

Internet-Draft	RSA Implementation Guidance	November 2023
Kario	Expires 25 May 2024	[Page]

Abstract

This document specifies additions and amendments to RFC 8017. Specifically, it provides guidance to implementers of the standard to protect against side-channel attacks. It also deprecates the RSAES-PKCS-v1_5 encryption scheme, but provides an alternative depadding algorithm that protects against side-channel attacks raising from users of vulnerable APIs. The purpose of this specification is to increase security of RSA implementations.¶

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶

This Internet-Draft will expire on 25 May 2024.¶

Copyright Notice

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶

1. Introduction

The PKCS #1 [RFC8017] describes the RSA cryptosystem, providing guidance on implementing encryption schemes and signature schemes.¶

Unfortunately, straight-forward implementation of the RSA encryption schemes leave it vulnerable to side-channel attacks. Protections against them are not documented in RFC 8017, and attacks are mentioned only in passing.¶

2. Rationale

The RSAES-PKCS-v1_5 encryption scheme is known to be problematic since 1998, when Daniel Bleichenbacher published his attack [Bleichenbacher98]. Side-channel attacks against public key algorithms, including RSA, are known to be possible since 1996 thanks to work by Paul Kocher [Kocher96].¶

Despite those results, side-channel attacks against RSA implementations have proliferated for the next 25 years. Including attacks against simple exponentiation implementations [Dhem98][Schindler01], implementations that use the Chinese Remainder Theorem optimisation [Schindler00][Brumley03] [Aciicmez05], and implementations that use either base or exponent blinding exclusively [Aciicmez07][Aciicmez08] [Schindler14].¶

Similarly, side-channel free handling of the errors from the RSAES-PKCS-v1_5 decryption operation is something that implementations struggle with [Bock18][Kario23].¶

We thus provide guidance how to implement those algorithms in a way that should be secure against at least the simple timing side channel attacks.¶

3. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶

4. Notation

In this document we reuse the notation from RFC 8017, in addition, we define the following:¶

AL

alternative message length, non-negative integer, 0 <= AL <= k - 11¶

AM

alternative encoded message, an octet string¶

bb

base blinding factor, a positive integer¶

bbInv

base un-blinding factor, a positive integer,¶

  bbInv = bb^(-1) mod n

D

octet string representation of d¶

DH

an octet string of a SHA-256 hash of D¶

KDK

an octet string containing a Key Derivation Key for a specific ciphertext C¶

l

length in octets of the message M¶

b_i

an exponent blinding factor for i-th prime, non-negative integer.¶

5. Side channel attacks

Cryptographic implementations may provide a lot of indirect signals to the attacker that includes information about the secret processed data. Depending on type of information, those leaks can be used to decrypt data or retrieve private keys. Most common side-channels that leak information about secret data are:¶

Different errors returned¶
Different processing times of operations¶
Different patterns of jump instructions and memory accesses¶
Use of hardware instructions that take different amount time to execute depending on operands or result¶

Some of those leaks may be detectable over the network, while others may require closer access to the attacked system. With closer access, the attacker may be able to measure power usage, electromagnetic emanations, or sounds and correlate them with specific bits of secret information.¶

Recent research into network based side channel detection has shown that even very small side channels (of just few clock cycles) can be reliably detected over the network. The detectability depends on the sample size the attacker is able to collect, not on size of the side-channel.¶

6. General recommendations

As a general rule, all operations that process secret information (be it parts of the private key or parts of encrypted message) MUST be performed with code that doesn't have secret data dependent branch instructions, secret data dependent memory accesses, or uses non-constant time machine instructions (which ones are those is architecture dependant, but division is commonly non-constant time).¶

Special care should be placed around the code that handles the conversion of the numerical representation to the octet string representation in RSA decryption operations.¶

All operations that use private keys SHOULD additionally employ both base blinding and exponent blinding as protections against leaks inside modular exponentiation code.¶

7. Side-channel free modular exponentiation

The underlying modular exponentiation algorithm MUST be constant time with regards to the exponent in all uses of the private key.¶

For private key decryption the modular exponentiation algorithm MUST be constant time with regards to the output of the exponentiation.¶

In case the Chinese remainder theorem optimisation is used the modular exponentiation algorithm must also be constant time with regards to the used moduli.¶

7.1. General recommendations

It's especially important to make sure that all values that are secret to the attacker are stored in memory buffers that have sizes determined by the public modulus.¶

For example, the private exponents should be stored in memory buffers that have sizes determined by the public modulus value, not the numerical values of the exponents themselves.¶

Similarly, the size of the output buffer for multiplication should always be equal to the sum of buffer sizes of multiplicands. The output size of the modular reduction operation should similarly be equal to the size of the modulus and not depend on bit size of the output.¶

7.2. Montgomery ladder

For the modular exponentiation algorithm to be side-channel free every step of the calculation MUST NOT depend on the bits of the exponent. In particular, use of simple square and multiply algorithm will leak information about bits of the exponent through lack of multiplication operation in individual exponentiation steps.¶

The recommended workaround against it, is the use of the Montgomery ladder construction.¶

While that approach ensures that both the square and multiply operations are performed, the fact that the results of them are placed in different memory locations based on bits of the secret exponent will provide enough information for an attacker to recover the bits of the exponent. To counteract it, the implementation should ensure that both memory locations are accessed and updated on every step.¶

7.3. Montgomery reduction in multiplication

As multiplication operations quickly make the intermediate values in modular exponentiation large, performing a modular reduction after every multiplication or squaring operation is a common optimisation.¶

To further optimise the modular reduction, the Montgomery modular multiplication is used for performing the combined multiply-and-reduce operation. The last step of that operation is conditional on the value of the output. A side-channel free implementation should perfom the subtraction in all cases and then copy the result or the first operand of the subtraction based on sign of the result of the subtraction in side-channel free manner.¶

8. Base blinding

As a protection against multiple attacks, it's RECOMMENDED to perform all operations involving the private key with the use of blinding [Kocher96].¶

It should be noted that for decryption operations the unblinding operation MUST be performed using side-channel free code that does not leak information about the result of this multiplication and reduction modulo operation.¶

To implement base blinding, select a number bb uniformly at random such that it is relatively prime to n and smaller than n.¶

Compute multiplicative inverse of bb modulo n.¶

  bbInv = bb^(-1) mod n

In the RSADP() operation, after performing step 1, multiply c by bb mod n. Use the result as new c for all the remaining operations.¶

Before returning the value m in step 3, multiply it by bbInv mod n.¶

Note: multiplication by bbInv and reduction modulo n MUST be performed using side-channel free code with respect to value m.¶

As calculating multiplicative inverse is expensive, implementations MAY calculate new values of bb and bbInv by squaring them:¶

  new bb = bb^2 mod n
  new bbInv = bbInv^2 mod n

A given pair of blinding factors (bb, bbInv) MUST NOT be used for more than one RSADP() operation.¶

Unless the multiplication (squaring) and reduction modulo operations are verified to be side-channel free, it's RECOMMENDED to generate completely new blinding parameters every few hundred private key operations.¶

9. Exponent blinding

To further protect against private key leaks, it's RECOMMENDED to perform the blinding of the used exponents [Kocher96].¶

When performing the RSADP() operation, the blinding depends on the form of the private key.¶

If the key is in the first form, the pair (n, d), then the exponent d should be modified by adding a multiple of Euler phi(n): m = c^(d + b*phi(n)) mod n. Where b is a 64 bit long uniform random number.¶

A new value b MUST be selected for every RSADP() operation.¶

If the key is the second form, the quintuple (p, q, dP, dQ, qInv) with optional sequence of triplets (r_i, d_i, t_i), i = 3, ..., u, then each exponent used MUST be blinded individually.¶

The m_1 = c^(dP + b_1 * phi(p)) mod p¶
The m_2 = c^(dQ + b_2 * phi(q)) mod q¶
If u > 3, then m_i = c^(d_i + b_i * phi(r_i)) mod (r_i)¶

Where b_1, b_2, ..., b_i are all uniformly selected random numbers at least 64 bits long (or at least 2 machine word sizes, whichever is greater).¶

As Euler phi(p) for an argument p that's a prime is equal p - 1, it's simple to calculate in this case.¶

Note: the selection of random b_i values, multiplication of them by the result of phi() function, and addition to the exponent MUST be performed with side-channel free code.¶

Use of smaller blinding factor is NOT RECOMMENDED, as values shorter than 64 bits have been shown to still be vulnerable to side-channel attacks[Bauer12][Schindler11].¶

The b_1, b_2, ..., b_i factors MUST NOT be reused for multiple RSADP() operations.¶

10. Depadding

In case of RSA-OAEP, the padding is self-verifying, thus the depadding operation needs to follow the standard algorithm to provide a safe API to users.¶

It MUST ignore the value of the very fist octet of padding and process the remaining bytes as if it was equal zero.¶

The RSAES-PKCS-v1_5 encryption scheme is considered deprecated, and should be used only to process legacy data. It MUST NOT be used as part of online protocols or API endpoints.¶

For implementations that can't remove support for this padding mode it's RECOMMENDED to implement an implicit rejection mechanism that completely hides from the calling code whether the padding check failed or not.¶

It should be noted that the algorithm MUST be implemented as stated, otherwise in case of heteregonous environments where two implementations use the same key but implement the implicit rejection differently, it may be possible for the attacker to compare behaviour between the implementations to guess if the padding check failed or not.¶

The basic idea of the implicit rejection is to prepare a random but deterministic message to be returned in case the standard RSAES-PKCS-v1_5 padding checks fail. To do that, use the private key and the provided ciphertext to derive a static, but unknown to the attacker, random value. It's a combination of the method documented in the TLS 1.2 (RFC 5246[RFC5246]) and the deterministic (EC)DSA signatures (RFC 6979 [RFC6979]).¶

10.1. IRPRF

For the calculation of the random message for implicit rejection we define a Pseudo-Random Function (PRF) as follows:¶

IRPRF( KDK, label, length )¶

Input:¶

KDK the key derivation key¶

label a label making the output unique for a given KDK¶

length requested length of output in octets¶

Output: derived key, an octet string¶

Steps:¶

If KDK is not 32 octets long, or if length is larger than 8192 return error and stop.¶
The returned value is created by concatenation of subsequent calls to a SHA-256 HMAC function with the KDK as the HMAC key and following octet string as the message:¶
```
  P_i = I || label || bitLength
```
¶
Where the I is an iterator value encoded as two octet long big endian integer, label is the passed in label, and bitLength is the length times 8 (to represent number of bits of output) encoded as two octet big endian integer. The iterator is initialised to 0 on first call, and then incremented by 1 for every subsequent HMAC call.¶
The HMAC is iterated until the concatenated output is shorter than length¶
The output is the length left-most octets of the concatenated HMAC output¶

10.2. Implicit rejection

For implementations that cannot remove support for the RSAES-PKCS-v1_5 encryption scheme nor provide a usage-specific API, it's possible to implement an implicit rejection algorithm as a protection measure. It should be noted that implementing it correctly is hard, thus it's RECOMMENDED instead to disable support for RSAES-PKCS-v1_5 padding instead.¶

To implement implicit rejection, the RSAES-PKCS1-V1_5-DECRYPT from section 7.2.2 of RFC 8017 needs to be implemented as follows:¶

Length checking: If the length of the ciphertext C is not k octets (or if k < 11), output "decryption error" and stop.¶
RSA decryption:¶
1. Convert the ciphertext C to an integer ciphertext representative c:¶
```
  c = OS2IP (C).
```
  ¶
2. Apply the RSADP decryption primitive to the RSA private key (n, d) and the ciphertext representative c to produce an integer message representative m:¶
```
  m = RSADP ((n, d), c).
```
  ¶
  Note: the RSADP MUST be constant-time with respect of message m.¶
  
  If RSADP outputs "ciphertext representative out of range" (meaning that c >= n), output "decryption error" and stop.¶
3. Convert the message representative m to an encoded message EM of length k octets:¶
```
  EM = I2OSP (m, k).
```
  ¶
  Note: I2OSP MUST be constant-time with respect of m.¶
Derivation of alternative message¶
1. Derive the Key Derivation Key (KDK)¶
  1. Convert the private expoent d to a string of length k octets:¶
```
  D = I2OSP (d, k).
```
    ¶
  2. Hash the private exponent using the SHA-256 algorithm:¶
```
  DH = SHA256 (D).
```
    ¶
    Note: This value MAY be cached between the decryption operations, but MUST be considered private-key equivalent.¶
  3. Use the DH as the SHA-256 HMAC key and the provided ciphertext C as the message. If the ciphertext C is not k octets long, it MUST be left padded with octets of value zero.¶
```
  KDK = HMAC (DH, C, SHA256).
```
    ¶
2. Create the candidate lengths and the random message¶
  1. Use the IRPRF with key KDK, "length" as six octet label encoded with UTF-8, to generate 256 octet output. Interpret this output as 128 two octet long big-endian numbers.¶
```
  CL = IRPRF (KDK, "length", 256).
```
    ¶
  2. Use the IRPRF with key KDK, "message" as a seven octet label encoded with UTF-8 to generate k octet long output to be used as the alternative message:¶
```
  AM = IRPRF (KDK, "message", k).
```
    ¶
3. Select the alternative length for the alternative message.¶
  
  Note: this must be performed in side-channel free way.¶
  1. Iterate over the 128 candidate CL lengths. For each zero out high order bits so that they have the same bit length as the maximum valid message size (k - 11).¶
  2. Select the last length that's not larger than k - 11, use 0 if none are. Save it as AL.¶
EME-PKCS1-v1_5 decoding: Separate the encoded message EM into an octet string PS consisting of nonzero octets and a message M as¶
```
  EM = 0x00 || 0x02 || PS || 0x00 || M.
```
¶
If the first octet of EM does not have hexadecimal value 0x00, if the second octet of EM does not have hexadecimal value 0x02, if there is no octet with hexadecimal value 0x00 to separate PS from M, or if the length of PS is less than 8 octets, the check variable must remember if any of those checks failed. Irrespective of the check variable value, the code should also return length of message M: L. If there is no octet with hexadecimal value 0x00 to separate PS from M, then L should equal 0.¶

Note: All those checks MUST be performed irrespective if previous checks failed or not. A common technique for that is to have a check variable that is OR-ed with the results of subsequent checks.¶
Decision which message to return: in case the check variable is set, the code should return the last AL octets of AM, in case the check variable is unset the code should return the last L octets of EM.¶

Note: The decision which length to use MUST be performed in side-channel free manner. While the length of the returned message is not considered sensitive, the read memory location is. As such, when returning message M both EM and AM memory locations MUST be read.¶

11. Non-fixes

While there are infinite ways to implement those algorithms incorrectly few common ideas to work-around side-channel attacks are repeated. We list few of them as examples of approaches that don't work and thus MUST NOT be used.¶

11.1. Random delays

«describe that adding random delays doesn't work»¶

11.2. Returing random value from API

«describe why returning a random message, even of user-specified length is not a universal solution»¶

12. Deprecated Algorithms

Current protocol deployments MUST NOT use encryption with RSAES-PKCS-v1_5 padding. Support for RSAES-PKCS-v1_5 SHOULD be disabled in default configuration of any implementation of RSA cryptosystem. All new protocols MUST NOT specify RSAES-PKCS-v1_5 as a valid encryption padding for RSA keys.¶

[RFC8017]: Moriarty, K., Ed., Kaliski, B., Jonsson, J., and A. Rusch, "PKCS #1: RSA Cryptography Specifications Version 2.2", RFC 8017, DOI 10.17487/RFC8017, November 2016, <https://www.rfc-editor.org/info/rfc8017>.

15.2. Informative References

[Aciicmez05]: Acıiçmez, O., Schindler, W., and Ç. K. Koç, "Improving Brumley and Boneh timing attack on unprotected SSL implementations", Proceedings of the 12th ACM conference on Computer and communications security CCS '05, 2005, <https://doi.org/10.1145/1102120.1102140>.
[Aciicmez07]: Acıiçmez, O. and W. Schindler, "A Major Vulnerability in RSA Implementations due to MicroArchitectural Analysis Threat", Cryptology ePrint Archive Paper 2007/336, 2007, <https://eprint.iacr.org/2007/336>.
[Aciicmez08]: Acıiçmez, O. and W. Schindler, "A Vulnerability in RSA Implementations Due to Instruction Cache Analysis and Its Demonstration on OpenSSL", Lecture Notes in Computer Science vol 4964, 2007, <https://doi.org/10.1007/978-3-540-79263-5_16>.
[Bauer12]: Bauer, S., "Attacking Exponent Blinding in RSA without CRT", Lecture Notes in Computer Science vol 7275, 2012, <https://doi.org/10.1007/978-3-642-29912-4_7>.
[Bleichenbacher98]: Bleichenbacher, D., "Chosen Ciphertext Attacks Against Protocols Based on the RSA Encryption Standard PKCS#1", Lecture Notes in Computer Science vol 1462, <https://doi.org/10.1007/BFb0055716>.
[Bock18]: Böck, H., Somorovsky, J., and C. Young, "Return Of Bleichenbacher's Oracle Threat (ROBOT)", 27th USENIX Security Symposium USENIX Security 18, 2018, <https://www.usenix.org/conference/usenixsecurity18/presentation/bock>.
[Brumley03]: Brumley, D. and D. Boneh, "Remote timing attacks are practical", Computer Networks Volume 48, Issue 5, 2003, <https://doi.org/10.1016/j.comnet.2005.01.010>.
[Dhem98]: Dhem, J., Koeune, F., Leroux, P., Mestré, P., Quisquater, J., and J. Willems, "A Practical Implementation of the Timing Attack", Lecture Notes in Computer Science vol 1820, 1998, <https://doi.org/10.1007/10721064_15>.
[Kario23]: Kario, H., "Everlasting ROBOT: the Marvin Attack", 28th European Symposium on Research in Computer Security , 2023, <https://eprint.iacr.org/2023/1442>.
[Kocher96]: Kocher, P. C., "Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems.", Lecture Notes in Computer Science vol 1109, 1996, <https://doi.org/10.1007/3-540-68697-5_9>.
[RFC2119]: Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC5246]: Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, DOI 10.17487/RFC5246, August 2008, <https://www.rfc-editor.org/info/rfc5246>.
[RFC6979]: Pornin, T., "Deterministic Usage of the Digital Signature Algorithm (DSA) and Elliptic Curve Digital Signature Algorithm (ECDSA)", RFC 6979, DOI 10.17487/RFC6979, August 2013, <https://www.rfc-editor.org/info/rfc6979>.
[Schindler00]: Schindler, W., "A Timing Attack against RSA with the Chinese Remainder Theorem", Lecture Notes in Computer Science vol 1965, 2000, <https://doi.org/10.1007/3-540-44499-8_8>.
[Schindler01]: Schindler, W., Koeune, F., and J. Quisquater, "Improving Divide and Conquer Attacks against Cryptosystems by Better Error Detection / Correction Strategies.", Lecture Notes in Computer Science vol 2260, 2001, <https://doi.org/10.1007/3-540-45325-3_22>.
[Schindler11]: Schindler, W. and K. Itoh, "Exponent Blinding Does Not Always Lift (Partial) Spa Resistance to Higher-Level Security", Lecture Notes in Computer Science vol 6715, 2011, <https://doi.org/10.1007/978-3-642-21554-4_5>.
[Schindler14]: Schindler, W. and A. Wiemers, "Power attacks in the presence of exponent blinding.", Journal of Cryptographic Engineering 4, 2014, <https://doi.org/10.1007/s13389-014-0081-y>.

Appendix A. Test Vectors

A.1. 2048 bit key

«provide test vectors here»¶

see also: https://github.com/tlsfuzzer/tlslite-ng/blob/master/unit_tests/test_tlslite_utils_rsakey.py#L1694¶

proposed test vectors:¶

Otherwise valid, but with wrong first byte of plaintext¶

Otherwise valid, but with padding type specifying signature¶

Otherwise valid, but with PS of 7 bytes¶

Otherwise valid, but with PS of 0 bytes¶

Otherwise valid, but with the message separator byte missing¶

invalid ciphertext that decrypts to a synthehic message of maximum size¶

invalid ciphertext that decrypts to a 0-bytes long message¶

invalid ciphertext that needs to use the second-to-last synthethic length for the returned message¶

valid ciphertext that starts with a zero byte¶

Purkynova 115

61200 Brno

Czech Republic

Email: hkario@redhat.com