Network Working Group                                        S. Bellovin
Internet-Draft                                       Columbia University
Intended status: Standards Track                          March 11, 2012
Expires: September 12, 2012

                        Hashed Password Exchange
                       draft-bellovin-hpw-01.txt

Abstract

   Many systems (e.g., cryptographic protocols relying on symmetric
   cryptography) require that plaintext passwords be stored. Given how
   often people reuse passwords on different systems, this poses a very
   serious risk if a single machine is compromised. We propose a scheme
   to derive passwords limited to a single machine from a typed
   password, and explain how a protocol definition can specify this
   scheme.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 12, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

1. Introduction

   Today, despite the lessons of more than 30 years [[cite Morris and
   Thompson]], many systems store plaintext passwords. This is often
   done for good reasons, such as authenticating some cryptographic
   exchanges or as a convenience to users with many passwords; see, for
   example, the password store in many browsers or the Keychain in
   MacOS. That said, this practice does pose a security risk to users,
   since their passwords are in danger if the system is compromised.

   The big problem is not compromise of the actual password used on that
   system; while regrettable, it is inherent in the service definition.
   Rather, the problem is that users tend to reuse passwords on
   different systems. If a password is compromised on one machine, the
   user is at risk on many different systems. Accordingly, we describe
   a scheme for storing a single-site-only password, derived from the
   user's typed password; a compromise of a service thus affects just
   that service.

   To accomplish this, we specify a "Hashed Password Exchange" standard,
   or rather, a metastandard. Rather than specifying a precise way to
   store and use hashed passwords, we give rules for specifying hashed
   passwords for use in a given protocol or application. We take
   advantage of the fact that unlike in 1979, when users used very dumb
   terminals to transmit passwords directly to the receiving
   applications, most passwords these days are entered into
   user-controlled software; these programs in turn transmit the
   passwords to the verifying applications. There is thus intelligence
   on the user's side; we will use this to irreversibly transform the
   entered password into some other string.
   By the same token, the receiving system must
   apply the same transform to the authenticator supplied at user
   enrollment time or password change time. Because two independent
   pieces of software must apply the same transformation, the algorithm
   must be precisely specified in standards documents.

   Note that defeating guessing attacks on a captured password file is
   not the primary goal of this work. That goal, though laudable,
   ignores changes in technology and environment since the Morris and
   Thompson paper; today, far more passwords are lost to keystroke
   loggers, phishing attacks, direct compromise of the server itself, or
   (as was a problem even 30+ years ago) online guessing attacks. Our
   scheme helps against this last attack, in that generation of the
   guesses becomes more expensive; against the other threats, password
   strength is completely irrelevant. We also note that today, people
   have very many different passwords. It is impossible to remember
   large numbers of strong passwords; absent use of a password generator
   and manager, there *will* be reuse across different services.

1.1. Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Definitions and Goals

   We use the following definitions:

   Username  An arbitrary string, the syntax of which is
      application-dependent, employed by both the user and the verifying
      system to uniquely identify a given user.

   Entered Password  The authenticator typed by the user to his or her
      own software. The usual quality rules (length, special
      characters, etc.) can be applied; that is out of the scope of this
      standard.

   Effective Password  The actual, over-the-wire, string transmitted by
      the user's software.

   Service  A particular application on a particular machine or cluster
      of machines appearing as a single machine.

   Hostname  The hostname as supplied by the user.

   Service URI  A URI [RFC3986] for which this effective password should
      be valid. Only the scheme name, userinfo, and host name portions
      are discussed here; use of path information is protocol-dependent.
      In the userinfo field, only the username is used. An example is
      given below.

   Our scheme has the following goals:

   1. No two users of a given service should have the same effective
      password, even if the entered passwords are the same.

   2. No two effective passwords for the same user should be the same
      for different services, even if the entered passwords are the
      same.

   3. It should be infeasible to invert the hashing function to
      retrieve the entered password from an effective password and
      service URI.

   4. It should be computationally expensive to mount dictionary
      attacks on compromised effective passwords.

3. The Hashed Password Scheme

   Fundamentally, we calculate the effective password by iterating HMAC
   [RFC2104], using the entered password as the key and the service URI
   as the data. This meets all four of our goals:

   1. Since the username is part of the service URI, different users
      will have different URIs, and hence different effective
      passwords.

   2. Since the hostname is part of the URI, different services for any
      given user will have different URIs, and hence different
      effective passwords.

   3. For any reasonable underlying hash function, it is believed to be
      infeasible to invert HMAC; see [RFC2104] for details. (Arguably,
      HMAC is overkill. Nevertheless, it is a well-studied,
      well-understood mechanism for combining known plaintext with a
      secret key. We see little benefit to concocting some other
      scheme.)

   4. By iterating a sufficient number of times, dictionary attacks can
      be made arbitrarily expensive. (Although guessing attacks can be
      made arbitrarily cheap today by use of cloud services or botnets,
      we prefer to look at it somewhat differently. Whatever the
      resources the attacker has, his or her effective guessing rate is
      cut by a factor of the iteration count.)

   We do not use a salt in this scheme. The primary purposes of a salt
   are to achieve our first and second goals, which we achieve in other
   ways. A salt also protects against precomputation of possible
   passwords of known users in anticipation of a later password file
   compromise. Our use of service-, host-, and user-specific hashed
   passwords provides the same protection against untargeted guessing
   attacks; furthermore, and as noted, guessing attacks are not the
   primary threat today. Since the salt must be used in calculating the
   effective password, it would have to be known to the user as well as
   the server, and users typically have multiple devices on which they
   enter passwords. Using a salt would require that users know it and
   reenter it, which we regard as of limited benefit and highly
   user-hostile: people will *not* tolerate copying random strings or
   numbers onto multiple platforms, especially phones and the like.

   Usernames and the hostname portions of service URIs must be
   canonicalized before applying HMAC. Legal characters in a username
   are upper and lower case US-ASCII letters, period, hyphen,
   underscore, and digits. All other characters MUST be
   percent-encoded, per section 2.1 of [RFC3986]. Hostnames MUST be
   canonicalized per [RFC5890][RFC5891] and converted to lower case.
   How usernames and hostnames are entered is application- and
   implementation-dependent, and not part of this specification.
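
   The canonicalization and iteration described in this section can be
   sketched as follows. This is a minimal illustration and not part of
   the specification. Several details are assumptions: the function
   names are hypothetical, non-ASCII usernames are encoded as UTF-8
   before percent-encoding, hostnames are assumed to be already in
   ASCII form (IDNA processing per [RFC5890][RFC5891] is omitted), and
   the chaining rule, in which each round's HMAC output becomes the next
   round's message, is one plausible reading of "iterating HMAC" that
   this document does not spell out.

```python
import hmac
import string

# Bytes allowed unencoded in a username: US-ASCII letters, digits,
# period, hyphen, and underscore. Everything else is percent-encoded.
_USERNAME_SAFE = frozenset(
    (string.ascii_letters + string.digits + ".-_").encode("ascii")
)


def canonical_username(username: str) -> str:
    """Percent-encode every byte outside the legal username set.

    Assumption: non-ASCII input is encoded as UTF-8 first.
    """
    return "".join(
        chr(b) if b in _USERNAME_SAFE else f"%{b:02X}"
        for b in username.encode("utf-8")
    )


def canonical_hostname(hostname: str) -> str:
    """Lower-case an ASCII hostname.

    Assumption: IDNA canonicalization has already been applied, so this
    sketch rejects non-ASCII input instead of converting it.
    """
    if not hostname.isascii():
        raise ValueError("non-ASCII hostname: apply IDNA processing first")
    return hostname.lower()


def service_uri(scheme: str, username: str, hostname: str) -> str:
    """Build the service URI from scheme, username, and hostname."""
    return (
        f"{scheme}://{canonical_username(username)}"
        f"@{canonical_hostname(hostname)}"
    )


def effective_password(entered_password: bytes, uri: str,
                       iterations: int, digest_name: str = "sha512") -> bytes:
    """Iterate HMAC with the entered password as the key.

    Assumption: round 1 uses the service URI as the message, and each
    later round uses the previous round's output as the message.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    block = uri.encode("ascii")
    for _ in range(iterations):
        block = hmac.new(entered_password, block, digest_name).digest()
    return block
```

   For instance, effective_password(b"secret", service_uri("imap",
   "someuser", "mail.example.com"), 100000).hex() would yield a
   hexadecimal string suitable for transmission; the password, count,
   and hash function here are placeholders that a real protocol
   specification would fix.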
   The hostname used is either the string users type or unambiguously
   derivable from it per specified rules.

   The URI scheme name is given by the protocol specification and MUST
   NOT be entered directly by the user.

   The iteration count is protocol- and use-dependent, and given in the
   protocol specification.

   The effective password, then, is calculated by iterating HMAC some
   number of times over the message

      scheme://username@hostname

   with the entered password as the key.

3.1. Examples

      ipsec://someuser@gw.example.net
      imap://someuser@mail.example.com
      submission://someuser@mail.example.com

   Note that although someuser can specify the same entered password for
   both 'imap' and 'submission' on mail.example.com, the effective
   passwords will be different.

4. Specifying Hashed Password Exchange

   The following elements must be in any protocol specification that
   uses Hashed Password Exchange.

   o  The scheme name MUST be specified. Generally, this will be taken
      from the IANA name assigned to the port, but this is not required.
      Thus, a mail submission URI (TCP port 587) might use the scheme
      name "submission".

   o  The rules for deriving the hostname from what users enter MUST be
      specified. They may be as simple as "use the name the user
      specifies, e.g., imap.example.com", or they may account for common
      alternatives: "If the specified host name does not begin with
      'www.', prepend it"; thus, both 'example.com' and
      'www.example.com' would use the hostname 'www.example.com' in
      forming the URI.

   o  The iteration count MUST be specified. The value -- typically in
      the hundreds of thousands with today's technology -- SHOULD be
      different for different services, and MAY be adjusted based on the
      platforms on which the calculations are typically done. Note that
      the iteration is done at password change time rather than
      run-time, so expense is not a major concern.
      (Just how long the
      iterations should take will depend on the protocol designers'
      understanding of likely platforms and usage patterns. Something
      that will be run exclusively on fast devices and with stored
      hashed passwords should use a higher count; something where
      run-time user password entry on a slow device is considered likely
      should use a lower count.)

   o  To support internationalized, non-ASCII passwords, we adopt the
      specification text from [RFC6124]. The input password string
      SHOULD be processed according to the rules of the [RFC4103]
      profile of [RFC3454]. A password SHOULD be considered a "stored
      string" per [RFC3454] and unassigned code points are therefore
      prohibited. The output is the binary representation of the
      processed UTF-8 [RFC3629] character string. Prohibited output and
      unassigned code points encountered in SASLprep preprocessing
      SHOULD cause a preprocessing failure and the output SHOULD NOT be
      used.

   o  The hash function to be used with HMAC MUST be specified. MD5
      [RFC1321] is more than sufficient; however, the tradeoff is likely
      to be between what code is likely to be available in
      implementations versus the iteration count. SHA-512 [RFC6234] is
      much slower than MD5, but since the goal is constant time, this
      matters very little; thus, MD5 would have a higher iteration count
      than SHA-512 would for the same protocol.

   o  The encoding rules for sending the effective password over the
      wire are not crucial but must be specified. The output of HMAC is
      an arbitrary byte string. Given the length of typical HMAC output
      and the infrequency with which they are sent, transmission
      efficiency is not a major concern, so a simple hexadecimal
      encoding is fine. Implementations MAY specify truncation;
      however, they SHOULD NOT use effective passwords shorter than 16
      octets before encoding.

   o  If the password is not transmitted but is used internally (e.g.,
      as part of a cryptographic exchange), how the effective password
      is used MUST be specified. Some protocols will use it directly as
      a key; others will use the hexadecimal ASCII string in place of a
      password.

   o  Some protocols, such as HTTP, permit multiple hosts to appear on a
      single IP address. For such protocols, the desired hostname must
      be transmitted prior to or along with the hashed password, to
      allow the host to calculate the proper hashed password value. How
      this is done MUST be specified.

   o  If the protocol permits negotiation of authentication methods, a
      separate code point MUST be assigned to this scheme.

   How passwords are changed -- that is, how new effective passwords are
   supplied to the verifying machine -- is beyond the scope of this
   specification. If the entered password is sent directly at password
   change time, quality checks can be enforced; however, this exposes
   entered passwords to attackers who have compromised the verifying
   machine. This is not a major risk, since the rate of password change
   is low. Conversely, client-side code (e.g., JavaScript) can make
   advisory recommendations on password strength; while the server
   cannot enforce this, since it will see only effective passwords, very
   few users will have the will and the skill to override this.

   If effective passwords are used only for the usual password
   verification and not for cryptographic purposes, they should be
   treated with the care used for ordinary passwords, i.e.,
   read-protected, hashed, etc. There is little need for extra
   iterations, though, since the iteration used in calculating them
   already provides strong protection against dictionary attacks, and it
   is unlikely that the extra server-side iterations will be
   significantly larger than the iterations already performed to comply
   with this specification.
   As before, there is no need for an additional salt.

5. Related Work

   A number of papers have described schemes for browser-based password
   stores that simplify the process of having separate effective
   passwords for different web sites. Many -- [[Abadi-pwdhash]]
   [[Halderman et al.]] -- use a cryptographic function of the domain
   name and a master password to calculate them. [[Abadi-pwdhash]] has
   many pointers.

   This work differs in two important ways. First, it applies to more
   services than just HTTP. Second, it specifies how other protocol
   specification documents should handle the situation, independent of
   requirements for password strength.

6. Acknowledgments

   A number of people made useful comments and suggestions, even if they
   didn't agree with all parts of this document. They include Martin
   Abadi, Uri Blumenthal, Dan Harkins, Mouse, Yaron Sheffer, Joe Touch,
   and Sujing Zhou.

7. Security Considerations

   To be written.

8. Normative References

   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
              April 1992.

   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC:
              Keyed-Hashing for Message Authentication", RFC 2104,
              February 1997.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3454]  Hoffman, P. and M. Blanchet, "Preparation of
              Internationalized Strings ("stringprep")", RFC 3454,
              December 2002.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC4103]  Hellstrom, G. and P. Jones, "RTP Payload for Text
              Conversation", RFC 4103, June 2005.

   [RFC5890]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Definitions and Document Framework",
              RFC 5890, August 2010.

   [RFC5891]  Klensin, J., "Internationalized Domain Names in
              Applications (IDNA): Protocol", RFC 5891, August 2010.

   [RFC6124]  Sheffer, Y., Zorn, G., Tschofenig, H., and S. Fluhrer, "An
              EAP Authentication Method Based on the Encrypted Key
              Exchange (EKE) Protocol", RFC 6124, February 2011.

   [RFC6234]  Eastlake, D. and T. Hansen, "US Secure Hash Algorithms
              (SHA and SHA-based HMAC and HKDF)", RFC 6234, May 2011.

Appendix A. Change History

A.1. Changes from -00 to -01

   Added more text explaining why salting isn't particularly helpful

   Added the requirement to transmit the hostname for some services

   Started a related work section

   Clarified the internationalization requirement

   Miscellaneous edits

Appendix B. Open Issues

   How should related domains (e.g., www.amazon.com and
   www.amazon.co.uk) be handled, if the site wishes the same password
   to work on all of them?

   A particular case in point is the way the prefix "www." should be
   handled. Should there be a general rule about the service name
   appearing in the hostname?

Author's Address

   S.M. Bellovin
   Columbia University
   1214 Amsterdam Avenue
   MC 0401
   New York, NY 10027
   US

   Phone: +1 212 939 7149
   EMail: bellovin@acm.org