idnits 2.17.1 

draft-bellovin-hpw-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 11, 2012) is 4426 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 1321

  ** Downref: Normative reference to an Informational RFC: RFC 2104

  ** Obsolete normative reference: RFC 3454 (Obsoleted by RFC 7564)

  ** Downref: Normative reference to an Informational RFC: RFC 6124

  ** Downref: Normative reference to an Informational RFC: RFC 6234


     Summary: 6 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                        S. Bellovin
3	Internet-Draft                                       Columbia University
4	Intended status: Standards Track                          March 11, 2012
5	Expires: September 12, 2012

7	                        Hashed Password Exchange
8	                       draft-bellovin-hpw-01.txt

10	Abstract

12	   Many systems (e.g., cryptographic protocols relying on symmetric
13	   cryptography) require that plaintext passwords be stored.  Given how
14	   often people reuse passwords on different systems, this poses a very
15	   serious risk if a single machine is compromised.  We propose a scheme
16	   to derive passwords limited to a single machine from a typed
17	   password, and explain how a protocol definition can specify this
18	   scheme.

20	Status of This Memo

22	   This Internet-Draft is submitted in full conformance with the
23	   provisions of BCP 78 and BCP 79.

25	   Internet-Drafts are working documents of the Internet Engineering
26	   Task Force (IETF).  Note that other groups may also distribute
27	   working documents as Internet-Drafts.  The list of current Internet-
28	   Drafts is at http://datatracker.ietf.org/drafts/current/.

30	   Internet-Drafts are draft documents valid for a maximum of six months
31	   and may be updated, replaced, or obsoleted by other documents at any
32	   time.  It is inappropriate to use Internet-Drafts as reference
33	   material or to cite them other than as "work in progress."

35	   This Internet-Draft will expire on September 12, 2012.

37	Copyright Notice

39	   Copyright (c) 2012 IETF Trust and the persons identified as the
40	   document authors.  All rights reserved.

42	   This document is subject to BCP 78 and the IETF Trust's Legal
43	   Provisions Relating to IETF Documents
44	   (http://trustee.ietf.org/license-info) in effect on the date of
45	   publication of this document.  Please review these documents
46	   carefully, as they describe your rights and restrictions with respect
47	   to this document.  Code Components extracted from this document must
48	   include Simplified BSD License text as described in Section 4.e of
49	   the Trust Legal Provisions and are provided without warranty as
50	   described in the Simplified BSD License.

52	1.  Introduction

54	   Today, despite the lessons of more than 30 years [[cite Morris and
55	   Thompson]], many systems store plaintext passwords.  This is often
56	   done for good reasons, such as authenticating some cryptographic
57	   exchanges or as a convenience to users with many passwords; see, for
58	   example, the password store in many browsers or the Keychain in
59	   MacOS.  That said, this practice does pose a security risk to users,
60	   since their passwords are in danger if the system is compromised.

62	   The big problem is not compromise of the actual password used on that
63	   system; while regrettable, it is inherent in the service definition.
64	   Rather, the problem is that users tend to reuse passwords on
65	   different systems.  If a password is compromised on one machine, the
66	   user is at risk on many different systems.  Accordingly, we describe
67	   a scheme for storing a single-site-only password, derived from the
68	   user's typed password; a compromise of a service thus affects just
69	   that service.

71	   To accomplish this, we specify a "Hashed Password Exchange" standard,
72	   or rather, a metastandard.  Rather than specifying a precise way to
73	   store and use hashed passwords, we give rules for specifying hashed
74	   passwords for use in a given protocol or application.  We take
75	   advantage of the fact that unlike in 1979, when users used very dumb
76	   terminals to transmit passwords directly to the receiving
77	   applications, most passwords these days are entered into user-
78	   controlled software; these programs in turn transmit the passwords to
79	   the verifying applications.  There is thus intelligence on the user's
80	   side; we will use this to irreversibly transform the entered password
81	   into some other string.  By the same token, the receiving system must
82	   apply the same transform to the authenticator supplied at user
83	   enrollment time or password change time.  Because two independent
84	   pieces of software must apply the same transformation, the algorithm
85	   must be precisely specified in standards documents.

87	   Note that defeating guessing attacks on a captured password file is
88	   not the primary goal of this work.  That goal, though laudable,
89	   ignores changes in technology and environment since the Morris and
90	   Thompson paper; today, far more passwords are lost to keystroke
91	   loggers, phishing attacks, direct compromise of the server itself, or
92	   (as was a problem even 30+ years ago) online guessing attacks.  Our
93	   scheme helps against this last attack, in that generation of the
94	   guesses becomes more expensive; against the other threats, password
95	   strength is completely irrelevant.  We also note that today, people
96	   have very many different passwords.  It is impossible to remember
97	   large numbers of strong passwords; absent use of a password generator
98	   and manager, there *will* be reuse across different services.

100	1.1.  Requirements Notation

102	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
103	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
104	   document are to be interpreted as described in [RFC2119].

106	2.  Definitions and Goals

108	   We use the following definitions:

110	   Username  An arbitrary string, the syntax of which is application-
111	      dependent, employed by both the user and the verifying system to
112	      uniquely identify a given user.

114	   Entered Password  The authenticator typed by the user to his or her
115	      own software.  The usual quality rules (length, special
116	      characters, etc.) can be applied; that is out of the scope of this
117	      standard.

119	   Effective Password  The actual, over-the-wire, string transmitted by
120	      the user's software.

122	   Service  A particular application on a particular machine or cluster
123	      of machines appearing as a single machine.

125	   Hostname  The hostname as supplied by the user.

127	   Service URI  A URI [RFC3986] for which this effective password should
128	      be valid.  Only the scheme name, userinfo, and host name portions
129	      are discussed here; use of path information is protocol-dependent.
130	      In the userinfo field, only the username is used.  An example is
131	      given below.

133	   Our scheme has the following goals:

135	   1.  No two users of a given service should have the same effective
136	       password, even if the entered passwords are the same.

138	   2.  No two effective passwords for the same user should be the same
139	       for different services, even if the entered passwords are the
140	       same.

142	   3.  It should be infeasible to invert the hashing function to
143	       retrieve the entered password from an effective password and
144	       service URI.

146	   4.  It should be computationally expensive to mount dictionary
147	       attacks on compromised effective passwords.

149	3.  The Hashed Password Scheme

151	   Fundamentally, we calculate the effective password by iterating HMAC
152	   [RFC2104], using the entered password as the key and the service URI
153	   as the data.  This meets all four of our goals:

155	   1.  Since the username is part of the service URI, different users
156	       will have different URIs, and hence different effective
157	       passwords.

159	   2.  Since the hostname is part of the URI, different services for any
160	       given user will have different URIs, and hence different
161	       effective passwords.

163	   3.  For any reasonable underlying hash function, it is believed to be
164	       infeasible to invert HMAC; see [RFC2104] for details.  (Arguably,
165	       HMAC is overkill.  Nevertheless, it is a well-studied, well-
166	       understood mechanism for combining known plaintext with a secret
167	       key.  We see little benefit to concocting some other scheme.)

169	   4.  By iterating a sufficient number of times, dictionary attacks can
170	       be made arbitrarily expensive.  (Although guessing attacks can be
171	       made arbitrarily cheap today by use of cloud services or botnets,
172	       we prefer to look at it somewhat differently.  Whatever the
173	       resources the attacker has, his or her effective guessing rate is
174	       cut by a factor of the iteration count.)
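
   The arithmetic behind the fourth point can be sketched as follows.
   The function names and the figures used in the test are illustrative
   assumptions, not values taken from this document.

```python
def effective_guess_rate(raw_hmac_per_second: float, iteration_count: int) -> float:
    """Guesses per second once every guess costs iteration_count HMAC calls."""
    if iteration_count < 1:
        raise ValueError("iteration_count must be at least 1")
    return raw_hmac_per_second / iteration_count


def seconds_to_try(dictionary_size: int, raw_hmac_per_second: float,
                   iteration_count: int) -> float:
    """Worst-case time to try every entry in a dictionary of candidates."""
    rate = effective_guess_rate(raw_hmac_per_second, iteration_count)
    return dictionary_size / rate
```

   Whatever raw HMAC rate the attacker can buy, the time to work through
   a dictionary grows linearly with the iteration count.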

176	   We do not use a salt in this scheme.  The primary purposes of a salt
177	   are to achieve our first and second goals, which we achieve in other
178	   ways.  A salt also protects against precomputation of possible
179	   passwords of known users in anticipation of a later password file
180	   compromise.  Our use of service-, host-, and user-specific hashed
181	   passwords provides the same protection against untargeted guessing
182	   attacks; furthermore, and as noted, guessing attacks are not the
183	   primary threat today.  Since the salt must be used in calculating the
184	   effective password, it would have to be known to the user as well as
185	   the server, and users typically have multiple devices on which they
186	   enter passwords.  Using a salt would require that users know it and
187	   reenter it, which we regard as of limited benefit and highly user-
188	   hostile: people will *not* tolerate copying random strings or numbers
189	   onto multiple platforms, especially phones and the like.

191	   Usernames and the hostname portions of service URIs must be
192	   canonicalized before applying HMAC.  Legal characters in a username
193	   are upper and lower case US-ASCII letters, period, hyphen,
194	   underscore, and digits.  All other characters MUST be percent-
195	   encoded, per section 2.1 of [RFC3986].  Hostnames MUST be
196	   canonicalized per [RFC5890][RFC5891] and converted to lower case.
197	   How usernames and hostnames are entered is application- and
198	   implementation-dependent, and not part of this specification.  The
199	   hostname used is either the string users type or unambiguously
200	   derivable from it per specified rules.
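
   A minimal sketch of these canonicalization rules follows.  It
   assumes that non-ASCII username characters are encoded as UTF-8
   before percent-encoding and that hexadecimal digits are written in
   upper case; this document states neither.  The function names are
   hypothetical.  IDNA processing is deliberately left out: Python's
   built-in "idna" codec implements the older IDNA2003 rules, not
   [RFC5890][RFC5891], so the sketch rejects non-ASCII hostnames rather
   than guess.

```python
import string

# Octets the text allows to appear unencoded in a username.
_USERNAME_SAFE = frozenset(
    (string.ascii_letters + string.digits + ".-_").encode("ascii")
)


def canonical_username(username: str) -> str:
    """Percent-encode every UTF-8 octet outside the allowed set."""
    out = []
    for octet in username.encode("utf-8"):  # UTF-8 is an assumption
        if octet in _USERNAME_SAFE:
            out.append(chr(octet))
        else:
            out.append("%{:02X}".format(octet))
    return "".join(out)


def canonical_hostname(hostname: str) -> str:
    """Lower-case an ASCII hostname; IDNA conversion is not attempted."""
    if not hostname.isascii():
        raise ValueError("non-ASCII hostname: apply IDNA processing first")
    return hostname.lower()
```

   Note that a general-purpose URI quoting routine may leave additional
   characters (such as "~") unencoded, which would not match the rule
   above.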

202	   The URI scheme name is given by the protocol specification and MUST
203	   NOT be entered directly by the user.

205	   The iteration count is protocol- and use-dependent, and given in the
206	   protocol specification.

208	   The effective password, then, is calculated by iterating HMAC some
209	   number of times over the message

211	      scheme://username@hostname

213	   with the entered password as the key.
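
   One possible reading of this calculation is sketched below.  This
   document does not say how successive HMAC invocations are chained;
   the sketch assumes that the output of each round becomes the message
   of the next, with the entered password as the key throughout.  The
   function name and the SHA-512 default are likewise assumptions.

```python
import hashlib
import hmac


def effective_password(entered_password: bytes, service_uri: str,
                       iterations: int, hash_name: str = "sha512") -> bytes:
    """Iterate HMAC over the service URI, keyed by the entered password.

    entered_password is expected to be the already-preprocessed octets.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    block = service_uri.encode("ascii")  # canonical URIs are pure ASCII
    for _ in range(iterations):
        # Assumed chaining: the previous output is the next message.
        block = hmac.new(entered_password, block, hash_name).digest()
    return block
```

   Both the user's software and the verifying system would have to use
   the same chaining rule, hash function, and iteration count to arrive
   at the same string.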

215	3.1.  Examples

217	      ipsec://someuser@gw.example.net
218	      imap://someuser@mail.example.com
219	      submission://someuser@mail.example.com

221	   Note that although someuser can specify the same entered password for
222	   both 'imap' and 'submission' on mail.example.com, the effective
223	   passwords will be different.
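
   The example URIs can be run through a small demonstration.  The
   chaining rule, HMAC-SHA-512, the entered password, and the
   deliberately low iteration count are all assumptions chosen to keep
   the demonstration fast; none is specified by this document.

```python
import hashlib
import hmac


def effective_password_hex(entered_password: str, service_uri: str,
                           iterations: int = 1000) -> str:
    """Iterated HMAC-SHA-512 keyed by the entered password (assumed chaining)."""
    key = entered_password.encode("utf-8")
    block = service_uri.encode("ascii")
    for _ in range(iterations):
        block = hmac.new(key, block, hashlib.sha512).digest()
    return block.hex()


uris = [
    "ipsec://someuser@gw.example.net",
    "imap://someuser@mail.example.com",
    "submission://someuser@mail.example.com",
]

# The same made-up entered password is used for every service.
derived = {uri: effective_password_hex("correct horse", uri) for uri in uris}
for uri, value in derived.items():
    print(uri, value[:16] + "...")
```

   The three printed prefixes differ, even though the entered password
   and, for the last two URIs, the host are identical.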

225	4.  Specifying Hashed Password Exchange

227	   The following elements must be in any protocol specification that
228	   uses Hashed Password Exchange.

230	   o  The scheme name MUST be specified.  Generally, this will be taken
231	      from the IANA name assigned to the port, but this is not required.
232	      Thus, a mail submission URI (TCP port 587) might use the scheme
233	      name "submission".

235	   o  The rules for deriving the hostname from what users enter MUST be
236	      specified.  They may be as simple as "use the name the user
237	      specifies, e.g., imap.example.com", or they may account for common
238	      alternatives: "If the specified host name does not begin with
239	      'www.', prepend it; thus, both 'example.com' and 'www.example.com'
240	      would use the hostname 'www.example.com' in forming the URI."

242	   o  The iteration count MUST be specified.  The value -- typically in
243	      the hundreds of thousands with today's technology -- SHOULD be
244	      different for different services, and MAY be adjusted based on the
245	      platforms on which the calculations are typically done.  Note that
246	      the iteration is done at password change time rather than run-
247	      time, so expense is not a major concern.  (Just how long the
248	      iterations should take will depend on the protocol designers'
249	      understanding of likely platforms and usage patterns.  Something
250	      that will be run exclusively on fast devices and with stored
251	      hashed passwords should use a higher count; something where run-
252	      time user password entry on a slow device is considered likely
253	      should use a lower count.)

255	   o  To support internationalized, non-ASCII passwords, we adopt the
256	      specification text from [RFC6124].  The input password string
257	      SHOULD be processed according to the rules of the [RFC4013]
258	      profile of [RFC3454].  A password SHOULD be considered a "stored
259	      string" per [RFC3454] and unassigned code points are therefore
260	      prohibited.  The output is the binary representation of the
261	      processed UTF-8 [RFC3629] character string.  Prohibited output and
262	      unassigned code points encountered in SASLprep preprocessing
263	      SHOULD cause a preprocessing failure and the output SHOULD NOT be
264	      used.

266	   o  The hash function to be used with HMAC MUST be specified.  MD5
267	      [RFC1321] is more than sufficient; however, the tradeoff is likely
268	      to be between what code is likely to be available in
269	      implementations versus the iteration count.  SHA-512 [RFC6234] is
270	      much slower than MD5, but since the goal is constant time, this
271	      matters very little; thus, MD5 would have a higher iteration count
272	      than SHA-512 would for the same protocol.

274	   o  The encoding rules for sending the effective password over the
275	      wire are not crucial but must be specified.  The output of HMAC is
276	      an arbitrary byte string.  Given the length of typical HMAC output
277	      and the infrequency with which it is sent, transmission
278	      efficiency is not a major concern, so a simple hexadecimal
279	      encoding is fine.  Implementations MAY specify truncation;
280	      however, they SHOULD NOT use effective passwords shorter than 16
281	      octets before encoding.

283	   o  If the password is not transmitted but is used internally (e.g.,
284	      as part of a cryptographic exchange), how the effective password
285	      is used MUST be specified.  Some protocols will use it directly as
286	      a key; others will use the hexadecimal ASCII string in place of a
287	      password.

289	   o  Some protocols, such as HTTP, permit multiple hosts to appear on a
290	      single IP address.  For such protocols, the desired hostname must
291	      be transmitted prior to or along with the hashed password, to
292	      allow the host to calculate the proper hashed password value.  How
293	      this is done MUST be specified.

295	   o  If the protocol permits negotiation of authentication methods, a
296	      separate code point MUST be assigned to this scheme.
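
   Several of the items above can be gathered into a single
   per-protocol record, as in the sketch below.  The names HpwProfile
   and prepend_www are hypothetical.  Keeping the leftmost octets on
   truncation follows the convention of [RFC2104]; treating a length
   under 16 octets as an error is a design choice of this sketch, since
   the text only says SHOULD NOT.  Password preprocessing is out of
   scope here.

```python
from dataclasses import dataclass
from typing import Callable


def prepend_www(hostname: str) -> str:
    """Example rule from the text: ensure the name begins with 'www.'."""
    hostname = hostname.lower()
    return hostname if hostname.startswith("www.") else "www." + hostname


@dataclass(frozen=True)
class HpwProfile:
    scheme: str                           # e.g. "submission"
    iterations: int                       # protocol-specific count
    hash_name: str                        # hash function used with HMAC
    hostname_rule: Callable[[str], str]   # how to derive the hostname
    truncate_to: int = 0                  # octets to keep; 0 = no truncation

    def __post_init__(self):
        if self.truncate_to and self.truncate_to < 16:
            raise ValueError("effective password shorter than 16 octets")

    def service_uri(self, username: str, hostname: str) -> str:
        """Build scheme://username@hostname from canonical inputs."""
        return f"{self.scheme}://{username}@{self.hostname_rule(hostname)}"

    def encode(self, raw: bytes) -> str:
        """Hexadecimal wire encoding, after optional leftmost truncation."""
        if self.truncate_to:
            raw = raw[:self.truncate_to]
        return raw.hex()
```

   A protocol specification would fix every field of such a record, so
   that client and server software derive identical strings.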

298	   How passwords are changed -- that is, how new effective passwords are
299	   supplied to the verifying machine -- is beyond the scope of this
300	   specification.  If the entered password is sent directly at password
301	   change time, quality checks can be enforced; however, this exposes
302	   entered passwords to attackers who have compromised the verifying
303	   machine.  This is not a major risk, since the rate of password change
304	   is low.  Conversely, client-side code (e.g., Javascript) can make
305	   advisory recommendations on password strength; while the server
306	   cannot enforce this, since it will see only effective passwords, very
307	   few users will have the will and the skill to override this.

309	   If effective passwords are used only for the usual password
310	   verification and not for cryptographic purposes, they should be
311	   treated with the care used for ordinary passwords, i.e., read-
312	   protected, hashed, etc.  There is little need for extra iterations,
313	   though, since the iteration used in calculating them already provides
314	   strong protection against dictionary attacks, and it is unlikely that
315	   the extra server-side iterations will be significantly larger than
316	   the iterations already performed to comply with this specification.
317	   As before, there is no need for an additional salt.
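
   A verifying system following this advice might store a single,
   un-iterated and unsalted hash of the effective password, as sketched
   below.  The choice of SHA-256, the function names, and the
   assumption that the effective password arrives as a hexadecimal
   string are all illustrative.

```python
import hashlib
import hmac


def stored_verifier(effective_password_hex: str) -> bytes:
    """Single hash of the received effective password, for storage."""
    return hashlib.sha256(effective_password_hex.encode("ascii")).digest()


def verify(presented_hex: str, stored: bytes) -> bool:
    """Hash the presented value and compare in constant time."""
    return hmac.compare_digest(stored_verifier(presented_hex), stored)
```

   The stored value would still need to be read-protected like any
   other password file entry.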

319	5.  Related Work

321	   A number of papers have described schemes for browser-based password
322	   stores that simplify the process of having separate effective
323	   passwords for different web sites.  Many -- [[Abadi--pwdhash]]
324	   [[Halderman et al.]] -- use a cryptographic function of the domain
325	   name and a master password to calculate them.  [[Abadi-pwdhash]] has
326	   many pointers.

328	   This work differs in two important ways.  First, it applies to more
329	   services than just HTTP.  Second, it specifies how other protocol
330	   specification documents should handle the situation, independent of
331	   requirements for password strength.

333	6.  Acknowledgments

335	   A number of people made useful comments and suggestions, even if they
336	   didn't agree with all parts of this document.  They include Martin
337	   Abadi, Uri Blumenthal, Dan Harkins, Mouse, Yaron Sheffer, Joe Touch,
338	   and Sujing Zhou.

340	7.  Security Considerations

342	   To be written.

344	8.  Normative References

346	   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
347	              April 1992.

349	   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
350	              Hashing for Message Authentication", RFC 2104,
351	              February 1997.

353	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
354	              Requirement Levels", BCP 14, RFC 2119, March 1997.

356	   [RFC3454]  Hoffman, P. and M. Blanchet, "Preparation of
357	              Internationalized Strings ("stringprep")", RFC 3454,
358	              December 2002.

360	   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
361	              10646", STD 63, RFC 3629, November 2003.

363	   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
364	              Resource Identifier (URI): Generic Syntax", STD 66,
365	              RFC 3986, January 2005.

367	   [RFC4013]  Zeilenga, K., "SASLprep: Stringprep Profile for User
368	              Names and Passwords", RFC 4013, February 2005.

370	   [RFC5890]  Klensin, J., "Internationalized Domain Names for
371	              Applications (IDNA): Definitions and Document Framework",
372	              RFC 5890, August 2010.

374	   [RFC5891]  Klensin, J., "Internationalized Domain Names in
375	              Applications (IDNA): Protocol", RFC 5891, August 2010.

377	   [RFC6124]  Sheffer, Y., Zorn, G., Tschofenig, H., and S. Fluhrer, "An
378	              EAP Authentication Method Based on the Encrypted Key
379	              Exchange (EKE) Protocol", RFC 6124, February 2011.

381	   [RFC6234]  Eastlake, D. and T. Hansen, "US Secure Hash Algorithms
382	              (SHA and SHA-based HMAC and HKDF)", RFC 6234, May 2011.

384	Appendix A.  Change History

386	A.1.  Changes from -00 to -01

388	      Added more text explaining why salting isn't particularly helpful

390	      Added the requirement to transmit the hostname for some services

392	      Started a related work section

394	      Clarified the internationalization requirement

396	      Miscellaneous edits

398	Appendix B.  Open Issues

400	      How should related domains (e.g., www.amazon.com and
401	      www.amazon.co.uk) be handled, if the site wishes the same password
402	      to work on all of them?

404	      A particular case in point is the way the prefix "www." should be
405	      handled.  Should there be a general rule about the service name
406	      appearing in the hostname?

408	Author's Address

410	   S.M. Bellovin
411	   Columbia University
412	   1214 Amsterdam Avenue
413	   MC 0401
414	   New York, NY  10027
415	   US

417	   Phone: +1 212 939 7149
418	   EMail: bellovin@acm.org