Preparation and Comparison of Internationalized Strings Representing Simple User Names and User Secrets

Cisco Systems, Inc.
1899 Wynkoop Street, Suite 600
Denver, CO 80202
USA
Phone: +1-303-308-3282
Email: psaintan@cisco.com

Isode Ltd
5 Castle Business Village
36 Station Road
Hampton, Middlesex TW12 2BX
UK
Email: Alexey.Melnikov@isode.com
Applications
Precis, Username, Password, Unicode, SASLprep

This document describes how to handle Unicode strings representing simple user names and user secrets, primarily for purposes of comparison. This profile is intended to be used by Simple Authentication and Security Layer (SASL) mechanisms (such as PLAIN and SCRAM-SHA-1), as well as other protocols that exchange simple user names or user secrets. This document obsoletes RFC 4013.

The use of simple user names and user secrets in authentication and authorization is pervasive on the Internet. To increase the likelihood that the input and comparison of user names and user secrets will work in ways that make sense for typical users throughout the world, this document defines rules for preparing and comparing internationalized strings that represent simple user names and user secrets.

The algorithms defined in this document assume that all strings are composed of characters from the Unicode character set.

The algorithms are designed for use in Simple Authentication and Security Layer (SASL) mechanisms (RFC 4422), such as PLAIN (RFC 4616) and SCRAM-SHA-1 (RFC 5802). However, they might be applicable wherever simple user names or user secrets are used. This profile is not intended for use in preparing strings that are not simple user names (e.g., email addresses, DNS domain names, LDAP distinguished names), nor in cases where identifiers or secrets are not character data or require different handling (e.g., case folding).

This document builds upon the PRECIS framework, which differs fundamentally from the stringprep technology (RFC 3454) used in SASLprep (RFC 4013). The primary difference is that stringprep profiles allowed all characters except those that were explicitly disallowed, whereas PRECIS profiles disallow all characters except those that are explicitly allowed (this "inclusion model" was originally used for internationalized domain names in RFC 5891; see RFC 5894 for further discussion).
It is important to keep this distinction in mind when comparing the technology defined in this document to SASLprep (RFC 4013).

This document obsoletes RFC 4013.

Many important terms used in this document are defined in the references listed at the end of this document. The term "non-ASCII space" refers to any Unicode code point with a general category of "Zs", with the exception of U+0020 (here called "ASCII space").

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Some SASL mechanisms (e.g., CRAM-MD5, DIGEST-MD5, and SCRAM) specify that the authentication identity used in the context of such mechanisms is a "simple user name" (see Section 2 of RFC 4422 as well as RFC 4013). However, the exact form of a simple user name in any particular mechanism or deployment thereof is a local matter, and a simple user name does not necessarily map to an application identifier such as the localpart of an email address.

For purposes of preparation and comparison of authentication identities, this document specifies that a simple user name is a string of code points, encoded using UTF-8 (RFC 3629), and structured as an ordered sequence of "simpleparts" (where the complete simple user name can consist of a single simplepart or a space-separated sequence of simpleparts).

Therefore the syntax for a simple user name is defined as follows using the Augmented Backus-Naur Form (ABNF) as specified in RFC 5234.

A simple user name MUST NOT be zero bytes in length.
This rule is to be enforced after any normalization or mapping of code points.

Each simplepart of a simple user name MUST be treated as follows, where the operations specified MUST be completed in the order shown:

1. Apply Unicode Normalization Form C (NFC) to all characters.
2. Map uppercase and titlecase characters to their lowercase equivalents.
3. Optionally apply additional mappings, such as those defined in "Mapping characters for PRECIS classes".
4. Ensure that the resulting string conforms to the definition of the PRECIS NameClass.

With regard to directionality, the "Bidi Rule" provided in RFC 5893 applies.

The rules defined in the previous section differ slightly from those defined by the SASLprep specification (RFC 4013). Therefore, deployments that currently use SASLprep for handling user names will need to scrub existing data when migrating to use of the rules defined here. In particular:

o  SASLprep specified the use of Unicode Normalization Form KC (NFKC), whereas this usage of the PRECIS NameClass employs Unicode Normalization Form C (NFC). In practice this change is unlikely to cause significant problems, because NFKC provides methods for mapping Unicode code points with compatibility equivalents to those equivalents, whereas the PRECIS NameClass entirely disallows Unicode code points with compatibility equivalents. For migration purposes, deployments need to search their simple user names for Unicode code points with compatibility equivalents and map those code points to their compatibility equivalents.

o  SASLprep mapped non-ASCII spaces to ASCII space (U+0020), whereas the PRECIS NameClass entirely disallows non-ASCII spaces.
For migration purposes, deployments need to convert non-ASCII space characters to ASCII space in simple user names.

o  SASLprep mapped the "characters commonly mapped to nothing" (from Appendix B.1 of RFC 3454) to nothing, whereas the PRECIS NameClass entirely disallows such characters, which correspond to the code points from the "M" category defined under Section 6.13 of the PRECIS framework (with the exception of U+1806 MONGOLIAN TODO SOFT HYPHEN, which was commonly mapped to nothing in Unicode 3.2 but at the time of this writing is allowed by Unicode 6.1). For migration purposes, deployments need to remove code points from the PRECIS "M" category in simple user names.

o  SASLprep allowed uppercase and titlecase characters, whereas this usage of the PRECIS NameClass maps uppercase and titlecase characters to their lowercase equivalents. For migration purposes, deployments can either convert uppercase and titlecase characters to their lowercase equivalents in simple user names (thus losing the case information) or preserve uppercase and titlecase characters and ignore the case difference when comparing simple user names.

Note well that all code points and blocks not explicitly allowed in the PRECIS NameClass are disallowed; this includes private use characters, surrogate code points, and the other code points and blocks defined as "Prohibited Output" in Section 2.3 of RFC 4013.

For purposes of preparation and comparison of user secrets (i.e., passwords and passphrases), this document specifies that a user secret is a string of code points, encoded using UTF-8 (RFC 3629), and conformant to the PRECIS FreeClass.

Therefore the syntax for a user secret is defined as follows using the Augmented Backus-Naur Form (ABNF) as specified in RFC 5234.

A user secret MUST NOT be zero bytes in length.
This rule is to be enforced after any normalization or mapping of code points.

A user secret MUST be treated as follows, where the operations specified MUST be completed in the order shown:

1. Apply Unicode Normalization Form C (NFC) to all characters.
2. Map any instances of non-ASCII space to ASCII space (U+0020).
3. Ensure that the resulting string conforms to the definition of the PRECIS FreeClass.

With regard to directionality, the "Bidi Rule" provided in RFC 5893 applies.

The rules defined in the previous section differ slightly from those defined by the SASLprep specification (RFC 4013). Depending on local service policy, migration from RFC 4013 to this specification might not involve any scrubbing of data (since user secrets might not be stored in the clear anyway); however, service providers need to be aware of possible issues that might arise during migration. In particular:

o  SASLprep specified the use of Unicode Normalization Form KC (NFKC), whereas this usage of the PRECIS FreeClass employs Unicode Normalization Form C (NFC).
Because NFKC is more aggressive about finding matches than NFC, in practice this change is unlikely to cause significant problems and indeed will probably result in fewer false positives when comparing user secrets.

o  SASLprep mapped the "characters commonly mapped to nothing" (from Appendix B.1 of RFC 3454) to nothing, whereas the PRECIS FreeClass entirely disallows such characters, which correspond to the code points from the "M" category defined under Section 6.13 of the PRECIS framework (with the exception of U+1806 MONGOLIAN TODO SOFT HYPHEN, which was commonly mapped to nothing in Unicode 3.2 but at the time of this writing is allowed by Unicode 6.1).

Note well that all code points and blocks not explicitly allowed in the PRECIS FreeClass are disallowed; this includes private use characters, surrogate code points, and the other code points and blocks defined as "Prohibited Output" in Section 2.3 of RFC 4013.

We need to compare the output obtained when applying the new rules with Unicode 3.2 and Unicode 6.1 data to the output obtained when applying the SASLprep rules with Unicode 3.2 data, then make sure that the PRECIS Working Group and KITTEN Working Group are comfortable with any changes to the Unicode characters that are allowed and disallowed. (See also the migration issues described in the foregoing sections.)

The ability to include a wide range of characters in passwords and passphrases can increase the potential for creating a strong password with high entropy.
However, in practice, the ability to include such characters ought to be weighed against the possible need to reproduce them on various devices using various input methods.

The security considerations described in the PRECIS framework apply to the "NameClass" and "FreeClass" base string classes used in this document for user names and user secrets, respectively.

The security considerations described in Unicode Technical Report #39 apply to the use of Unicode characters in user names and user secrets.

The IANA shall add an entry to the PRECIS Usage Registry for reuse of the PRECIS NameClass in SASL, as follows:

   Applicability:  SASL/Kerberos.
   Base Class:  NameClass.
   Subclass:  No.
   Directionality:  The "Bidi Rule" defined in RFC 5893 applies.
   Case Mapping:  Map uppercase and titlecase code points to their lowercase equivalents.
   Normalization:  NFC.
   Specification:  RFC &rfc.number;.

The IANA shall add an entry to the PRECIS Usage Registry for reuse of the PRECIS FreeClass in SASL, as follows:

   Applicability:  SASL/Kerberos.
   Base Class:  FreeClass.
   Subclass:  No.
   Directionality:  The "Bidi Rule" defined in RFC 5893 applies.
   Case Mapping:  None.
   Normalization:  NFC.
   Specification:  RFC &rfc.number;.

Precis Framework: Handling Internationalized Strings in Protocols
Cisco, Viagenie
Application protocols using Unicode code points in protocol strings need to prepare such strings in order to perform comparison operations (e.g., for purposes of authentication or authorization). This document defines a framework enabling application protocols to handle various classes of strings in a way that depends on the properties of Unicode code points and that is agile with respect to versions of Unicode; as a result, this framework provides a more sustainable approach to the handling of internationalized strings than the previous framework, known as Stringprep (RFC 3454). A specification that reuses this framework can either directly use the base string classes or subclass the base string classes as needed.
This framework takes an approach similar to the revised internationalized domain names in applications (IDNA) technology (RFC 5890, RFC 5891, RFC 5892, RFC 5893, RFC 5894) and thus adheres to the high-level design goals described in RFC 4690, albeit for application technologies other than the Domain Name System (DNS). This document obsoletes RFC 3454.

Key words for use in RFCs to Indicate Requirement Levels
Harvard University
In many standards track documents several words are used to signify
the requirements in the specification. These words are often
capitalized. This document defines these words as they should be
interpreted in IETF documents. Authors who follow these guidelines
should incorporate this phrase near the beginning of their document:
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
RFC 2119.
Note that the force of these words is modified by the requirement
level of the document in which they are used.

UTF-8, a transformation format of ISO 10646
ISO/IEC 10646-1 defines a large character set called the Universal Character Set (UCS) which encompasses most of the world's writing systems. The originally proposed encodings of the UCS, however, were not compatible with many current applications and protocols, and this has led to the development of UTF-8, the object of this memo. UTF-8 has the characteristic of preserving the full US-ASCII range, providing compatibility with file systems, parsers and other software that rely on US-ASCII values but are transparent to other values. This memo obsoletes and replaces RFC 2279.

Augmented BNF for Syntax Specifications: ABNF
Internet technical specifications often need to define a formal syntax. Over the years, a modified version of Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been popular among many Internet specifications. The current specification documents ABNF. It balances compactness and simplicity with reasonable representational power. The differences between standard BNF and ABNF involve naming rules, repetition, alternatives, order-independence, and value ranges. This specification also supplies additional rule definitions and encoding for a core lexical analyzer of the type common to several Internet specifications. [STANDARDS-TRACK]

The Unicode Standard, Version 6.1
The Unicode Consortium

Mapping characters for PRECIS classes
Preparation and comparison of internationalized strings ("PRECIS") Framework [I-D.ietf-precis-framework] is defining several classes of strings for preparation and comparison. In the document, case mapping is defined because many of protocols handle case sensitive or case insensitive string comparison and therefore preparation of string is mandatory.
As described in IDNA mapping [RFC5895] and PRECIS problem statement [I-D.ietf-precis-problem-statement], mappings in internationalized strings are not limited to case, but also width, delimiters and/or other specials are taken into consideration. This document considers mappings other than case mapping in PRECIS context.

Preparation of Internationalized Strings ("stringprep")

SASLprep: Stringprep Profile for User Names and Passwords
This document describes how to prepare Unicode strings representing user names and passwords for comparison. The document defines the "SASLprep" profile of the "stringprep" algorithm to be used for both user names and passwords. This profile is intended to be used by Simple Authentication and Security Layer (SASL) mechanisms (such as PLAIN, CRAM-MD5, and DIGEST-MD5), as well as other protocols exchanging simple user names and/or passwords. [STANDARDS-TRACK]

Simple Authentication and Security Layer (SASL)
The Simple Authentication and Security Layer (SASL) is a framework for providing authentication and data security services in connection-oriented protocols via replaceable mechanisms. It provides a structured interface between protocols and mechanisms. The resulting framework allows new protocols to reuse existing mechanisms and allows old protocols to make use of new mechanisms. The framework also provides a protocol for securing subsequent protocol exchanges within a data security layer.
This document describes how a SASL mechanism is structured, describes how protocols include support for SASL, and defines the protocol for carrying a data security layer over a connection. In addition, this document defines one SASL mechanism, the EXTERNAL mechanism.
This document obsoletes RFC 2222. [STANDARDS TRACK]

The PLAIN Simple Authentication and Security Layer (SASL) Mechanism
This document defines a simple clear-text user/password Simple Authentication and Security Layer (SASL) mechanism called the PLAIN mechanism.
The PLAIN mechanism is intended to be used, in combination with data confidentiality services provided by a lower layer, in protocols that lack a simple password authentication command. [STANDARDS-TRACK]

Salted Challenge Response Authentication Mechanism (SCRAM) SASL and GSS-API Mechanisms
The secure authentication mechanism most widely deployed and used by Internet application protocols is the transmission of clear-text passwords over a channel protected by Transport Layer Security (TLS). There are some significant security concerns with that mechanism, which could be addressed by the use of a challenge response authentication mechanism protected by TLS. Unfortunately, the challenge response mechanisms presently on the standards track all fail to meet requirements necessary for widespread deployment, and have had success only in limited use.
This specification describes a family of Simple Authentication and Security Layer (SASL; RFC 4422) authentication mechanisms called the Salted Challenge Response Authentication Mechanism (SCRAM), which addresses the security concerns and meets the deployability requirements. When used in combination with TLS or an equivalent security layer, a mechanism from this family could improve the status quo for application protocol authentication and provide a suitable choice for a mandatory-to-implement mechanism for future application protocol standards. [STANDARDS-TRACK]

Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework
This document is one of a collection that, together, describe the protocol and usage context for a revision of Internationalized Domain Names for Applications (IDNA), superseding the earlier version. It describes the document collection and provides definitions and other material that are common to the set.
[STANDARDS TRACK]

Internationalized Domain Names in Applications (IDNA): Protocol
This document is the revised protocol definition for Internationalized Domain Names (IDNs). The rationale for changes, the relationship to the older specification, and important terminology are provided in other documents. This document specifies the protocol mechanism, called Internationalized Domain Names in Applications (IDNA), for registering and looking up IDNs in a way that does not require changes to the DNS itself. IDNA is only meant for processing domain names, not free text. [STANDARDS TRACK]

Right-to-Left Scripts for Internationalized Domain Names for Applications (IDNA)
The use of right-to-left scripts in Internationalized Domain Names (IDNs) has presented several challenges. This memo provides a new Bidi rule for Internationalized Domain Names for Applications (IDNA) labels, based on the encountered problems with some scripts and some shortcomings in the 2003 IDNA Bidi criterion. [STANDARDS-TRACK]

Internationalized Domain Names for Applications (IDNA): Background, Explanation, and Rationale
Several years have passed since the original protocol for Internationalized Domain Names (IDNs) was completed and deployed. During that time, a number of issues have arisen, including the need to update the system to deal with newer versions of Unicode. Some of these issues require tuning of the existing protocols and the tables on which they depend. This document provides an overview of a revised system and provides explanatory material for its components. This document is not an Internet Standards Track specification; it is published for informational purposes.

Terminology Used in Internationalization in the IETF
This document provides a list of terms used in the IETF when discussing internationalization. The purpose is to help frame discussions of internationalization in the various areas of the IETF and to help introduce the main concepts to IETF participants.
This memo documents an Internet Best Current Practice.

Unicode Technical Report #39: Unicode Security Mechanisms
The Unicode Consortium

The following substantive modifications were made from RFC 4013.

o  A single SASLprep algorithm was replaced by two separate algorithms: one for user names and another for passwords.

o  The new preparation algorithms use PRECIS instead of a stringprep profile. The new algorithms work independently of Unicode versions.

o  As recommended in the PRECIS framework, the Unicode normalization form was changed from NFKC to NFC.

o  Some Unicode code points that were mapped to nothing in RFC 4013 are simply disallowed by PRECIS.

Thanks to Yoshiro YONEYA and Takahiro NEMOTO for their implementation feedback.

This document borrows some text from RFC 4013 and RFC 6120.
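
The ordered preparation steps that this document specifies for simple user names and user secrets can be illustrated with a short, non-normative Python sketch. It is not a conformant implementation. The function names are hypothetical, the PRECIS NameClass and FreeClass checks are replaced by a rough stand-in based on Unicode general categories and compatibility decompositions, the optional additional mappings and the "Bidi Rule" are omitted, and the treatment of empty simpleparts (rejecting them) is an assumption of this sketch.

```python
import unicodedata

ASCII_SPACE = "\u0020"

# Assumption: a crude stand-in for PRECIS class membership. It rejects only
# control, surrogate, private use, and unassigned code points; the real
# NameClass and FreeClass derivations are considerably more detailed.
_REJECTED_CATEGORIES = {"Cc", "Cs", "Co", "Cn"}


def _is_non_ascii_space(ch):
    """True for a code point of general category Zs other than U+0020."""
    return unicodedata.category(ch) == "Zs" and ch != ASCII_SPACE


def _check_code_points(s, allow_space, allow_compat):
    """Raise ValueError if s contains a code point this sketch rejects."""
    for ch in s:
        cat = unicodedata.category(ch)
        if cat in _REJECTED_CATEGORIES:
            raise ValueError("disallowed code point U+%04X" % ord(ch))
        if cat == "Zs" and not allow_space:
            raise ValueError("space U+%04X not allowed here" % ord(ch))
        # A decomposition tagged like "<compat>" marks a compatibility equivalent.
        if not allow_compat and unicodedata.decomposition(ch).startswith("<"):
            raise ValueError("compatibility code point U+%04X" % ord(ch))


def prepare_simplepart(part):
    """Steps for one simplepart, in the order given in this document."""
    s = unicodedata.normalize("NFC", part)   # 1. apply NFC
    s = s.lower()                            # 2. uppercase/titlecase -> lowercase
    # 3. optional additional mappings: omitted in this sketch
    if not s:
        raise ValueError("empty simplepart")
    _check_code_points(s, allow_space=False, allow_compat=False)  # 4. NameClass stand-in
    return s


def prepare_user_name(name):
    """Prepare a simple user name made of ASCII-space-separated simpleparts."""
    parts = [prepare_simplepart(p) for p in name.split(ASCII_SPACE)]
    return ASCII_SPACE.join(parts)


def prepare_user_secret(secret):
    """Steps for a user secret, in the order given in this document."""
    s = unicodedata.normalize("NFC", secret)                                  # 1. apply NFC
    s = "".join(ASCII_SPACE if _is_non_ascii_space(c) else c for c in s)      # 2. map spaces
    if not s:
        raise ValueError("empty user secret")
    _check_code_points(s, allow_space=True, allow_compat=True)  # 3. FreeClass stand-in
    return s
```

Under these assumptions, two simple user names would be compared by preparing both and testing the results for equality, so that "Alice" and "alice" match, whereas user secrets keep their case. A name containing U+00A0 NO-BREAK SPACE is rejected by the user name function, while the same code point in a user secret is mapped to U+0020.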