< draft-ietf-idnabis-protocol-15.txt   draft-ietf-idnabis-protocol-16.txt >
Network Working Group J. Klensin Network Working Group J. Klensin
Internet-Draft September 1, 2009 Internet-Draft September 13, 2009
Obsoletes: 3490, 3491 Obsoletes: 3490, 3491
(if approved) (if approved)
Updates: 3492 (if approved) Updates: 3492 (if approved)
Intended status: Standards Track Intended status: Standards Track
Expires: March 5, 2010 Expires: March 17, 2010
Internationalized Domain Names in Applications (IDNA): Protocol Internationalized Domain Names in Applications (IDNA): Protocol
draft-ietf-idnabis-protocol-15.txt draft-ietf-idnabis-protocol-16.txt
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. This document may contain material provisions of BCP 78 and BCP 79. This document may contain material
from IETF Documents or IETF Contributions published or made publicly from IETF Documents or IETF Contributions published or made publicly
available before November 10, 2008. The person(s) controlling the available before November 10, 2008. The person(s) controlling the
copyright in some of this material may not have granted the IETF copyright in some of this material may not have granted the IETF
Trust the right to allow modifications of such material outside the Trust the right to allow modifications of such material outside the
IETF Standards Process. Without obtaining an adequate license from IETF Standards Process. Without obtaining an adequate license from
skipping to change at page 1, line 45 skipping to change at page 1, line 45
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on March 5, 2010. This Internet-Draft will expire on March 17, 2010.
Copyright Notice Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info). publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
skipping to change at page 3, line 16 skipping to change at page 3, line 16
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 5 1.1. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 5
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Requirements and Applicability . . . . . . . . . . . . . . . . 6 3. Requirements and Applicability . . . . . . . . . . . . . . . . 6
3.1. Requirements . . . . . . . . . . . . . . . . . . . . . . . 6 3.1. Requirements . . . . . . . . . . . . . . . . . . . . . . . 6
3.2. Applicability . . . . . . . . . . . . . . . . . . . . . . 6 3.2. Applicability . . . . . . . . . . . . . . . . . . . . . . 6
3.2.1. DNS Resource Records . . . . . . . . . . . . . . . . . 7 3.2.1. DNS Resource Records . . . . . . . . . . . . . . . . . 7
3.2.2. Non-domain-name Data Types Stored in the DNS . . . . . 7 3.2.2. Non-domain-name Data Types Stored in the DNS . . . . . 7
4. Registration Protocol . . . . . . . . . . . . . . . . . . . . 7 4. Registration Protocol . . . . . . . . . . . . . . . . . . . . 7
4.1. Input to IDNA Registration Process . . . . . . . . . . . . 8 4.1. Input to IDNA Registration . . . . . . . . . . . . . . . . 8
4.2. Permitted Character and Label Validation . . . . . . . . . 8 4.2. Permitted Character and Label Validation . . . . . . . . . 8
4.2.1. Input Format . . . . . . . . . . . . . . . . . . . . . 8 4.2.1. Input Format . . . . . . . . . . . . . . . . . . . . . 8
4.2.2. Rejection of Characters that are not Permitted . . . . 8 4.2.2. Rejection of Characters that are not Permitted . . . . 9
4.2.3. Label Validation . . . . . . . . . . . . . . . . . . . 8 4.2.3. Label Validation . . . . . . . . . . . . . . . . . . . 9
4.2.4. Registration Validation Summary . . . . . . . . . . . 9 4.2.4. Registration Validation Summary . . . . . . . . . . . 9
4.3. Registry Restrictions . . . . . . . . . . . . . . . . . . 9 4.3. Registry Restrictions . . . . . . . . . . . . . . . . . . 10
4.4. Punycode Conversion . . . . . . . . . . . . . . . . . . . 10 4.4. Punycode Conversion . . . . . . . . . . . . . . . . . . . 10
4.5. Insertion in the Zone . . . . . . . . . . . . . . . . . . 10 4.5. Insertion in the Zone . . . . . . . . . . . . . . . . . . 10
5. Domain Name Lookup Protocol . . . . . . . . . . . . . . . . . 10 5. Domain Name Lookup Protocol . . . . . . . . . . . . . . . . . 11
5.1. Label String Input . . . . . . . . . . . . . . . . . . . . 11 5.1. Label String Input . . . . . . . . . . . . . . . . . . . . 11
5.2. Conversion to Unicode . . . . . . . . . . . . . . . . . . 11 5.2. Conversion to Unicode . . . . . . . . . . . . . . . . . . 11
5.3. A-label Input . . . . . . . . . . . . . . . . . . . . . . 11 5.3. A-label Input . . . . . . . . . . . . . . . . . . . . . . 11
5.4. Validation and Character List Testing . . . . . . . . . . 12 5.4. Validation and Character List Testing . . . . . . . . . . 12
5.5. Punycode Conversion . . . . . . . . . . . . . . . . . . . 13 5.5. Punycode Conversion . . . . . . . . . . . . . . . . . . . 13
5.6. DNS Name Resolution . . . . . . . . . . . . . . . . . . . 13 5.6. DNS Name Resolution . . . . . . . . . . . . . . . . . . . 13
6. Security Considerations . . . . . . . . . . . . . . . . . . . 13 6. Security Considerations . . . . . . . . . . . . . . . . . . . 14
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14
8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 14 8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 14
9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 14 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 14
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15
10.1. Normative References . . . . . . . . . . . . . . . . . . . 15 10.1. Normative References . . . . . . . . . . . . . . . . . . . 15
10.2. Informative References . . . . . . . . . . . . . . . . . . 16 10.2. Informative References . . . . . . . . . . . . . . . . . . 16
Appendix A. Summary of Major Changes from IDNA2003 . . . . . . . 17 Appendix A. Summary of Major Changes from IDNA2003 . . . . . . . 17
Appendix B. Change Log . . . . . . . . . . . . . . . . . . . . . 18 Appendix B. Change Log . . . . . . . . . . . . . . . . . . . . . 18
B.1. Changes between Version -00 and -01 of B.1. Changes between Version -00 and -01 of
draft-ietf-idnabis-protocol . . . . . . . . . . . . . . . 18 draft-ietf-idnabis-protocol . . . . . . . . . . . . . . . 18
skipping to change at page 4, line 8 skipping to change at page 4, line 8
B.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 19 B.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 19
B.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 20 B.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 20
B.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 20 B.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 20
B.9. Version -09 . . . . . . . . . . . . . . . . . . . . . . . 20 B.9. Version -09 . . . . . . . . . . . . . . . . . . . . . . . 20
B.10. Version -10 . . . . . . . . . . . . . . . . . . . . . . . 21 B.10. Version -10 . . . . . . . . . . . . . . . . . . . . . . . 21
B.11. Version -11 . . . . . . . . . . . . . . . . . . . . . . . 21 B.11. Version -11 . . . . . . . . . . . . . . . . . . . . . . . 21
B.12. Version -12 . . . . . . . . . . . . . . . . . . . . . . . 21 B.12. Version -12 . . . . . . . . . . . . . . . . . . . . . . . 21
B.13. Version -13 . . . . . . . . . . . . . . . . . . . . . . . 22 B.13. Version -13 . . . . . . . . . . . . . . . . . . . . . . . 22
B.14. Version -14 . . . . . . . . . . . . . . . . . . . . . . . 22 B.14. Version -14 . . . . . . . . . . . . . . . . . . . . . . . 22
B.15. Version -15 . . . . . . . . . . . . . . . . . . . . . . . 22 B.15. Version -15 . . . . . . . . . . . . . . . . . . . . . . . 22
B.16. Version -16 . . . . . . . . . . . . . . . . . . . . . . . 23
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 23 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 23
1. Introduction 1. Introduction
This document supplies the protocol definition for internationalized This document supplies the protocol definition for internationalized
domain names. Essential definitions and terminology for domain names. Essential definitions and terminology for
understanding this document and a road map of the collection of understanding this document and a road map of the collection of
documents that make up IDNA2008 appear in [IDNA2008-Defs]. documents that make up IDNA2008 appear in [IDNA2008-Defs].
Appendix A discusses the relationship between this specification and Appendix A discusses the relationship between this specification and
the earlier version of IDNA (referred to here as "IDNA2003"). The the earlier version of IDNA (referred to here as "IDNA2003"). The
rationale for these changes, along with considerable explanatory rationale for these changes, along with considerable explanatory
material and advice to zone administrators who support IDNs is material and advice to zone administrators who support IDNs is
provided in another document, [IDNA2008-Rationale]. provided in another document, [IDNA2008-Rationale].
IDNA works by allowing applications to use certain ASCII string IDNA works by allowing applications to use certain ASCII string
labels (beginning with a special prefix) to represent non-ASCII name labels (beginning with a special prefix) to represent non-ASCII name
labels. Lower-layer protocols need not be aware of this; therefore labels. Lower-layer protocols need not be aware of this; therefore
IDNA does not change any infrastructure. In particular, IDNA does IDNA does not change any infrastructure. In particular, IDNA does
not depend on any changes to DNS servers, resolvers, or protocol not depend on any changes to DNS servers, resolvers, or DNS protocol
elements, because the ASCII name service provided by the existing DNS elements, because the ASCII name service provided by the existing DNS
can be used for IDNA. can be used for IDNA.
IDNA applies only to DNS labels. The base DNS standards [RFC1034] IDNA applies only to DNS labels. The base DNS standards [RFC1034]
[RFC1035] and their various updates specify how to combine labels [RFC1035] and their various updates specify how to combine labels
into fully-qualified domain names and parse labels out of those into fully-qualified domain names and parse labels out of those
names. names.
This document describes two separate protocols, one for IDN This document describes two separate protocols, one for IDN
registration (Section 4) and one for IDN lookup (Section 5), that registration (Section 4) and one for IDN lookup (Section 5). These
share some terminology, reference data and operations. [[anchor2: two protocols share some terminology, reference data and operations.
Note in draft: See the note in the introduction to.]]Section 5
1.1. Discussion Forum 1.1. Discussion Forum
[[anchor4: RFC Editor: please remove this section.]] [[ RFC Editor: please remove this section. ]]
This work is being discussed in the IETF IDNABIS WG and on the This work is being discussed in the IETF IDNABIS WG and on the
mailing list idna-update@alvestrand.no mailing list idna-update@alvestrand.no
2. Terminology 2. Terminology
Terminology used in IDNA, but also in Unicode or other character set Terminology used as part of the definition of IDNA appears in
standards and the DNS, appears in [IDNA2008-Defs]. Terminology that [IDNA2008-Defs]. It is worth noting that some of this terminology
is required as part of the IDNA definition, including the definitions overlaps with, and is consistent with, that used in
of "ACE", appears in that document as well. Readers of this document Unicode or other character set standards and the DNS. Readers of
are assumed to be familiar with [IDNA2008-Defs] and with the DNS- this document are assumed to be familiar with [IDNA2008-Defs] and
specific terminology in RFC 1034 [RFC1034]. with the DNS-specific terminology in RFC 1034 [RFC1034].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, RFC 2119 document are to be interpreted as described in BCP 14, RFC 2119
[RFC2119]. [RFC2119].
3. Requirements and Applicability 3. Requirements and Applicability
3.1. Requirements 3.1. Requirements
skipping to change at page 6, line 30 skipping to change at page 6, line 29
2. Labels MUST be compared using equivalent forms: either both 2. Labels MUST be compared using equivalent forms: either both
A-Label forms or both U-Label forms. Because A-labels and A-Label forms or both U-Label forms. Because A-labels and
U-labels can be transformed into each other without loss of U-labels can be transformed into each other without loss of
information, these comparisons are equivalent. A pair of information, these comparisons are equivalent. A pair of
A-labels MUST be compared as case-insensitive ASCII (as with all A-labels MUST be compared as case-insensitive ASCII (as with all
comparisons of ASCII DNS labels). U-labels must be compared comparisons of ASCII DNS labels). U-labels must be compared
as-is, without case-folding or other intermediate steps. Note as-is, without case-folding or other intermediate steps. Note
that it is not necessary to validate labels in order to compare that it is not necessary to validate labels in order to compare
them and that successful comparison does not imply validity. In them and that successful comparison does not imply validity. In
many cases, validation may be important for other reasons and many cases, not limited to comparison, validation may be
SHOULD be performed. important for other reasons and SHOULD be performed.
3. Labels being registered MUST conform to the requirements of 3. Labels being registered MUST conform to the requirements of
Section 4. Labels being looked up and the lookup process MUST Section 4. Labels being looked up and the lookup process MUST
conform to the requirements of Section 5. conform to the requirements of Section 5.
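As an illustration of requirement 2 above, the comparison rule can be sketched in Python. This is a minimal sketch, not part of either draft: the names ACE_PREFIX, is_ace, and labels_match are hypothetical, and the code assumes that any string offered as an A-label is pure ASCII. It performs no validation, which matches the note that successful comparison does not imply validity.

```python
ACE_PREFIX = "xn--"  # ACE prefix named in Section 4.4


def is_ace(label: str) -> bool:
    """True if the label starts with the ACE prefix, ignoring case."""
    return label[:4].lower() == ACE_PREFIX


def labels_match(a: str, b: str) -> bool:
    """Compare two labels that are both A-labels or both U-labels."""
    if is_ace(a) != is_ace(b):
        # Mixed forms: convert one side first, then compare like with like.
        raise ValueError("compare two A-labels or two U-labels, not a mix")
    if is_ace(a):
        # A-labels: case-insensitive ASCII comparison.
        return a.lower() == b.lower()
    # U-labels: compared as-is, with no case folding or other steps.
    return a == b
```

For example, labels_match("xn--bcher-kva", "XN--BCHER-KVA") is true, while the U-labels "bücher" and "Bücher" do not match.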
3.2. Applicability 3.2. Applicability
IDNA applies to all domain names in all domain name slots in IDNA applies to all domain names in all domain name slots in
protocols except where it is explicitly excluded. It does not apply protocols except where it is explicitly excluded. It does not apply
to domain name slots which do not use the Letter/Digit/Hyphen (LDH) to domain name slots which do not use the Letter/Digit/Hyphen (LDH)
syntax rules. syntax rules.
Because it uses the DNS, IDNA applies to many protocols that were Because IDNA uses the DNS, IDNA applies to many protocols that were
specified before it was designed. IDNs occupying domain name slots specified before it was designed. IDNs occupying domain name slots
in those older protocols MUST be in A-label form until and unless in those older protocols MUST be in A-label form until and unless
those protocols and implementations of them are explicitly upgraded those protocols and their implementations are explicitly upgraded to
to be aware of IDNs in Unicode. IDNs actually appearing in DNS be aware of IDNs. IDNs actually appearing in DNS queries or
queries or responses MUST be A-labels. responses MUST be A-labels.
IDNA is not defined for extended label types (see RFC 2671, Section 3 IDNA is not defined for extended label types (see RFC 2671, Section 3
[RFC2671]). [RFC2671]).
3.2.1. DNS Resource Records 3.2.1. DNS Resource Records
IDNA applies only to domain names in the NAME and RDATA fields of DNS IDNA applies only to domain names in the NAME and RDATA fields of DNS
resource records whose CLASS is IN. See RFC 1034 [RFC1034] for resource records whose CLASS is IN. See RFC 1034 [RFC1034] for
precise definitions of these terms. precise definitions of these terms.
The application of IDNA to DNS resource records depends entirely on The application of IDNA to DNS resource records depends entirely on
the CLASS of the record, and not on the TYPE except as noted below. the CLASS of the record, and not on the TYPE except as noted below.
This will remain true, even as new types are defined, unless a new This will remain true, even as new types are defined, unless a new
type defines type-specific rules. Special naming conventions for SRV type defines type-specific rules. Special naming conventions for SRV
records (and "underscore names" more generally) are incompatible with records (and "underscore labels" more generally) are incompatible
IDNA coding. The first two labels on a SRV type record (the ones with IDNA coding as discussed in [IDNA2008-Defs], especially Section
required to start in "_") MUST NOT be A-labels or U-labels, because 2.3.2.3. Of course, underscore labels may be part of a domain that
conversion to an A-label would lose information (since the underscore uses IDN labels at higher levels in the tree.
is not a letter, digit, or hyphen and is consequently DISALLOWED in
IDNs). Of course, those labels may be part of a domain that uses IDN
labels at higher levels in the tree.
3.2.2. Non-domain-name Data Types Stored in the DNS 3.2.2. Non-domain-name Data Types Stored in the DNS
Although IDNA enables the representation of non-ASCII characters in Although IDNA enables the representation of non-ASCII characters in
domain names, that does not imply that IDNA enables the domain names, that does not imply that IDNA enables the
representation of non-ASCII characters in other data types that are representation of non-ASCII characters in other data types that are
stored in domain names, specifically in the RDATA field for types stored in domain names, specifically in the RDATA field for types
that have structured RDATA format. For example, an email address that have structured RDATA format. For example, an email address
local part is stored in a domain name in the RNAME field as part of local part is stored in a domain name in the RNAME field as part of
the RDATA of an SOA record (hostmaster@example.com would be the RDATA of an SOA record (hostmaster@example.com would be
represented as hostmaster.example.com). IDNA does not update the represented as hostmaster.example.com). IDNA does not update the
existing email standards, which allow only ASCII characters in local existing email standards, which allow only ASCII characters in local
parts. Even though work is in progress to define parts. Even though work is in progress to define
internationalization for email addresses [RFC4952], changes to the internationalization for email addresses [RFC4952], changes to the
email address part of the SOA RDATA would require action in, or email address part of the SOA RDATA would require action in, or
updates to, other standards, specifically those that specify the updates to, other standards, specifically those that specify the
format of the SOA RR. format of the SOA RR.
4. Registration Protocol 4. Registration Protocol
This section defines the procedure for registering an IDN. The This section defines the model for registering an IDN. The model is
procedure is implementation independent; any sequence of steps that implementation independent; any sequence of steps that produces
produces exactly the same result for all labels is considered a valid exactly the same result for all labels is considered a valid
implementation. implementation.
Note that, while the registration and lookup protocols (Section 5) Note that, while the registration (this section) and lookup protocols
are very similar in most respects, they are different and (Section 5) are very similar in most respects, they are not identical and
implementers should carefully follow the appropriate steps. implementers should carefully follow the steps described in this
specification.
4.1. Input to IDNA Registration Process 4.1. Input to IDNA Registration
Registration processes, especially processing by entities, such as Registration processes, especially processing by entities (often
"registrars" who deal with registrants before the request actually called "registrars") who deal with registrants before the request
reaches the zone manager ("registry") are outside the scope of these actually reaches the zone manager ("registry") are outside the scope
protocols and may differ significantly depending on local needs. By of this definition and may differ significantly depending on local
the time a string enters the IDNA registration process as described needs. By the time a string enters the IDNA registration process as
in this specification, it MUST be in Unicode and in Normalization described in this specification, it MUST be in Unicode and in
Form C (NFC [Unicode-UAX15]). Entities responsible for zone files Normalization Form C (NFC [Unicode-UAX15]). Entities responsible for
("registries") are expected to accept only the exact string for which zone files ("registries") MUST accept only the exact string for which
registration is requested, free of any mappings or local adjustments. registration is requested, free of any mappings or local adjustments.
They SHOULD avoid any possible ambiguity by accepting registrations They MAY accept that input in any of three forms:
only for A-labels, possibly paired with the relevant U-labels so that
they can verify the correspondence. 1. As a pair of A-label and U-label.
2. As an A-label only.
3. As a U-label only.
The first two of these forms are RECOMMENDED because the use of
A-labels avoids any possibility of ambiguity. The first is normally
preferred over the second because it permits further verification of
user intent (see Section 4.2.1).
4.2. Permitted Character and Label Validation 4.2. Permitted Character and Label Validation
4.2.1. Input Format 4.2.1. Input Format
The registry SHOULD permit submission of labels in A-label form and If both the U-label and A-label forms are available, the registry
is encouraged to accept both the A-label form and the U-label one. MUST ensure that the A-label form is in lower case, perform a
If both label forms are available, it MUST ensure that the A-label conversion to a U-label, perform the steps and tests described below
form is in lower case, perform a conversion to a U-label, perform the on that U-label, and then verify that the A-label produced by the
steps and tests described below on that U-label, and then verify that step in Section 4.4 matches the one provided as input. In addition,
the A-label produced by the step in Section 4.4 matches the one the U-label that was provided as input and the one obtained by
provided as input. In addition, if a U-label was provided, that conversion of the A-label MUST match exactly. If, for some reason,
U-label and the one obtained by conversion of the A-label MUST match these tests fail, the registration MUST be rejected.
exactly. If, for some reason, these tests fail, the registration
MUST be rejected. If the conversion to a U-label is not performed, If only an A-label was provided and the conversion to a U-label is
the registry MUST still verify that the A-label is superficially not performed, the registry MUST still verify that the A-label is
valid, i.e., that it does not violate any of the rules of Punycode superficially valid, i.e., that it does not violate any of the rules
[RFC3492] encoding such as the prohibition on trailing hyphen-minus, of Punycode [RFC3492] encoding such as the prohibition on trailing
appearance of non-basic characters before the delimiter, and so on. hyphen-minus, appearance of non-basic characters before the
Fake A-labels, i.e., invalid strings that appear to be A-labels but delimiter, and so on. Fake A-labels, i.e., invalid strings that
are not, MUST NOT be placed in DNS zones that support IDNA. appear to be A-labels but are not, MUST NOT be placed in DNS zones
that support IDNA.
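The cross-check described above for the case where both forms are submitted might look like the following Python sketch. It is illustrative only and rests on several assumptions: the function names are hypothetical, the "ensure lower case" step is implemented by lowercasing rather than rejecting, Python's built-in "punycode" codec stands in for the RFC 3492 algorithm, and the character and label tests of the following subsections are left as a placeholder comment.

```python
ACE_PREFIX = "xn--"


def decode_a_label(a_label: str) -> str:
    """Strip the ACE prefix and Punycode-decode the remainder."""
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


def encode_u_label(u_label: str) -> str:
    """Punycode-encode a Unicode label and prepend the ACE prefix."""
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def pair_is_consistent(a_label: str, u_label: str) -> bool:
    """Return True only if the submitted A-label and U-label agree."""
    a_lower = a_label.lower()  # assumption: normalize case, do not reject
    if not a_lower.startswith(ACE_PREFIX):
        return False
    try:
        decoded = decode_a_label(a_lower)
    except ValueError:  # UnicodeError (a ValueError) signals bad Punycode
        return False
    # Placeholder: the tests of Sections 4.2.2 and 4.2.3 would run on
    # `decoded` here; they are omitted from this sketch.
    # The submitted U-label must match exactly, and re-encoding must
    # reproduce the submitted A-label.
    return decoded == u_label and encode_u_label(decoded) == a_lower
```

A registry following this sketch would reject the registration whenever pair_is_consistent returns False.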
4.2.2. Rejection of Characters that are not Permitted 4.2.2. Rejection of Characters that are not Permitted
The candidate Unicode string MUST NOT contain characters that appear The candidate Unicode string MUST NOT contain characters that appear
in the "DISALLOWED" and "UNASSIGNED" lists specified in in the "DISALLOWED" and "UNASSIGNED" lists specified in
[IDNA2008-Tables]. [IDNA2008-Tables].
4.2.3. Label Validation 4.2.3. Label Validation
The proposed label (in the form of a Unicode string, i.e., a string The proposed label (in the form of a Unicode string, i.e., a string
skipping to change at page 10, line 6 skipping to change at page 10, line 17
if the characters they contain are valid individually, and for labels if the characters they contain are valid individually, and for labels
that do not conform to the restrictions for strings containing right that do not conform to the restrictions for strings containing right
to left characters. to left characters.
4.3. Registry Restrictions 4.3. Registry Restrictions
In addition to the rules and tests above, there are many reasons why In addition to the rules and tests above, there are many reasons why
a registry could reject a label. Registries at all levels of the a registry could reject a label. Registries at all levels of the
DNS, not just the top level, are expected to establish policies about DNS, not just the top level, are expected to establish policies about
label registrations. Policies are likely to be informed by the local label registrations. Policies are likely to be informed by the local
languages and may depend on many factors including what characters languages and the scripts that are used to write them and may depend
are in the label (for example, a label may be rejected based on other on many factors including what characters are in the label (for
labels already registered). See [IDNA2008-Rationale] for a example, a label may be rejected based on other labels already
discussion and recommendations about registry policies. registered). See [IDNA2008-Rationale] Section 3.2 for a discussion
and recommendations about registry policies.
The string produced by the steps in Section 4.2 is checked and The string produced by the steps in Section 4.2 is checked and
processed as appropriate to local registry restrictions. Application processed as appropriate to local registry restrictions. Application
of those registry restrictions may result in the rejection of some of those registry restrictions may result in the rejection of some
labels or the application of special restrictions to others. labels or the application of special restrictions to others.
4.4. Punycode Conversion 4.4. Punycode Conversion
The resulting U-label is converted to an A-label (defined in The resulting U-label is converted to an A-label (defined in Section
[IDNA2008-Defs] [[anchor12: ?? Insert section number]]). The 2.3.2.1 of [IDNA2008-Defs]). The A-label is the encoding of the
A-label is the encoding of the U-label according to the Punycode U-label according to the Punycode algorithm [RFC3492] with the ACE
algorithm [RFC3492] with the ACE prefix "xn--" added at the beginning prefix "xn--" added at the beginning of the string. The resulting
of the string. The resulting string must, of course, conform to the string must, of course, conform to the length limits imposed by the
length limits imposed by the DNS. This document updates RFC 3492 DNS. This document does not update or alter the Punycode algorithm
only to the extent of replacing the reference to the discussion of specified in RFC 3492 in any way. That document does make a non-
the ACE prefix. The ACE prefix is now specified in this document normative reference to the information about the value and
rather than as part of RFC 3490 or Nameprep [RFC3491] but is the same construction of the ACE prefix that appears "in RFC 3490 or Nameprep
in both sets of documents. [RFC3491]". For consistency and reader convenience, IDNA2008
effectively updates that reference to point to this document. That
change does not alter the prefix itself. The prefix, "xn--", is the
same in both sets of documents.
The failure conditions identified in the Punycode encoding procedure The failure conditions identified in the Punycode encoding procedure
cannot occur if the input is a U-label as determined by the steps cannot occur if the input is a U-label as determined by the steps
above. above.
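The conversion step can be sketched in Python as follows. This is a hedged illustration rather than a reference implementation: u_label_to_a_label and MAX_LABEL_OCTETS are hypothetical names, Python's built-in "punycode" codec is assumed to implement RFC 3492, and the input is assumed to have already passed the tests of Section 4.2. The 63-octet figure is the DNS label length limit from RFC 1035.

```python
ACE_PREFIX = "xn--"
MAX_LABEL_OCTETS = 63  # DNS limit on the length of a single label


def u_label_to_a_label(u_label: str) -> str:
    """Encode a validated U-label as an A-label."""
    # Punycode-encode, then add the ACE prefix at the beginning.
    a_label = ACE_PREFIX + u_label.encode("punycode").decode("ascii")
    # The result must still fit within the DNS label length limit.
    if len(a_label) > MAX_LABEL_OCTETS:
        raise ValueError("A-label exceeds the DNS label length limit")
    return a_label
```

For example, u_label_to_a_label("bücher") yields "xn--bcher-kva".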
4.5. Insertion in the Zone 4.5. Insertion in the Zone
The A-label is registered in the DNS by insertion into a zone. The label is registered in the DNS by inserting the A-label into a
zone.
5. Domain Name Lookup Protocol 5. Domain Name Lookup Protocol
Lookup is different from registration and different tests are applied Lookup is different from registration and different tests are applied
on the client. Although some validity checks are necessary to avoid on the client. Although some validity checks are necessary to avoid
serious problems with the protocol, the lookup-side tests are more serious problems with the protocol, the lookup-side tests are more
permissive and rely on the assumption that names that are present in permissive and rely on the assumption that names that are present in
the DNS are valid. That assumption is, however, a weak one because the DNS are valid. That assumption is, however, a weak one because
the presence of wild cards in the DNS might cause a string that is the presence of wild cards in the DNS might cause a string that is
not actually registered in the DNS to be successfully looked up. not actually registered in the DNS to be successfully looked up.
The two steps described in Section 5.2 are required.
5.1. Label String Input 5.1. Label String Input
The user supplies a string in the local character set, typically by The user supplies a string in the local character set, for example by
typing it or clicking on, or copying and pasting, a resource typing it or clicking on, or copying and pasting, a resource
identifier, e.g., a URI [RFC3986] or IRI [RFC3987] from which the identifier, e.g., a URI [RFC3986] or IRI [RFC3987] from which the
domain name is extracted. Alternately, some process not directly domain name is extracted. Alternately, some process not directly
involving the user may read the string from a file or obtain it in involving the user may read the string from a file or obtain it in
some other way. Processing in this step and the next two are local some other way. Processing in this step and that specified in
matters, to be accomplished prior to actual invocation of IDNA. Section 5.2 are local matters, to be accomplished prior to actual
invocation of IDNA.
5.2. Conversion to Unicode 5.2. Conversion to Unicode
The string is converted from the local character set into Unicode, if The string is converted from the local character set into Unicode, if
it is not already Unicode. Depending on local needs, this conversion it is not already in Unicode. Depending on local needs, this
may involve mapping some characters into other characters as well as conversion may involve mapping some characters into other characters
coding conversions. Those issues are discussed in [IDNA2008-Mapping] as well as coding conversions. Those issues are discussed in
and the mapping-related sections of [IDNA2008-Rationale].[[anchor13: [IDNA2008-Mapping] and the mapping-related sections (Sections 4.4, 6,
?? Supply section number.]] A Unicode string may require and 7.3) of [IDNA2008-Rationale]. The result MUST be a Unicode
normalization as discussed in Section 4.1. The result MUST be a string in NFC form.
Unicode string in NFC form.
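As an illustration of the Section 5.2 step, the following is a minimal sketch in Python. The function name and the default encoding are assumptions made for this example, not part of the protocol; the local mapping discussed in [IDNA2008-Mapping] is deliberately left out.

```python
import unicodedata


def to_nfc_unicode(raw: bytes, local_encoding: str = "utf-8") -> str:
    """Decode a locally encoded string and return it in NFC form.

    `local_encoding` is a hypothetical parameter standing in for
    whatever character set the local environment uses.
    """
    # Coding conversion: local character set to Unicode.
    text = raw.decode(local_encoding)
    # Normalization: the result handed to IDNA must be in NFC.
    return unicodedata.normalize("NFC", text)
```

For example, "e" followed by U+0301 (COMBINING ACUTE ACCENT) is composed into the single code point U+00E9.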
5.3. A-label Input 5.3. A-label Input
If the input to this procedure appears to be an A-label (i.e., it If the input to this procedure appears to be an A-label (i.e., it
starts in "xn--"), the lookup application MAY attempt to convert it starts in "xn--"), the lookup application MAY attempt to convert it
to a U-label, first ensuring that the A-label is entirely in lower to a U-label, first ensuring that the A-label is entirely in lower
case, and apply the tests of Section 5.4 and the conversion of case, and apply the tests of Section 5.4 and the conversion of
Section 5.5 to that form. If the label is converted to Unicode Section 5.5 to that form. If the label is converted to Unicode
(i.e., to U-label form) using the Punycode decoding algorithm, then (i.e., to U-label form) using the Punycode decoding algorithm, then
the processing specified in those two sections MUST be performed, and the processing specified in those two sections MUST be performed, and
the label MUST be rejected if the resulting label is not identical to the label MUST be rejected if the resulting label is not identical to
the original. See the Name Server Considerations section of the original. See Section 8.1 of [IDNA2008-Rationale] for additional
[IDNA2008-Rationale] for additional discussion on this topic. discussion on this topic.
That conversion and testing SHOULD be performed if the domain name Conversion from the A-label and testing that the result is a U-label
will later be presented to the user in native character form (this SHOULD be performed if the domain name will later be presented to the
requires that the lookup application be IDNA-aware). If those steps user in native character form (this requires that the lookup
are not performed, the lookup process SHOULD at least make tests to application be IDNA-aware). If those steps are not performed, the
determine that the string is actually an A-label, examining it for lookup process SHOULD at least test to determine that the string is
the invalid formats specified in the Punycode decoding specification. actually an A-label, examining it for the invalid formats specified
Applications that are not IDNA-aware will obviously omit that in the Punycode decoding specification. Applications that are not
testing; others MAY treat the string as opaque to avoid the IDNA-aware will obviously omit that testing; others MAY treat the
additional processing at the expense of providing less protection and string as opaque to avoid the additional processing at the expense of
information to users. providing less protection and information to users.
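The round-trip check described in Section 5.3 can be sketched as follows, using the "punycode" codec from the Python standard library. The function name is hypothetical. The sketch lowercases the input before decoding, which is one reading of "ensuring that the A-label is entirely in lower case", and it does not perform the Section 5.4 tests that the text also requires.

```python
ACE_PREFIX = "xn--"


def check_a_label(label: str) -> str:
    """Return the U-label for an apparent A-label, or raise ValueError.

    Only the Punycode round trip is checked here; the validation of
    Section 5.4 must still be applied to the returned string.
    """
    if not label.isascii():
        raise ValueError("an A-label must be pure ASCII")
    lowered = label.lower()  # assumption: normalize case before decoding
    if not lowered.startswith(ACE_PREFIX):
        raise ValueError("label does not start with the ACE prefix")
    try:
        u_label = lowered[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except UnicodeError as exc:
        raise ValueError("invalid Punycode") from exc
    # Re-encode and compare: the label is rejected if the result is
    # not identical to the (lowercased) original.
    round_trip = ACE_PREFIX + u_label.encode("punycode").decode("ascii")
    if round_trip != lowered:
        raise ValueError("label does not survive the round trip")
    return u_label
```

An application that treats the string as opaque would skip this function entirely, at the cost of the reduced protection noted above.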
5.4. Validation and Character List Testing 5.4. Validation and Character List Testing
As with the registration procedure described in Section 4, the As with the registration procedure described in Section 4, the
Unicode string is checked to verify that all characters that appear Unicode string is checked to verify that all characters that appear
in it are valid as input to IDNA lookup processing. As discussed in it are valid as input to IDNA lookup processing. As discussed
above and in [IDNA2008-Rationale], the lookup check is more liberal above and in [IDNA2008-Rationale], the lookup check is more liberal
than the registration one. Labels that have not been fully evaluated than the registration one. Labels that have not been fully evaluated
for conformance to the applicable rules are referred to as "putative" for conformance to the applicable rules are referred to as "putative"
labels as discussed in [IDNA2008-Defs][[anchor14: ??? Insert section labels as discussed in Section 2.3.2.1 of [IDNA2008-Defs]. Putative
number -- 2.2.3 as of Defs-09]]. Putative labels with any of the labels with any of the following characteristics MUST be rejected
following characteristics MUST be rejected prior to DNS lookup: prior to DNS lookup:
o Labels containing code points that are unassigned in the version
of Unicode being used by the application, i.e., in the UNASSIGNED
category of [IDNA2008-Tables].
o Labels that are not in NFC form as defined in [Unicode-UAX15]. o Labels that are not in NFC [Unicode-UAX15].
o Labels containing "--" (two consecutive hyphens) in the third and o Labels containing "--" (two consecutive hyphens) in the third and
fourth character positions. fourth character positions.
o Labels whose first character is a combining mark (see The Unicode
Standard, Section 2.11 [Unicode]).
o Labels containing prohibited code points, i.e., those that are o Labels containing prohibited code points, i.e., those that are
assigned to the "DISALLOWED" category in the permitted character assigned to the "DISALLOWED" category of [IDNA2008-Tables].
table [IDNA2008-Tables].
o Labels containing code points that are identified in o Labels containing code points that are identified in
[IDNA2008-Tables] as "CONTEXTJ", i.e., requiring exceptional [IDNA2008-Tables] as "CONTEXTJ", i.e., requiring exceptional
contextual rule processing on lookup, but that do not conform to contextual rule processing on lookup, but that do not conform to
those rules. Note that this implies that a rule must be defined, those rules. Note that this implies that a rule must be defined,
not null: a character that requires a contextual rule but for not null: a character that requires a contextual rule but for
which the rule is null is treated in this step as having failed to which the rule is null is treated in this step as having failed to
conform to the rule. conform to the rule.
o Labels containing code points that are identified in o Labels containing code points that are identified in
[IDNA2008-Tables] as "CONTEXTO", but for which no such rule [IDNA2008-Tables] as "CONTEXTO", but for which no such rule
appears in the table of rules. Applications resolving DNS names appears in the table of rules. Applications resolving DNS names
or carrying out equivalent operations are not required to test or carrying out equivalent operations are not required to test
contextual rules for "CONTEXTO" characters, only to verify that a contextual rules for "CONTEXTO" characters, only to verify that a
rule is defined (although they MAY make such tests to provide rule is defined (although they MAY make such tests to provide
better protection or give better information to the user). better protection or give better information to the user).
o Labels whose first character is a combining mark (see o Labels containing code points that are unassigned in the version
Section 4.2.3.2). of Unicode being used by the application, i.e., in the UNASSIGNED
category of [IDNA2008-Tables].
This requirement means that the application must use a list of
unassigned characters that is matched to the version of Unicode
that is being used for the other requirements in this section. It
is not required that the application know which version of Unicode
is being used; that information might be part of the operating
environment in which the application is running.
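Several of the mandatory rejection tests listed above can be sketched with the Python standard library alone. The function name and the `disallowed` parameter are hypothetical: a real implementation would take the DISALLOWED set and the CONTEXTJ/CONTEXTO rules from [IDNA2008-Tables], which this sketch does not reproduce. The General_Category "Cn" test only approximates the UNASSIGNED category, and it reflects the Unicode version bundled with the Python runtime.

```python
import unicodedata


def lookup_rejection_reason(label, disallowed=frozenset()):
    """Return a reason to reject a putative label, or None if none applies.

    `disallowed` is a caller-supplied set of code points standing in
    for the DISALLOWED category. CONTEXTJ and CONTEXTO checks are omitted.
    """
    if unicodedata.normalize("NFC", label) != label:
        return "not in NFC"
    # Third and fourth character positions are indexes 2 and 3.
    if label[2:4] == "--":
        return "hyphens in positions 3 and 4"
    # Combining marks have General_Category Mn, Mc, or Me.
    if label and unicodedata.category(label[0]).startswith("M"):
        return "leading combining mark"
    for ch in label:
        if unicodedata.category(ch) == "Cn":
            return "unassigned"
        if ord(ch) in disallowed:
            return "disallowed"
    return None
```

Returning a reason string rather than a boolean reflects the point made in this section that a lookup application can give the user better information than a bare DNS failure would.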
In addition, the application SHOULD apply the following test. In addition, the application SHOULD apply the following test.
o Verification that the string is compliant with the requirements o Verification that the string is compliant with the requirements
for right to left characters, specified in [IDNA2008-BIDI]. for right to left characters, specified in [IDNA2008-BIDI].
This test may be omitted in special circumstances, such as when the This test may be omitted in special circumstances, such as when the
lookup application knows that the conditions are enforced elsewhere, lookup application knows that the conditions are enforced elsewhere,
because an attempt to look up and resolve such strings will almost because an attempt to look up and resolve such strings will almost
certainly lead to a DNS lookup failure except when wild cards are certainly lead to a DNS lookup failure except when wild cards are
present in the zone. However, applying the test is likely to give present in the zone. However, applying the test is likely to give
much better information about the reason for a lookup failure -- much better information about the reason for a lookup failure --
information that may be usefully passed to the user when that is information that may be usefully passed to the user when that is
feasible -- than DNS resolution failure information alone. In any feasible -- than DNS resolution failure information alone.
event, lookup applications should avoid attempting to resolve labels
that are invalid under that test.
For all other strings, the lookup application MUST rely on the For all other strings, the lookup application MUST rely on the
presence or absence of labels in the DNS to determine the validity of presence or absence of labels in the DNS to determine the validity of
those labels and the validity of the characters they contain. If those labels and the validity of the characters they contain. If
they are registered, they are presumed to be valid; if they are not, they are registered, they are presumed to be valid; if they are not,
their possible validity is not relevant. While a lookup application their possible validity is not relevant. While a lookup application
may reasonably issue warnings about strings it believes may be may reasonably issue warnings about strings it believes may be
problematic, applications that decline to process a string that problematic, applications that decline to process a string that
conforms to the rules above (i.e., does not look it up in the DNS) conforms to the rules above (i.e., does not look it up in the DNS)
are not in conformance with this protocol. are not in conformance with this protocol.
5.5. Punycode Conversion 5.5. Punycode Conversion
The string that has now been validated for lookup is converted to ACE The string that has now been validated for lookup is converted to ACE
form using the Punycode algorithm (with the ACE prefix added). With form by applying the Punycode algorithm to the string and then adding
the understanding that this summary is not normative (the steps above the ACE prefix.
are), the string is either
o in Unicode NFC form that contains no leading combining marks,
contains no DISALLOWED or UNASSIGNED code points, has rules
associated with any code points in CONTEXTJ or CONTEXTO, and, for
those in CONTEXTJ, to satisfy the conditions of the rules; or
o in A-label form, was supplied under circumstances in which the
U-label conversions and tests have not been performed (see
Section 5.3).
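The conversion in Section 5.5 amounts to a Punycode encoding followed by adding the ACE prefix. A minimal sketch, assuming the input has already passed the tests of Section 5.4 and contains at least one non-ASCII character (the function name is hypothetical):

```python
ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    """Convert a validated U-label to ACE form."""
    # The standard-library "punycode" codec implements the bare
    # Punycode algorithm; the ACE prefix is added separately.
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")
```

Python also ships an "idna" codec, but it implements the earlier IDNA2003 rules, which is why this sketch uses the plain "punycode" codec instead.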
5.6. DNS Name Resolution 5.6. DNS Name Resolution
That resulting validated string is looked up in the DNS, using normal The A-label resulting from the conversion in Section 5.5 or supplied
DNS resolver procedures. That lookup can obviously either succeed directly (see Section 5.3) is looked up in the DNS, using normal DNS
resolver procedures. The lookup can obviously either succeed
(returning information) or fail. (returning information) or fail.
6. Security Considerations 6. Security Considerations
Security Considerations for this version of IDNA, except for the Security Considerations for this version of IDNA are described in
special issues associated with right to left scripts and characters, [IDNA2008-Defs], except for the special issues associated with right
are described in [IDNA2008-Defs]. Specific issues for labels to left scripts and characters. The latter are discussed in
containing characters associated with scripts written right to left [IDNA2008-BIDI].
appear in [IDNA2008-BIDI].
7. IANA Considerations 7. IANA Considerations
IANA actions for this version of IDNA are specified in IANA actions for this version of IDNA are specified in
[IDNA2008-Tables] and discussed informally in [IDNA2008-Rationale]. [IDNA2008-Tables] and discussed informally in [IDNA2008-Rationale].
The components of IDNA described in this document do not require any The components of IDNA described in this document do not require any
IANA actions. IANA actions.
8. Contributors 8. Contributors
skipping to change at page 16, line 27 skipping to change at page 16, line 31
[ASCII] American National Standards Institute (formerly United [ASCII] American National Standards Institute (formerly United
States of America Standards Institute), "USA Code for States of America Standards Institute), "USA Code for
Information Interchange", ANSI X3.4-1968, 1968. Information Interchange", ANSI X3.4-1968, 1968.
ANSI X3.4-1968 has been replaced by newer versions with ANSI X3.4-1968 has been replaced by newer versions with
slight modifications, but the 1968 version remains slight modifications, but the 1968 version remains
definitive for the Internet. definitive for the Internet.
[IDNA2008-Mapping] [IDNA2008-Mapping]
Resnick, P., "Mapping Characters in IDNA", August 2009, Resnick, P. and P. Hoffman, "Mapping Characters in IDNA",
<https://datatracker.ietf.org/drafts/ September 2009, <https://datatracker.ietf.org/drafts/
draft-ietf-idnabis-mapping/>. draft-ietf-idnabis-mapping/>.
[IDNA2008-Rationale] [IDNA2008-Rationale]
Klensin, J., Ed., "Internationalized Domain Names for Klensin, J., Ed., "Internationalized Domain Names for
Applications (IDNA): Issues, Explanation, and Rationale", Applications (IDNA): Issues, Explanation, and Rationale",
February 2009, <https://datatracker.ietf.org/drafts/ February 2009, <https://datatracker.ietf.org/drafts/
draft-ietf-idnabis-rationale>. draft-ietf-idnabis-rationale>.
[RFC2136] Vixie, P., Thomson, S., Rekhter, Y., and J. Bound, [RFC2136] Vixie, P., Thomson, S., Rekhter, Y., and J. Bound,
"Dynamic Updates in the Domain Name System (DNS UPDATE)", "Dynamic Updates in the Domain Name System (DNS UPDATE)",
skipping to change at page 18, line 24 skipping to change at page 18, line 24
contexts or as part of running text in paragraphs. contexts or as part of running text in paragraphs.
9. Remove the dot separator from the mandatory part of the 9. Remove the dot separator from the mandatory part of the
protocol. protocol.
10. Make some currently-valid labels that are not actually IDNA 10. Make some currently-valid labels that are not actually IDNA
labels invalid. labels invalid.
Appendix B. Change Log Appendix B. Change Log
[[anchor21: RFC Editor: Please remove this appendix.]] [[ RFC Editor: Please remove this appendix. ]]
B.1. Changes between Version -00 and -01 of draft-ietf-idnabis-protocol B.1. Changes between Version -00 and -01 of draft-ietf-idnabis-protocol
o Corrected discussion of SRV records. o Corrected discussion of SRV records.
o Several small corrections for clarity. o Several small corrections for clarity.
o Inserted more "open issue" placeholders. o Inserted more "open issue" placeholders.
B.2. Version -02 B.2. Version -02
skipping to change at page 23, line 12 skipping to change at page 23, line 12
o Added text to deal with the "upper case in A-labels" problem. o Added text to deal with the "upper case in A-labels" problem.
o Adjusted Acknowledgments to remove Mark Davis's name, per his o Adjusted Acknowledgments to remove Mark Davis's name, per his
request and advice from IETF Trust Counsel. request and advice from IETF Trust Counsel.
o Incorporated other changes from WG Last Call. o Incorporated other changes from WG Last Call.
o Small typographical and editorial corrections. o Small typographical and editorial corrections.
B.16. Version -16
o Adjusted references to current versions.
o Adjusted discussion of changes to Punycode to make more precise.
o Inserted text to clarify version matching between IDNA and
Unicode.
o Made several small changes based on Martin Duerst's review.
o Substituted in Section numbers in references to other IDNA2008
documents.
Author's Address Author's Address
John C Klensin John C Klensin
1770 Massachusetts Ave, Ste 322 1770 Massachusetts Ave, Ste 322
Cambridge, MA 02140 Cambridge, MA 02140
USA USA
Phone: +1 617 245 1457 Phone: +1 617 245 1457
Email: john+ietf@jck.com Email: john+ietf@jck.com
 End of changes. 46 change blocks. 
144 lines changed or deleted 164 lines changed or added
