On Interoperation of
Labels Between DNS and Other Resolution Systems
Dyn
150 Dow St.
Manchester
NH
03101
U.S.A.
asullivan@dyn.com
DNSSD
Despite its name, DNS-Based Service Discovery can use naming
systems other than the Domain Name System when looking for
services. Moreover, when it uses the DNS, DNS-Based Service
Discovery uses the full capability of DNS, rather than using a
subset of available octets. In order for DNS-SD to be used
effectively in environments where multiple different name systems
and conventions for their operation are in use, it is important to
attend to differences in the underlying technology and operational
environment. This memo presents an outline of the requirements
for selection of labels for conventional DNS and other resolution
systems when they are expected to interoperate in this manner.
DNS-Based Service Discovery (DNS-SD, )
specifies a mechanism for discovering services using queries to
the Domain Name System (DNS, , ); and to any other system that uses domain
names, such as Multicast DNS (mDNS, ).
Conventional use of the DNS generally follows the host name rules
for labels -- the so-called LDH rule.
That convention is the reason behind the development of
Internationalized Domain Names for Applications (IDNA2008, , , , , , ). It is worth
noting that the LDH rule is a convention, and not a rule of the
DNS; this is made entirely plain by ,
section 11. Nevertheless, there is a widespread belief that in
many circumstances domain names cannot be used in the DNS unless
they cleave to the LDH rule.
At the same time, mDNS requires that labels be encoded in UTF-8,
and permits a range of characters in labels that are not permitted
by IDNA2008 or the LDH rule. For example, mDNS encourages the use of
spaces and punctuation in mDNS names (see ,
section 4.1.3). It does not restrict which Unicode code points may
be used in those labels, so long as the code points are UTF-8 in
Net-Unicode format.
Users of applications are, of course, frequently unconcerned
with (not to say oblivious to) the name-resolution system(s) in
service at any given moment, and are inclined simply to use the
same domain names in different contexts. As a result, the same
domain name might be tried using different name resolution
technologies. If DNS-SD is to be used in an environment where
multiple resolution systems (such as mDNS and DNS) are to be
queried for services, then some parts of the domain names to be
queried will need to be compatible with the rules and conventions
for all the relevant technologies.
One approach to interoperability under these circumstances is
to use a single operational convention (a "profile") for domain
names under the different naming systems. This memo assumes such
a use profile, and attempts to outline what is necessary to make
it work without specifying any particular technology. It does
assume, however, that the global DNS is eventually likely to be
implicated. Given the general tendency of all resolution
eventually to fall through to the DNS, that assumption does not
seem controversial.
It is worth noting that users of DNS-SD do not use the service
discovery names in the same way that users of other domain names
might. Domain names often might as easily be entered as direct
user input as by any other method. But the service discovery
context generally assumes users are picking a service from a list.
As a result, the sorts of application considerations that are
appropriate to the general-purpose DNS name, and that resulted in
the A-label/U-label split (see below) in IDNA2008, are not
entirely the right approach for DNS-SD.
Wherever appropriate, this memo uses the terminology defined
in Section 2 of . In particular, the
reader is assumed to be familiar with the terms "U-label", "LDH
label", and "A-label" from that document. Similarly, the reader
is assumed to be familiar with the U+NNNN notation for Unicode
code points used in and other documents
dealing with Unicode code points. In the interests of brevity
and consistency, the definitions are not repeated here.
Sometimes this memo refers to names in the DNS as though the
LDH rule and IDNA2008 are strict requirements. They are not.
DNS labels are, in principle, just collections of octets, and
therefore in principle the LDH rule is not a constraint. In
practice, applications sometimes intercept labels that do not
conform to the LDH rule and apply IDNA and other
transformations.
The DNS, perhaps unfortunately, has produced its own jargon.
Unfamiliar DNS-related terms in this memo should be found in
.
The term "owner name" (common to the DNS vernacular; see
above) is used here to apply not just to the domain names to be
looked up in the DNS, but to any name that might be looked up
either in the DNS or using other technologies. It therefore
includes names that might not actually exist anywhere. In
addition, what follows depends on the idea that not every domain
name may be looked up in the DNS. For instance, names ending in
"local." (in the presentation format) are not ordinarily looked
up in the DNS, but instead by querying mDNS.
DNS-SD specifies three portions of the owner name for a
DNS-SD resource record. These are the <Instance> portion,
the <Service> portion, and the <Domain>. The owner
name made of these three parts is called the Service Instance
Name. It is worth observing that a portion may be more than one
label long. See , section 4.1.
Further discussion of the parts is found in .
Throughout this memo, mDNS is used liberally as the
alternative resolution mechanism to DNS. This is for
convenience rather than rigour: any alternative name resolution
to DNS could present the same friction with the prevailing
operational conventions of the global DNS. It so happens that
mDNS is the overwhelmingly successful alternative as of this
writing, so it is used in order to make the issues plainer to
the reader. Other alternative resolution mechanisms may in
general be read wherever mDNS appears in the text, except where
details of the mDNS specification appear.
One might reasonably wonder why there is a problem to be solved
at all. After all, DNS labels permit any octet whatsoever, and
anything that can be useful with DNS-SD cannot use any names that
are outside the protocol strictures of the DNS.
The reason for the trouble is twofold. First, and least
troublesome, is the possibility of resolvers that are attempting
to offer IDNA service system-wide. Given the design of IDNA2008,
it is reasonable to suppose that on some systems high-level name
resolution libraries will perform the U-label/A-label
transformation automatically, saving applications from these
details. If this were the main problem, however, it would
presumably be self-correcting; for the right answer would be,
"Don't use those libraries for DNS-SD," and DNS-SD would not work
reliably in cases where such libraries were in use. This would be
unfortunate; but given that DNS-SD in Internet contexts is as of
this writing not in ubiquitous use, it should not represent a
fatal issue.
The greater problem is that the "infrastructure" types of DNS
service -- the root zone, the top-level domains, and so on -- have
embraced IDNA and refuse registration of raw UTF-8 into their
zones. As of this writing there is (perhaps unfortunately) no
reliable way to discover where these sorts of DNS services end.
Nevertheless, some client programs (notably web browsers) have
adopted a number of different policies about how domain names will
be looked up and presented to users given the policies of the
relevant DNS zone operators. None of these policies permit raw
UTF-8. Since it is anticipated that DNS-SD when used with the DNS
will be inside domain names beneath those kinds of
"infrastructure" domains, the implications of IDNA2008 must be a
consideration.
For further exploration of issues relating to encoding of
domain names generally, the reader should consult .
Any interoperability between DNS (including prevailing
operational conventions) and other resolution technologies will
require interoperability across the portions of a DNS-SD Service
Instance Name that are implicated in regular DNS lookups. Only
some portions are implicated. In any case, if a given portion is
implicated, the profile will need to apply to all labels in that
portion.
In addition, because DNS-SD Service Instance Names can be used
in a domain name slot, care must be taken by DNS-SD-aware
resolvers to handle the different portions as outlined here, so
that DNS-SD portions that do not use IDNA2008 will not be treated
as U-labels and will not accidentally undergo IDNA processing.
Because the profile will need to apply to names that might need
to interoperate with names in the public DNS, and because other
resolution mechanisms (such as mDNS) could permit labels that IDNA
does not, the profile might reduce the labels that could be used
with those other resolution mechanisms. One consequence of this
is that some recommendations from will
not really be possible to implement using names subject to the
profile. In particular, , section 4.1.3
recommends that labels always be stored and communicated as UTF-8,
even in the DNS. Because of the way the public DNS is currently
operated (see ), the advice to
store and transmit labels as UTF-8 in the DNS is likely either to
encounter problems or result in unnecessary traffic to the public
DNS (or both). In particular, many labels in the <Domain>
part of a Service Instance Name is unlikely to be found in its
UTF-8 form in the public DNS tree for zones that are using
IDNA2008. By contrast, for example, mDNS normally uses UTF-8.
U-labels cannot contain upper case letters. That restriction
extends to ASCII-range upper case letters that work fine in
LDH-labels. It may be confusing that the character "A" works in
the DNS when none of the characters in the label has a diacritic,
but does not work when there is such a diacritic in the label.
Labels in mDNS names (or other resolution technologies) may
contain upper case characters, so the profile will need either to
restrict the use of upper case or come up with a reliable and
predictable (to users) convention for case folding even in the
presence of diacritics.
Service Instance Names are made up of three portions.
is clear that the <Instance>
portion of the Service Instance Name is intended for
presentation to users, and therefore virtually any character is
permitted in it. There are two ways that a profile might address
this portion.
The first way would be to treat this portion as likely to be
intercepted by system-wide IDNA-aware resolvers, or likely
subject to strict IDNA conformance requirements for publication
in the relevant zone. In this case, the portion would need to
be made subject to the profile, thereby curtailing what
characters may appear in this portion. This approach permits
DNS-SD to use any standard system resolver but presents
inconsistencies with the DNS-SD specification and with DNS-SD
that is exclusively mDNS-based. Therefore, this strategy is
rejected.
Instead, DNS-SD implementations can intercept the
<Instance> portion of a Service Instance Name and ensure
that those labels are never handed to IDNA-aware resolvers that
might attempt to convert these labels into A-labels. Under this
approach, the DNS-SD <Instance> portion works as it always
does, but at the cost of using special resolution code built
into the DNS-SD system. A practical consequence of this is that
zone operators need to be prepared not to apply the LDH rule to
all labels, and may need to make special concessions to ensure
that the <Instance> portion can contain spaces, upper and
lower case, and any UTF-8 code point; or else to prepare a user
interface to handle the exceptions that would otherwise be
generated. Automatic conversion to A-labels is not
acceptable.
DNS-SD includes a <Service> component in the Service
Instance Name. This component is not really user-facing data,
but is instead control data embedded in the Service Instance
Name. This component includes so-called "underscore labels",
which are labels prepended with U+005F (_). The underscore
label convention was established by DNS SRV () for identifying metadata inside DNS names.
A system-wide resolver (or DNS middlebox) that cannot handle
underscore labels will not work with DNS-SD at all, so it is
safe to suppose that such resolvers will not attempt to do
special processing on these labels. Therefore, the
<Service> portion of the Service Instance Name will not be
subject to the profile. By the same token, it should be noted
that underscore labels are never subject to IDNA processing
(they're formally incompatible), and therefore concerns about
IDNA are irrelevant for these labels.
The <Domain> portion of the Service Instance Name forms
an integral part of the owner name submitted for DNS resolution.
A system-wide resolver that is IDNA2008-aware is likely to
interpret labels with UTF-8 in the owner name as candidates for
IDNA2008 processing. More important, operators of
internationalized domain names will frequently publish such names
in the DNS as A-labels; certainly, the top-most labels will always
be A-labels. Therefore, these labels will need to be subject to
the profile. DNS-SD implementations ought to identify the
<Domain> portion of the Service Instance Name and treat it
subject to IDNA2008 in case the domain is to be queried from the
global DNS. In the event that the <Domain> portion of the
Service Instance Name fails to resolve, it is acceptable to
substitute labels with plain UTF-8, starting at the lowest label
in the DNS tree and working toward the root. This approach
differs from the rule for resolution published in , because it privileges IDNA2008-compatible
labels over UTF-8 labels.
One might argue against this restriction on either of two
grounds:
It is possible the names may be in the DNS in UTF-8, and RFC
6763 already specifies a fallback strategy of progressively
attempting first the UTF-8 label lookup (it might not be a
U-label) and then if possible the A-label lookup.
Zone administrators that wish to support DNS-SD can publish a
UTF-8 version of the zone along side the A-label version of the
zone.
The first of these is rejected because it represents a potentially
significant increase in DNS lookup traffic for no value. It is
possible for a DNS-SD application to identify the <Domain>
portion of the Service Instance Name. The standard way to publish
IDNs on the Internet uses IDNA. Therefore, additional lookups
should not be encouraged. When was
published, the bulk of IDNs were lower in the tree. Now that
there are internationalized labels in the root zone, it is
desirable to minimize queries to the Internet infrastructure if
they are sure to be answered in the negative.
The second reason depends on the idea that it is possible to
maintain two names in sync with one another. This is not strictly
speaking true, although in this case the domain operator could
simply create a DNAME record from the
UTF-8 name to the IDNA2008 zone. This still, however, relies on
being able to reach the (UTF-8) name in question, and it is
unlikely that the UTF-8 version of the zone will be delegated from
anywhere. Moreover, in many organizations the support for DNS-SD
and the support for domain name delegations are not performed by
the same department, and depending on a co-ordination between the
two will make the system more fragile, or slower, or both.
Some resolvers -- particularly those that are used in mixed DNS
and non-DNS environments -- may be aware of different operational
conventions in different parts of the DNS tree. For example, it
may be possible for implementations to use hints about the
boundary of an organization's domain name infrastructure, in order
to tell (for instance) that example.com. is part of the Example
Organization while com. is a large delegation-centric zone on the
public Internet. In such cases, the resolution system might
reverse its preferences to prefer plain UTF-8 labels when
resolving names below the boundary point in the DNS tree. The
result would be that any lookup past the boundary point and closer
to the root would use LDH-labels first, falling back to UTF-8 only
after a failure; but a lookup below the boundary point would use
UTF-8 labels first, and try other strategies only in case of
negative answers. The mechanism to learn such a boundary is beyond
the scope of this document.
The author gratefully acknowledges the insights of Joe Abley,
Stuart Cheshire, Paul Hoffman, Kerry Lynn, and Dave Thaler. Kerry
Lynn deserves special gratitude for his energy and persistence in
pressing unanswered questions. Doug Otis sent many comments about
visual confusability.
This memo makes no requests of IANA.
This memo presents some requirements for future development, but
does not specify anything. It makes no additional security-specific
requirements. Issues arising due to visual confusability of names
apply to this case as well as to any other case of internationalized
names, but interoperation between different resolution systems and
conventions does not alter the severity of those issues.
DoD Internet host table specification
SRI International
SRI International
SRI International
Domain names - concepts and facilities
Information Sciences Institute (ISI)
Domain names - implementation and specification
USC/ISI
4676 Admiralty Way
Marina del Rey
CA
90291
US
+1 213 822 1511
Clarifications to the DNS Specification
Computer Science
Parkville
Victoria
3052
Australia.
kre@munnari.OZ.AU
e
RGnet, Inc.
5147 Crystal Springs Drive
Bainbridge Island
Washington
98110
United States.
NE
randy@psg.com
Applications
DNS
domain name system
A DNS RR for specifying the location of services (DNS SRV)
Troll Tech
Waldemar Thranes gate 98B
Oslo
N-0175
NO
+47 22 806390
+47 22 806380
arnt@troll.no
Internet Software Consortium
950 Charter Street
Redwood City
CA
94063
US
+1 650 779 7001
Microsoft Corporation
One Microsoft Way
Redmond
WA
98052
US
levone@microsoft.com
This document describes a DNS RR which specifies the location of the
server(s) for a specific protocol and domain.
Unicode Format for Network Interchange
The Internet today is in need of a standardized form for the transmission of internationalized "text" information, paralleling the specifications for the use of ASCII that date from the early days of the ARPANET. This document specifies that format, using UTF-8 with normalization and specific line-ending sequences. [STANDARDS-TRACK]
Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework
This document is one of a collection that, together, describe the protocol and usage context for a revision of Internationalized Domain Names for Applications (IDNA), superseding the earlier version. It describes the document collection and provides definitions and other material that are common to the set. [STANDARDS-TRACK]
Internationalized Domain Names in Applications (IDNA): Protocol
This document is the revised protocol definition for Internationalized Domain Names (IDNs). The rationale for changes, the relationship to the older specification, and important terminology are provided in other documents. This document specifies the protocol mechanism, called Internationalized Domain Names in Applications (IDNA), for registering and looking up IDNs in a way that does not require changes to the DNS itself. IDNA is only meant for processing domain names, not free text. [STANDARDS-TRACK]
The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)
This document specifies rules for deciding whether a code point, considered in isolation or in context, is a candidate for inclusion in an Internationalized Domain Name (IDN).</t><t> It is part of the specification of Internationalizing Domain Names in Applications 2008 (IDNA2008). [STANDARDS-TRACK]
Right-to-Left Scripts for Internationalized Domain Names for Applications (IDNA)
The use of right-to-left scripts in Internationalized Domain Names (IDNs) has presented several challenges. This memo provides a new Bidi rule for Internationalized Domain Names for Applications (IDNA) labels, based on the encountered problems with some scripts and some shortcomings in the 2003 IDNA Bidi criterion. [STANDARDS-TRACK]
Internationalized Domain Names for Applications (IDNA): Background, Explanation, and Rationale
Several years have passed since the original protocol for Internationalized Domain Names (IDNs) was completed and deployed. During that time, a number of issues have arisen, including the need to update the system to deal with newer versions of Unicode. Some of these issues require tuning of the existing protocols and the tables on which they depend. This document provides an overview of a revised system and provides explanatory material for its components. This document is not an Internet Standards Track specification; it is published for informational purposes.
Mapping Characters for Internationalized Domain Names in Applications (IDNA) 2008
In the original version of the Internationalized Domain Names in Applications (IDNA) protocol, any Unicode code points taken from user input were mapped into a set of Unicode code points that "made sense", and then encoded and passed to the domain name system (DNS). The IDNA2008 protocol (described in RFCs 5890, 5891, 5892, and 5893) presumes that the input to the protocol comes from a set of "permitted" code points, which it then encodes and passes to the DNS, but does not specify what to do with the result of user input. This document describes the actions that can be taken by an implementation between receiving user input and passing permitted code points to the new IDNA protocol. This document is not an Internet Standards Track specification; it is published for informational purposes.
IAB Thoughts on Encodings for Internationalized Domain Names
This document explores issues with Internationalized Domain Names (IDNs) that result from the use of various encoding schemes such as UTF-8 and the ASCII-Compatible Encoding produced by the Punycode algorithm. It focuses on the importance of agreeing on a single encoding and how complicated the state of affairs ends up being as a result of using different encodings today.
DNAME Redirection in the DNS
The DNAME record provides redirection for a subtree of the domain name tree in the DNS. That is, all names that end with a particular suffix are redirected to another part of the DNS. This document obsoletes the original specification in RFC 2672 as well as updates the document on representing IPv6 addresses in DNS (RFC 3363). [STANDARDS-TRACK]
Multicast DNS
As networked devices become smaller, more portable, and more ubiquitous, the ability to operate with less configured infrastructure is increasingly important. In particular, the ability to look up DNS resource record data types (including, but not limited to, host names) in the absence of a conventional managed DNS server is useful.</t><t> Multicast DNS (mDNS) provides the ability to perform DNS-like operations on the local link in the absence of any conventional Unicast DNS server. In addition, Multicast DNS designates a portion of the DNS namespace to be free for local use, without the need to pay any annual fee, and without the need to set up delegations or otherwise configure a conventional DNS server to answer for those names.</t><t> The primary benefits of Multicast DNS names are that (i) they require little or no administration or configuration to set them up, (ii) they work when no infrastructure is present, and (iii) they work during infrastructure failures.
DNS-Based Service Discovery
This document specifies how DNS resource records are named and structured to facilitate service discovery. Given a type of service that a client is looking for, and a domain in which the client is looking for that service, this mechanism allows clients to discover a list of named instances of that desired service, using standard DNS queries. This mechanism is referred to as DNS-based Service Discovery, or DNS-SD.
DNS Terminology
The DNS is defined in literally dozens of different RFCs. The terminology used in by implementers and developers of DNS protocols, and by operators of DNS systems, has sometimes changed in the decades since the DNS was first defined. This document gives current definitions for many of the terms used in the DNS in a single document.
Note to RFC Editor: this section should be removed prior to
publication.
Altered the title to make it more generic than mDNS.
Addressed issues raised by Dave Thaler in review on
2015-07-18.
Added a note to about visual
confusion. I don't know whether this will satisfy Doug Otis
but it is the only thing I can see that could possibly be
relevant.
Added discussion of finding "boundary" in .
Alter text to make clear that the main issue is the way the
public DNS is currently administered, not system resolvers. I
suppose this should have been clear before, but I didn't do that.
Many thanks to Kerry Lynn for penetrating questions that
illuminated what I'd left out.
1st WG version
Add text to make clear that fallback from A-label lookup to
UTF-8 label lookup ok, per WG comments at IETF 91
Decided which portions would be affected
Explained the difference in user interfaces between DNS-SD
and usual DNS operation
Provided background on why the Domain portion should be
treated differently