idnits 2.17.1 

draft-mealling-human-friendly-identifier-req-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 302 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  ** There are 14 instances of too long lines in the document, the longest
     one being 3 characters in excess of 72.

  ** There are 21 instances of lines with control characters in the document.

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 10 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	INTERNET-DRAFT                                                M.Mealling
2	Expires six months from June 1998                Network Solutions, Inc.
3	Intended category: Experimental
4	draft-mealling-human-friendly-identifier-req-00.txt

6	                Requirements for Human Friendly Identifiers

8	Status of this Memo

10	This document is an Internet-Draft. Internet-Drafts are working documents
11	of the Internet Engineering Task Force (IETF), its areas, and its working
12	groups. Note that other groups may also distribute working documents as
13	Internet-Drafts.

15	Internet-Drafts are draft documents valid for a maximum of six months and
16	may be updated, replaced, or obsoleted by other documents at any time. It
17	is inappropriate to use Internet-Drafts as reference material or to cite
18	them other than as work in progress.

20	To view the entire list of current Internet-Drafts, please check
21	the "1id-abstracts.txt" listing contained in the Internet-Drafts
22	Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
23	(Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
24	(Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
25	(US West Coast).

27	Abstract

29	This document includes a set of requirements for an identifier that
30	is engineered for human consumption. While the identifier is still
31	machine consumable, the services and capabilities of the underlying
32	system are designed with humans in mind. This includes concepts of
33	geographic and context specific constraints, non-uniqueness, and
34	natural language match semantics.

36	1. Introduction

38	The many identifiers used on the Internet are, in general, designed with
39	machines in mind. Domains, URIs, and email addresses are all used to
40	identify some component of the network. They are not engineered to provide
41	the easiest interface for users. Users routinely handle identifiers where
42	two entities are known by the same name (companies with the same name in
43	two different geographic locations) or short versions of long names
44	(Coke and Coca-Cola). This document specifies requirements for
45	an identifier and resolution system that can be engineered to solve
46	human oriented identification needs. Identifiers that solve these
47	problems are referred to as "human friendly identifiers".

49	2. Justification

51	The phenomenal growth of the Internet over the past few years has
52	had an immense impact on systems designed for a community of users
53	who were highly computer literate. These early users were willing
54	(eager?) to use systems in a manner that suited the machine more than
55	the user.

57	Some would argue that DNS was built specifically for the purpose of
58	making life easier for the user. On closer examination though, DNS'
59	intended user was significantly more sophisticated than today's
60	Internet user. The dot notation and the strict hierarchy are
61	foreign to today's users and do not match their methods of
62	organizing information and resources very well.

64	To exacerbate the user-unfriendliness of domain-names, the
65	WorldWideWeb has added the additional property of specifying
66	resources and protocols at the particular host identified
67	by the domain-name. Many of those in the URI working group cringed
68	when the first attempts at URLs were heard on radio or printed
69	in newspapers. The quintessential example was heard on the
70	Larry King Live show on CNN. The guest was David Letterman.

72	   LETTERMAN: Can I just take a second here, Larry -- I'm
73	   sorry, I don't mean to interrupt -- To give our World Wide Web
74	   address. If people want to e-mail, we are on the World Wide Web
75	   as well.

77	   KING: You are too?

79	   LETTERMAN: wwwwww.com com com com ........ com com
80	   diggity diggity diggity dank.com.com diggity www.com
81	   Dave.com.com. So give us some of that e-mail or something.

83	   KING: Hold on.

85	   LETTERMAN: Have you got that, Larry?

87	   KING: Would you repeat that? I want to get it right.

89	   LETTERMAN: Come on, Larry. The bit's over. Pick it up.

91	While humorous, Letterman's point is that URLs and domain-names
92	are not suited for the regular, day-to-day information
93	needs of humans. Internet identifiers usually contain odd characters
94	that are needed to delimit syntax elements. The mutual exclusivity
95	of DNS means that two entities cannot have the same name, thus
96	causing those without the desired domain to resort to acronyms
97	or other combinations that simply do not meet users' expectations.

99	This inappropriate use of existing identifiers has created two problems:

101		Users are left confused and intimidated. While growth
102		of the Internet is large, there are significant sections
103		of the population that are so intimidated by the
104		technical lingo that they refuse to go online.

106		Identifiers and services are abused in order to squeeze
107		out some modicum of human oriented functionality. DNS'
108		".com" domain is one such example. Companies and governments
109	        routinely attempt to apply trademark law to a medium that
110	        cannot cope with the basic tenets of trademarks.

112	These two problems cannot be solved using existing Internet systems.
113	Intimidated users will not feel comfortable until they can use
114	the same identifiers they use in everyday conversation. Existing
115	systems will be further pressed into service until some system
116	accommodates most of the needs of marketers and lawyers.

118	A solution is needed. This document intends to explore the requirements
119	needed to supply a solution. The first task is to identify the specific
120	user communities and the specific HFI oriented problems they face.
121	Secondly, specific parts of the problem space are analysed for being
122	in or out of scope for this effort.  The remaining problems are then merged
123	into a simple set of requirements that define a solvable and useful
124	problem space.

126	3. Intended Audience

128	There are three distinct user communities that have an interest in
129	a human friendly identifier:

131		Users - The general Internet user desires an identifier that
132			can be easily remembered and guessed. This makes it
133			much easier to find important resources.

135		Marketing - Businesses desire an identifier that gives
136			their marketing campaigns the greatest latitude
137			in terms of character sets, length, and simplicity.
138			In many cases the identifier will be determined as
139	   		much by the media in which it is conveyed as the
140			idea it is attempting to convey.

142		Trademark holders - Businesses that own trademarks desire
143			an easy way to protect those resources. Many
144			have invested large amounts of money in protecting
145			their marks according to an existing legal
146			framework.

148	The features that make sense to users are fairly straightforward. They
149	desire an identifier that is as close as possible to the identifiers
150	they use in everyday life. When someone mentions the term "tide", most
151	users can differentiate the laundry detergent from the rise
152	and fall of oceans by context. At the very least a user would expect an
153	identifier to be able to support two definitions for the same term.

155	The features that a marketing campaign needs are subtly different.
156	Currently there is a desire in marketing campaigns that deal with
157	the Internet to use the Internet connection as an additional marketing
158	point. The ".com" suffix has become a brandname of sorts that
159	signifies that the resource being marketed is "Internet savvy".
160	Additionally, marketing desires identifiers that are short and
161	not syntactically complex. It should be very easy for either
162	the user or marketer to use the same identifier for radio and
163	television as well as the Internet. For example, Network Solutions
164	does not like to use "NSI" as an identifier because it
165	does not convey meaning. The string "Network Solutions" is preferable.
166	In existing Internet identifiers the space (ASCII 0x20) character
167	is problematic. A marketing campaign should not have to know this
168	or change their techniques just to advertise on the Internet.

170	Once a marketing campaign begins to use a slogan or name in
171	the public, that name or slogan takes on value as a trademark.
172	Trademark law makes one very important assumption: a mark can be used
173	by two different entities as long as they are either geographically
174	separate or exist in two distinctly different industry segments.
175	There are exceptions to this of course (federal anti-dilution
176	laws) but by and large it is how trademarks have been used
177	for hundreds of years. Any system that hopes to be usable by
178	a marketing campaign must also be capable of co-existing to some
179	degree with existing trademark law. This means any identifier should
180	be capable of being used by two separate entities. It
181	also means that in order to create the geographic and
182	industry specific segmentation, the user should be able to
183	specify these components when requesting the resource for
184	the identifier.

186	4. Scope

188	Each of these user communities, when asked, would probably suggest a rather
189	expansive system that would normally be characterized as a full directory
190	service.  The task here is to decide which of those features are
191	required and which are out of scope.

193	One feature that the end-user will probably request is that the system
194	allow for keyword searches on the data returned by an identifier or that
195	the HFIs be organized into some topical hierarchy to be used for browsing.
196	These features are simply too elaborate and would turn the entire system
197	into a simple search engine. These already exist and should not be
198	standardized as part of this problem.

200	Another feature that the marketing and trademark owners would prefer is
201	that the system itself protect trademarks by inserting legal/business
202	logic deep into the resolution system. Due to the massive differences
203	in legal systems, customs and user expectations, this is simply impossible
204	to do with current technology. Thus, all decisions about what entity is
205	allowed to insert which identifiers are a policy issue to be decided
206	by those entities that participate in the system. In other words, this
207	system is not a generic trademark enforcement mechanism any more than
208	the printing press is. Trademark disputes are still adjudicated within
209	the legal system. This system should merely reflect that, not enforce it.

211	Succinctly stated, the requirements that are considered out of scope are
212	generic search/navigation and trademark enforcement.

214	5. Requirements

216	The requirements that are left are fairly simple and should allow for
217	a system that can be implemented but that still solves enough problems
218	to be useful.

220	Shortness

222	The identifier should be short so that those dealing with marketing
223	and media can create very short identifiers that users can remember
224	easily.

226	Internationalization

228	The identifier should be fully internationalized. This includes
229	matching semantics for left-right, right-left, top-bottom orientation;
230	multi-language soundex, etc.

232	N-to-N mapping

234	A single identifier should be capable of being used by two separate
235	entities. Conversely, an entity should be capable of having more than
236	one identifier.
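
The N-to-N requirement above can be illustrated with a minimal sketch.
The Registry class, its method names, and the entity labels below are
hypothetical and are not part of any proposed protocol; the sketch only
shows that one identifier may map to several entities and one entity to
several identifiers.

```python
from collections import defaultdict


class Registry:
    """Hypothetical in-memory N-to-N map between identifiers and entities."""

    def __init__(self):
        self._by_identifier = defaultdict(set)
        self._by_entity = defaultdict(set)

    def register(self, identifier, entity):
        # Record the pairing in both directions so either side can be looked up.
        self._by_identifier[identifier].add(entity)
        self._by_entity[entity].add(identifier)

    def entities_for(self, identifier):
        # Return a copy; unknown identifiers yield an empty set.
        return set(self._by_identifier.get(identifier, ()))

    def identifiers_for(self, entity):
        return set(self._by_entity.get(entity, ()))


registry = Registry()
# One identifier shared by two entities (labels are made up for illustration).
registry.register("Tide", "laundry-detergent-maker")
registry.register("Tide", "oceanography-institute")
# One entity known by two identifiers.
registry.register("Coke", "coca-cola-company")
registry.register("Coca-Cola", "coca-cola-company")
```

A lookup of "Tide" in this sketch returns both entities, and a lookup of
the hypothetical "coca-cola-company" entity returns both of its names.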

238	Matching semantics

240	At the least, substring matches are required. Other methods
241	of matching should be evaluated based on performance and ability
242	to give the user an accurate result set.
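
As a rough illustration of the minimum matching requirement, the sketch
below performs a substring match over a list of identifiers. Ignoring
case is an assumption made here for the example and is not stated in the
requirement; the function name and the sample list are hypothetical.

```python
def substring_matches(query, identifiers):
    """Return the identifiers that contain query as a substring.

    Case is ignored via str.casefold(); this is an assumption of the
    sketch, not a stated requirement.
    """
    needle = query.casefold()
    return [name for name in identifiers if needle in name.casefold()]


# Sample identifiers taken from the examples used elsewhere in this document.
known = ["Network Solutions", "Coca-Cola", "Coke", "Tide"]
matches = substring_matches("co", known)
```

Other matching methods (for example phonetic matching) would replace the
containment test while keeping the same input and result shape.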

244	User level context

246	The client should be able to communicate to the resolution service
247	its geographic and semantic context so that matches can be ranked
248	according to location and relevance to the user's current context.
249	The system should be capable of conveying other contexts on a
250	per-application basis.
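
One way to picture context-based ranking is sketched below. The Record
fields, the region and category values, and the scoring rule (one point
per matching context field) are all assumptions made for illustration;
the requirement itself does not prescribe how context is encoded or how
matches are scored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    identifier: str
    entity: str
    region: str    # hypothetical geographic context tag
    category: str  # hypothetical semantic context tag


def rank(records, region=None, category=None):
    """Order records so those matching more of the client's context come first."""

    def score(record):
        # Each matching context field contributes one point.
        return (record.region == region) + (record.category == category)

    # sorted() is stable, so records with equal scores keep their input order.
    return sorted(records, key=score, reverse=True)


records = [
    Record("Tide", "detergent-brand", "US", "household"),
    Record("Tide", "ocean-tide-tables", "US", "science"),
    Record("Tide", "tide-cafe", "FR", "restaurant"),
]
ranked = rank(records, region="US", category="science")
```

In this sketch no record is discarded; context only changes the order,
which keeps the non-unique identifier usable when the client supplies no
context at all.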

252	Hierarchy

254	The identifier should be capable of expressing hierarchy. In some
255	cases it makes sense for an identifier to appear to belong to a
256	hierarchy. But this is merely a capability. It is not a hierarchy.
257	It is expected that hierarchical identifiers will be a distinct
258	minority.

260	Openness

262	The system should allow for end-users to insert their own identifiers
263	into the system in an open manner.

265	Quality of Service

267	The user should be presented with some simple system for understanding
268	whether the identifier was created by an entity that puts a higher
269	quality of service on the data represented by the identifier. The basic
270	level of service is where any entity can insert any identifier so
271	long as no guarantees are made about that identifier's legal or commercial
272	status.  The highest level of service is where the identifier is
273	guaranteed to be a legal trademark in all of the specified contexts and
274	the data returned by the service is guaranteed to be complete and accurate.

276	Distributed

278	While the namespace is inherently flat, the top level servers must be
279	distributable and should only contain referrals to servers where the
280	actual data is stored.

282	Data representation

284	The data returned to the client should be in a format that allows for
285	fairly rich content but that does not require the content to be rich.

287	6. Conclusions

289	These requirements define a problem space that currently does not have
290	a solution. They do, on the other hand, describe a problem that is
291	solvable using existing or easily developed/evolved technology.

293	7. Author Contact Information

295	Michael Mealling
296	Network Solutions
297	505 Huntmar Park Drive
298	Herndon, VA 22070
299	voice: (703) 742-0400
300	fax: (703) 742-9552
301	email: michaelm@rwhois.net