INTERNET-DRAFT                                          I. Nejgebauer
draft-nejgebauer-numeric-locators-00.txt       University of Novi Sad
Expires September 2003                                     March 2003


          Numeric Locators for Uniform Resource Identifiers

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   This document describes the syntax of numeric character strings,
   called numeric locators or simply locators, which form a
   hierarchical name space where any node except the root can (or, if
   it is a leaf node, must) resolve to a single Uniform Resource
   Identifier. It also defines a mapping of the locator name space to
   the Domain Name System, which allows the use of the existing DNS
   infrastructure to resolve locators.

1. Introduction

   Accessing an Internet resource, such as a World Wide Web page,
   often involves keying in its Uniform Resource Identifier, which is
   an alphanumeric string. For a large number of mobile devices,
   equipped only with a numeric keypad, alphabetic text entry is
   cumbersome, slow and error-prone.
   Punctuation characters typically encountered in URIs, such as the
   dot and the forward slash, present an additional obstacle, as their
   location on the keypad is not standardized.

   One way to alleviate the problem of manual URI entry is to
   eliminate it altogether by presenting the user with an organized
   collection of hyperlinks to commonly accessed resources. This
   approach works well in certain contexts, but it is too constraining
   for a general solution; necessarily so, since boundless growth of a
   hyperlink collection would quickly make it unmanageable.

   This document introduces a method for indirect specification of
   URIs through strings called (numeric) locators, which have the
   following characteristics:

   - Numeric representation. Limiting the character set to decimal
     digits makes manual entry of locators much easier on devices with
     a numeric keypad. Locator syntax is formally described in
     Section 2.

   - Hierarchical organization. Locators form a tree-structured name
     space closely resembling, by design, the structure of the Domain
     Name System. This makes it possible to define a 1-to-1 mapping
     between locators and domain names, which is done in Section 3.

   - Compactness. With the organization proposed in Appendix A, a
     typical locator should be no more than 10-12 digits long,
     comparable to the length of an international telephone number.
     Length can be further reduced by using relative locators,
     specified in Section 2.

   Conceptually, a locator can point to a URI, a set of lower-level
   locators, or both. The root node of the locator name space can only
   contain pointers to first-level locators, and every leaf node must
   resolve to a URI. Only one URI can be associated with a node, while
   the number of lower-level locators is unlimited.

   The task of retrieving the URI corresponding to a given locator is
   performed by a software component called a resolver.
   Several aspects of resolver operation are determined by this
   specification: syntax validation of a locator, its transformation
   to a domain name, and the subsequent DNS query for that name. A
   resolver can be implemented either locally to a device or as a
   remote procedure, and coupled with the user interface in a variety
   of ways; both considerations are outside the scope of this
   document.

2. Locator Construction and Syntax

   A locator is a hierarchical name composed of a series of labels,
   each representing one level of the hierarchy. Labels are written
   left to right, from the least specific (closest to the root) to the
   most specific (farthest from the root). The digit zero is used as
   the label separator, leaving the other nine digits for label text.
   Labels are one to 63 digits in length; an implicit zero-length
   label denotes the root.

   The digits used for the construction of locators are taken from the
   US-ASCII character set, so the length of a locator in digits equals
   its length in octets. This does not preclude the use of other sets
   of decimal numerals in the user interface, but in such cases
   conversion to the US-ASCII representation is required before the
   locator is passed on to the resolver.

   The formal locator syntax is defined by a BNF-like grammar, taken
   from [RFC2396]. Briefly, rules are separated from definitions by an
   equal sign "=", literals are quoted with "", parentheses "(" and
   ")" are used to group elements, alternatives are designated by "|",
   and elements may be preceded with n* to designate n or more
   repetitions of the following element; n defaults to 0.

   There are two forms of locators: absolute and relative. Without
   explicit qualification, a locator is understood to be absolute.

   An absolute locator starts at the root and ends with the most
   specific label. The total length of a locator is limited to 240
   digits.

      locator          = 1*(separator label)
      separator        = "0"
      label            = 1*label_text
      label_text       = "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                         "8" | "9"

   A relative locator starts with a label at some level below the root
   and continues to the most specific label. It can be transformed to
   an absolute locator by prepending an appropriately formatted
   string, called an origin.

      relative_locator = label *(separator label)
      origin           = *(separator label) separator

   The relative-to-absolute transformation is performed by the
   resolver, which has to be configured with the appropriate value for
   the origin. In the absence of such configuration, the resolver must
   signal an error to its caller if presented with a relative locator.

3. Locators and the Domain Name System

   Locators are designed for integration into the DNS, hence the
   similarity in terminology and the compatibility of label syntax and
   length limits. A locator can be unambiguously transformed to a
   domain name if the following differences between locators and
   domain names are compensated for:

   - The labels of a locator are written from the least specific to
     the most specific, exactly the reverse of the order specified in
     [RFC1034] for the labels of a domain name.

   - Locators use the digit zero as a separator, while domain names
     use the dot.

   - All locator labels start with a digit, while Section 2.1 of
     [RFC1123] requires the highest-level label of a domain name to
     start with an alphabetic character, in order to make domain names
     distinguishable from dotted-decimal IPv4 addresses.

   Taking these issues into account, the transformation of a locator
   to a domain name consists of the following conceptual operations:

   - The order of the labels is reversed. Each label is followed by a
     separator.

   - The digit zero is replaced with a dot throughout the resulting
     string.

   - An alphabetic label is appended to the resulting name.
     That label represents a special top-level domain which is
     necessary to anchor the locator-derived names to the DNS
     hierarchy. This specification uses the LOCATOR top-level domain
     for that purpose.

   For example, transforming the locator:

      0110392675301

   produces the domain name:

      1.3926753.11.locator

   The URI corresponding to a locator is stored in the DNS as the
   value of the TXT RR (see [RFC1035]) associated with the domain name
   derived from that locator. After computing the domain name from a
   locator, the resolver issues a DNS query for that name with QTYPE
   set to TXT, acting as a DNS stub resolver. The standard resolution
   algorithm of [RFC1034] is followed, with the difference that a
   response with multiple TXT records is treated as an error.

   Below is an example of a complete DNS zone containing the mapping
   for the locator "0110392675301":

      $ORIGIN 3926753.11.locator.
      @       SOA   ns.example.com. hostmaster.example.com. (
                    2003030101 3600 3600 604800 86400 )
              NS    ns.example.com.
              NS    backup.example.com.
      1       TXT   "http://www.example.com/news"

   The appearance of names from the domain "example.com" in this zone
   is not arbitrary: the example follows the organization of
   Appendix A and assumes that the same administrative entity is the
   authority for "example.com" and "3926753.11.locator".

4. Security Considerations

   Locator resolution depends on the DNS and inherits its security
   concerns, such as spoofing and denial of service. The use of
   locators presents an additional opportunity for DNS-based attacks,
   since URI retrieval will typically be followed by a DNS query for
   the server named in the URI.

   URIs are a general resource naming mechanism, and might identify
   resources and functions local to the requestor, including those
   which, when accessed, invoke operations with non-negligible side
   effects (e.g., initiating a telephone call or sending a network
   message, both of which are charged for).
   A locator could resolve to a "dangerous" URI either through
   spoofing, or due to malicious or erroneous configuration.

References

   [RFC1034]  Mockapetris, P., "Domain Names - Concepts and
              Facilities", STD 13, RFC 1034, November 1987.

   [RFC1035]  Mockapetris, P., "Domain Names - Implementation and
              Specification", STD 13, RFC 1035, November 1987.

   [RFC1123]  Braden, R., "Requirements for Internet Hosts -
              Application and Support", STD 3, RFC 1123, October 1989.

   [RFC2396]  Berners-Lee, T., Fielding, R.T. and L. Masinter,
              "Uniform Resource Identifiers (URI): Generic Syntax",
              RFC 2396, August 1998.

Author's Address

   Ivan Nejgebauer
   University of Novi Sad
   ARMUNS
   Trg Dositeja Obradovica 5
   21000 Novi Sad
   Serbia and Montenegro

   Phone: +381 21 350 525
   EMail: ian@uns.ns.ac.yu

Appendix A: Organization of the Locator Name Space

   The locator name space is distinct from that of the DNS, and could
   be organized in an arbitrarily incompatible way. However, in the
   interests of easing the administrative burden of locator management
   and making the use of locators acceptable performance-wise, it is
   beneficial to keep the two name spaces closely aligned. To that
   end, two organizational rules for locator assignment and management
   are specified.

   The first rule sets the requirements for assigning a locator label
   to a requestor, namely:

   - The requestor must have an ordinary DNS domain registered in the
     DNS. This is intended to reduce the potential for locator abuse,
     and is the basis of the next requirement.

   - The zone of the requested locator label must have the same
     authority information (SOA MNAME and RNAME fields and NS RRs) as
     the zone of the ordinary DNS domain of the requestor.
     Furthermore, the NS RRs must have the same TTL values in both
     zones.

   The locators defined by the requestor are expected to identify URIs
   pointing to its DNS name space. If a locator is successfully
   resolved, the NS records for its zone will typically be cached by
   the DNS resolver performing the query, and available for querying
   the domain name of the server pointed to by the URI.

   The other rule defines a 1-to-1 mapping between top-level domains
   and first-level locator labels by providing a numeric equivalent
   for each TLD. Generic TLD (gTLD) mappings are shown in Table 1.

   Table 1: gTLD Numeric Equivalents

      com      11        aero     111
      edu      12        biz      112
      gov      13        coop     113
      int      14        info     114
      mil      15        museum   115
      net      16        name     116
      org      17        pro      117

   Labels starting with "1" are reserved for gTLD mapping.

   Country-code TLD (ccTLD) equivalents are three-digit strings,
   starting from "211" and proceeding sequentially. The next label in
   sequence is obtained by taking the integer value of a label,
   incrementing it until the result has no zeros in its decimal
   representation, and converting the resulting value back to a
   string.

   Mapping for the currently defined ccTLDs appears in Table 2.

   Table 2: ccTLD Numeric Equivalents

      ac 211    ad 212    ae 213    af 214    ag 215    ai 216
      al 217    am 218    an 219    ao 221    aq 222    ar 223
      as 224    at 225    au 226    aw 227    az 228    ba 229
      bb 231    bd 232    be 233    bf 234    bg 235    bh 236
      bi 237    bj 238    bm 239    bn 241    bo 242    br 243
      bs 244    bt 245    bv 246    bw 247    by 248    bz 249
      ca 251    cc 252    cd 253    cf 254    cg 255    ch 256
      ci 257    ck 258    cl 259    cm 261    cn 262    co 263
      cr 264    cu 265    cv 266    cx 267    cy 268    cz 269
      de 271    dj 272    dk 273    dm 274    do 275    dz 276
      ec 277    ee 278    eg 279    eh 281    er 282    es 283
      et 284    fi 285    fj 286    fk 287    fm 288    fo 289
      fr 291    ga 292    gd 293    ge 294    gf 295    gg 296
      gh 297    gi 298    gl 299    gm 311    gn 312    gp 313
      gq 314    gr 315    gs 316    gt 317    gu 318    gw 319
      gy 321    hk 322    hm 323    hn 324    hr 325    ht 326
      hu 327    id 328    ie 329    il 331    im 332    in 333
      io 334    iq 335    ir 336    is 337    it 338    je 339
      jm 341    jo 342    jp 343    ke 344    kg 345    kh 346
      ki 347    km 348    kn 349    kp 351    kr 352    kw 353
      ky 354    kz 355    la 356    lb 357    lc 358    li 359
      lk 361    lr 362    ls 363    lt 364    lu 365    lv 366
      ly 367    ma 368    mc 369    md 371    mg 372    mh 373
      mk 374    ml 375    mm 376    mn 377    mo 378    mp 379
      mq 381    mr 382    ms 383    mt 384    mu 385    mv 386
      mw 387    mx 388    my 389    mz 391    na 392    nc 393
      ne 394    nf 395    ng 396    ni 397    nl 398    no 399
      np 411    nr 412    nu 413    nz 414    om 415    pa 416
      pe 417    pf 418    pg 419    ph 421    pk 422    pl 423
      pm 424    pn 425    pr 426    ps 427    pt 428    pw 429
      py 431    qa 432    re 433    ro 434    ru 435    rw 436
      sa 437    sb 438    sc 439    sd 441    se 442    sg 443
      sh 444    si 445    sj 446    sk 447    sl 448    sm 449
      sn 451    so 452    sr 453    st 454    sv 455    sy 456
      sz 457    tc 458    td 459    tf 461    tg 462    th 463
      tj 464    tk 465    tm 466    tn 467    to 468    tp 469
      tr 471    tt 472    tv 473    tw 474    tz 475    ua 476
      ug 477    uk 478    um 479    us 481    uy 482    uz 483
      va 484    vc 485    ve 486    vg 487    vi 488    vn 489
      vu 491    wf 492    ws 493    ye 494    yt 495    yu 496
      za 497    zm 498    zw 499

   Policies for the assignment and management of labels below the
   first level are at the discretion of the authority governing their
   name space, as long as the requirements of the first rule are met.

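   The sequencing rule for ccTLD labels given in this appendix (take
   the integer value, increment until the decimal form has no zeros,
   convert back to a string) can be illustrated with a short Python
   sketch. It is not part of the specification; the names next_label
   and label_sequence are hypothetical and chosen for illustration
   only.

```python
from itertools import islice


def next_label(label: str) -> str:
    """Return the next label in sequence after `label`."""
    value = int(label) + 1
    # Keep incrementing until no digit of the decimal form is zero,
    # since zero is reserved as the label separator.
    while "0" in str(value):
        value += 1
    return str(value)


def label_sequence(start: str = "211"):
    """Yield labels in assignment order, beginning with `start`."""
    label = start
    while True:
        yield label
        label = next_label(label)


# The first eleven ccTLD labels, including the step from 219 to 221.
first_labels = list(islice(label_sequence(), 11))
```

   With these definitions, next_label("219") gives "221" and
   next_label("299") gives "311", which agrees with the "ao" and "gm"
   entries of Table 2.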
Full Copyright Statement

   Copyright (C) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain
   it or assist in its implementation may be prepared, copied,
   published and distributed, in whole or in part, without restriction
   of any kind, provided that the above copyright notice and this
   paragraph are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on
   an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.