idnits 2.17.1 draft-ietf-urn-naptr-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-26) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 917 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Abstract section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack an Authors' Addresses Section. ** There are 62 instances of too long lines in the document, the longest one being 6 characters in excess of 72. ** There are 5 instances of lines with control characters in the document. == There are 20 instances of lines with non-RFC2606-compliant FQDNs in the document. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 185: '...der in which records MUST be processed...' RFC 2119 keyword, line 191: '...s the order in which records SHOULD be...' RFC 2119 keyword, line 210: '... clients MUST skip NAPTR records whic...' RFC 2119 keyword, line 311: '...implementors SHOULD provide additional...' RFC 2119 keyword, line 452: '... MUST be processed to ensure cor...' (18 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 69 has weird spacing: '..., their britt...' == Line 361 has weird spacing: '... regexp repla...' == Line 413 has weird spacing: '... order pref...' == Line 426 has weird spacing: '...f flags serv...' == Line 579 has weird spacing: '...im-char ere ...' 
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '15' on line 846 looks like a reference -- Missing reference section? '1' on line 799 looks like a reference -- Missing reference section? '2' on line 802 looks like a reference -- Missing reference section? '3' on line 806 looks like a reference -- Missing reference section? '4' on line 818 looks like a reference -- Missing reference section? '5' on line 811 looks like a reference -- Missing reference section? '6' on line 815 looks like a reference -- Missing reference section? '7' on line 820 looks like a reference -- Missing reference section? '8' on line 824 looks like a reference -- Missing reference section? 'A-Z0-9' on line 557 looks like a reference -- Missing reference section? '9' on line 826 looks like a reference -- Missing reference section? '10' on line 829 looks like a reference -- Missing reference section? '11' on line 832 looks like a reference -- Missing reference section? '12' on line 835 looks like a reference -- Missing reference section? '13' on line 838 looks like a reference -- Missing reference section? '14' on line 843 looks like a reference -- Missing reference section? 
'0' on line 685 looks like a reference Summary: 13 errors (**), 0 flaws (~~), 8 warnings (==), 19 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 INTERNET DRAFT Ron Daniel 3 draft-ietf-urn-naptr-04.txt Los Alamos National Laboratory 4 Michael Mealling 5 Network Solutions, Inc. 6 20 March, 1997 8 Resolution of Uniform Resource Identifiers 9 using the Domain Name System 11 Status of this Memo 12 =================== 14 This document is an Internet-Draft. Internet-Drafts are working 15 documents of the Internet Engineering Task Force (IETF), its 16 areas, and its working groups. Note that other groups may also 17 distribute working documents as Internet-Drafts. 19 Internet-Drafts are draft documents valid for a maximum of six 20 months and may be updated, replaced, or obsoleted by other 21 documents at any time. It is inappropriate to use Internet- 22 Drafts as reference material or to cite them other than as 23 ``work in progress.'' 25 To learn the current status of any Internet-Draft, please check 26 the ``1id-abstracts.txt'' listing contained in the Internet- 27 Drafts Shadow Directories on ftp.is.co.za (Africa), 28 nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), 29 ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast). 31 This draft expires 26 Sept., 1997. 33 Abstract: 34 ========= 36 Uniform Resource Locators (URLs) are the foundation of the World Wide 37 Web, and are a vital Internet technology. However, they have proven to 38 be brittle in practice. The basic problem is that URLs typically 39 identify a particular path to a file on a particular host. There is no 40 graceful way of changing the path or host once the URL has been 41 assigned. Neither is there a graceful way of replicating the resource 42 located by the URL to achieve better network utilization and/or fault 43 tolerance. 
Uniform Resource Names (URNs) have been hypothesized as an 44 adjunct to URLs that would overcome such problems. URNs and URLs are 45 both instances of a broader class of identifiers known as Uniform 46 Resource Identifiers (URIs). 48 The requirements document for URN resolution systems[15] defines the concept 49 of a "resolver discovery service". This document describes the first, 50 experimental, RDS. It is implemented by a new DNS Resource Record, NAPTR 51 (Naming Authority PoinTeR), that provides rules for mapping parts of URIs 52 to domain names. By changing the mapping rules, we can change the host 53 that is contacted to resolve a URI. This will allow a more graceful 54 handling of URLs over long time periods, and forms the foundation for a 55 new proposal for Uniform Resource Names. 57 In addition to locating resolvers, the NAPTR provides for other naming 58 systems to be grandfathered into the URN world, provides independence 59 between the name assignment system and the resolution protocol system, 60 and allows multiple services (Name to Location, Name to Description, 61 Name to Resource, ...) to be offered. In conjunction with the SRV RR, 62 the NAPTR record allows those services to be replicated for the purposes 63 of fault tolerance and load balancing. 65 Introduction: 66 ============= 68 Uniform Resource Locators have been a significant advance in retrieving 69 Internet-accessible resources. However, their brittle nature over time 70 has been recognized for several years. The Uniform Resource Identifier 71 working group proposed the development of Uniform Resource Names to serve 72 as persistent, location-independent identifiers for Internet resources 73 in order to overcome most of the problems with URLs. RFC-1737 [1] sets 74 forth requirements on URNs. 76 During the lifetime of the URI-WG, a number of URN proposals were 77 generated.
The developers of several of those proposals met in a series 78 of meetings, resulting in a compromise known as the Knoxville framework. 79 The major principle behind the Knoxville framework is that the resolution 80 system must be separate from the way names are assigned. This is in 81 marked contrast to most URLs, which identify the host to contact and 82 the protocol to use. Readers are referred to [2] for background on the 83 Knoxville framework and for additional information on the context and 84 purpose of this proposal. 86 Separating the way names are resolved from the way they are constructed 87 provides several benefits. It allows multiple naming approaches and 88 resolution approaches to compete, as it allows different protocols and 89 resolvers to be used. There is just one problem with such a separation - 90 how do we resolve a name when it can't give us directions to its 91 resolver? 93 For the short term, DNS is the obvious candidate for the resolution 94 framework, since it is widely deployed and understood. However, it is 95 not appropriate to use DNS to maintain information on a per-resource 96 basis. First of all, DNS was never intended to handle that many 97 records. Second, the limited record size is inappropriate for catalog 98 information. Third, domain names are not appropriate as URNs. 100 Therefore our approach is to use DNS to locate "resolvers" that can 101 provide information on individual resources, potentially including the 102 resource itself. To accomplish this, we "rewrite" the URI into a domain 103 name following the rules provided in NAPTR records. Rewrite rules 104 provide considerable power, which is important when trying to meet the 105 goals listed above. However, collections of rules can become difficult 106 to understand. To lessen this problem, the NAPTR rules are *always* 107 applied to the original URI, *never* to the output of previous rules. 
109 Locating a resolver through the rewrite procedure may take multiple 110 steps, but the beginning is always the same. The start of the URI 111 is scanned to extract its colon-delimited prefix. (For URNs, the 112 prefix is always "urn:" and we extract the following colon-delimited 113 namespace identifier. [3]). NAPTR resolution begins by taking the 114 extracted string, appending the well-known suffix ".urn.net", and 115 querying the DNS for NAPTR records at that domain name. Based on the 116 results of this query, zero or more additional DNS queries may be 117 needed to locate resolvers for the URI. The details of the conversation 118 between the client and the resolver thus located are outside the bounds 119 of this draft. Three brief examples of this procedure are given in the 120 next section. 122 The NAPTR RR provides the level of indirection needed to keep the 123 naming system independent of the resolution system, its protocols, and 124 services. Coupled with the new SRV resource record proposal[4] there 125 is also the potential for replicating the resolver on multiple hosts, 126 overcoming some of the most significant problems of URLs. This is an 127 important and subtle point. Not only do the NAPTR and SRV records allow 128 us to replicate the resource, we can replicate the resolvers that know 129 about the replicated resource. Preventing a single point of failure at 130 the resolver level is a significant benefit. Separating the resolution 131 procedure from the way names are constructed has additional benefits. 132 Different resolution procedures can be used over time, and resolution 133 procedures that are determined to be useful can be extended to deal 134 with additional namespaces. 136 Caveats 137 ======= 139 The NAPTR proposal is the first resolution procedure to be considered 140 by the URN-WG. 
There are several concerns about the proposal which have 141 motivated the group to recommend it for publication as an Experimental 142 rather than a standards-track RFC. 144 First, URN resolution is new to the IETF and we wish to gain 145 operational experience before recommending any procedure for the 146 standards track. Second, the NAPTR proposal is based on DNS and 147 consequently inherits concerns about security and administration. The 148 recent advancement of the DNSSEC and secure update drafts to Proposed 149 Standard reduces these concerns, but we wish to experiment with those 150 new capabilities in the context of URN administration. A third area of 151 concern is the potential for a noticeable impact on the DNS. We 152 believe that the proposal makes appropriate use of caching and 153 additional information, but it is best to go slow where the potential 154 for impact on a core system like the DNS is concerned. Fourth, the 155 rewrite rules in the NAPTR proposal are based on regular expressions. 156 Since regular expressions are difficult for humans to construct 157 correctly, concerns exist about the usability and maintainability of 158 the rules. This is especially true where international character sets 159 are concerned. Finally, the URN-WG is developing a requirements document 160 for URN Resolution Services[15], but that document is not complete. That 161 document needs to precede any resolution service proposals on the standards 162 track. 164 Terminology 165 =========== 167 "Must" or "Shall" - Software that does not behave in the manner that this 168 document says it must is not conformant to this document. 169 "Should" - Software that does not follow the behavior that this document 170 says it should may still be conformant, but is probably broken 171 in some fundamental way. 172 "May" - Implementations may or may not provide the described behavior, 173 while still remaining conformant to this document.
175 Brief overview and examples of the NAPTR RR: 176 ============================================ 178 A detailed description of the NAPTR RR will be given later, but to give 179 a flavor for the proposal we first give a simple description of the 180 record and three examples of its use. 182 The key fields in the NAPTR RR are order, preference, service, flags, 183 regexp, and replacement: 185 * The order field specifies the order in which records MUST be processed 186 when multiple NAPTR records are returned in response to a single query. 187 A naming authority may have delegated a portion of its namespace to 188 another agency. Evaluating the NAPTR records in the correct order is 189 necessary for delegation to work properly. 191 * The preference field specifies the order in which records SHOULD be 192 processed when multiple NAPTR records have the same value of "order". 193 This field lets a service provider specify the order in which resolvers 194 are contacted, so that more capable machines are contacted in preference 195 to less capable ones. 197 * The service field specifies the resolution protocol and resolution 198 service(s) that will be available if the rewrite specified by the 199 regexp or replacement fields is applied. Resolution protocols are 200 the protocols used to talk with a resolver. They will be specified in 201 other documents, such as [5]. Resolution services are operations such 202 as N2R (URN to Resource), N2L (URN to URL), N2C (URN to URC), etc. 203 These will be discussed in the URN Resolution Services document[6], and 204 their behavior in a particular resolution protocol will be given in 205 the specification for that protocol (see [5] for a concrete example). 207 * The flags field contains modifiers that affect what happens in the 208 next DNS lookup, typically for optimizing the process. 
Flags may also 209 affect the interpretation of the other fields in the record; therefore, 210 clients MUST skip NAPTR records which contain an unknown flag value. 212 * The regexp field is one of two fields used for the rewrite rules, and 213 is the core concept of the NAPTR record. The regexp field is a String 214 containing a sed-like substitution expression. (The actual grammar 215 for the substitution expressions is given later in this draft). The 216 substitution expression is applied to the original URN to determine 217 the next domain name to be queried. The regexp field should be used 218 when the domain name to be generated is conditional on information in 219 the URI. If the next domain name is always known, which is 220 anticipated to be a common occurrence, the replacement field should 221 be used instead. 223 * The replacement field is the other field that may be used for the 224 rewrite rule. It is an optimization of the rewrite process for the 225 case where the next domain name is fixed instead of being conditional 226 on the content of the URI. The replacement field is a domain name 227 (subject to compression if a DNS sender knows that a given recipient 228 is able to decompress names in this RR type's RDATA field). If the 229 rewrite is more complex than a simple substitution of a domain name, 230 the replacement field should be set to "." and the regexp field used. 232 Note that the client applies all the substitutions and performs all 233 lookups; they are not performed in the DNS servers. Note also that it 234 is the belief of the developers of this document that regexps should 235 rarely be used. The replacement field seems adequate for the vast 236 majority of situations. Regexps are only necessary when portions of a 237 namespace are to be delegated to different resolvers. Finally, note 238 that the regexp and replacement fields are, at present, mutually 239 exclusive.
However, developers of client software should be aware that 240 a new flag might be defined which requires values in both fields. 242 Example 1 243 --------- 245 Consider a URN that uses the hypothetical DUNS namespace. DUNS numbers 246 are identifiers for approximately 30 million registered businesses 247 around the world, assigned and maintained by Dun and Bradstreet. The 248 URN might look like: 250 urn:duns:002372413:annual-report-1997 252 The first step in the resolution process is to find out about the DUNS 253 namespace. The namespace identifier, "duns", is extracted from the URN, 254 prepended to urn.net, and the NAPTRs for duns.urn.net are looked up. The query 255 might return records of the form: 257 duns.urn.net 258 ;; order pref flags service regexp replacement 259 IN NAPTR 100 10 "s" "dunslink+N2L+N2C" "" dunslink.udp.isi.dandb.com 260 IN NAPTR 100 20 "s" "rcds+N2C" "" rcds.udp.isi.dandb.com 261 IN NAPTR 100 30 "s" "http+N2L+N2C+N2R" "" http.tcp.isi.dandb.com 263 The order field contains equal values, indicating that no name 264 delegation order has to be followed. The preference field indicates 265 that the provider would like clients to use the special dunslink 266 protocol, followed by the RCDS protocol, and that HTTP is offered as a 267 last resort. All the records specify the "s" flag, which will be 268 explained momentarily. The service fields say that if we speak 269 dunslink, we will be able to issue either the N2L or N2C requests to 270 obtain a URL or a URC (description) of the resource. The Resource 271 Cataloging and Distribution Service (RCDS)[7] could be used to get a 272 URC for the resource, while HTTP could be used to get a URL, URC, or 273 the resource itself. All the records supply the next domain name to 274 query; none of them needs to be rewritten with the aid of regular 275 expressions. 277 The general case might require multiple NAPTR rewrites to locate a 278 resolver, but eventually we will come to the "terminal NAPTR".
Once we 279 have the terminal NAPTR, our next probe into the DNS will be for a SRV 280 or A record instead of another NAPTR. Rather than probing for a 281 non-existent NAPTR record to terminate the loop, the flags field is 282 used to indicate a terminal lookup. If it has a value of "s", the next 283 lookup should be for SRV RRs; "a" denotes that A records should be sought. 284 A "p" flag is also provided to indicate that the next action is 285 Protocol-specific, but that looking up another NAPTR will not be part 286 of it. 288 Since our example RR specified the "s" flag, it was terminal. Assuming 289 our client does not know the dunslink protocol, our next action is to 290 look up SRV RRs for rcds.udp.isi.dandb.com, which will tell us hosts that 291 can provide the necessary resolution service. That lookup might return: 293 ;; Pref Weight Port Target 294 rcds.udp.isi.dandb.com IN SRV 0 0 1000 defduns.isi.dandb.com 295 IN SRV 0 0 1000 dbmirror.com.au 296 IN SRV 0 0 1000 ukmirror.com.uk 298 telling us three hosts that could actually do the resolution, and 299 giving us the port we should use to talk to their RCDS server. 300 (The reader is referred to the SRV proposal [4] for the interpretation 301 of the fields above). 303 There is opportunity for significant optimization here. We can return 304 the SRV records as additional information for terminal NAPTRs (and the 305 A records as additional information for those SRVs). While this 306 recursive provision of additional information is not explicitly blessed 307 in the DNS specifications, it is not forbidden, and BIND does take 308 advantage of it [8]. This is a significant optimization. In conjunction 309 with a long TTL for *.urn.net records, the average number of probes to 310 DNS for resolving DUNS URNs would approach one. Therefore, DNS server 311 implementors SHOULD provide additional information with NAPTR 312 responses. The additional information will be either SRV or A records.
313 If SRV records are available, their A records should be provided as 314 recursive additional information. 316 Note that the example NAPTR records above are intended to represent the 317 reply the client will see. They are not quite identical to what the 318 domain administrator would put into the zone files. For one thing, the 319 administrator should supply the trailing '.' character on any FQDNs. 321 Example 2 322 --------- 324 Consider a URN namespace based on MIME Content-Ids. The URN might look 325 like this: 327 urn:cid:199606121851.1@mordred.gatech.edu 329 (Note that this example is chosen for pedagogical purposes, and does 330 not conform to the recently-approved CID URL scheme.) 332 The first step in the resolution process is to find out about the CID 333 namespace. The namespace identifier, cid, is extracted from the URN, 334 prepended to urn.net, and the NAPTR for cid.urn.net is looked up. The query might 335 return records of the form: 337 cid.urn.net 338 ;; order pref flags service regexp replacement 339 IN NAPTR 100 10 "" "" "/urn:cid:.+@([^\.]+\.)(.*)$/\2/i" . 341 We have only one NAPTR response, so ordering the responses is not 342 a problem. The replacement field is empty, so we check the regexp 343 field and use the pattern provided there. We apply that regexp to the 344 entire URN to see if it matches, which it does. The \2 part of the 345 substitution expression returns the string "gatech.edu". Since 346 the flags field does not contain "s" or "a", the lookup is not terminal 347 and our next probe to DNS is for more NAPTR records: lookup(query=NAPTR, 348 "gatech.edu"). 350 Note that the rule does not extract the full domain name from the CID; 351 instead it assumes the CID comes from a host and extracts its domain. 352 While all hosts, such as mordred, could have their very own NAPTR, 353 maintaining those records for all the machines at a site as large as 354 Georgia Tech would be an intolerable burden.
Wildcards are not appropriate 355 here since they only return results when there are no exactly matching 356 names already in the system. 358 The record returned from the query on "gatech.edu" might look like: 360 gatech.edu IN NAPTR 361 ;; order pref flags service regexp replacement 362 IN NAPTR 100 50 "s" "z3950+N2L+N2C" "" z3950.tcp.gatech.edu 363 IN NAPTR 100 50 "s" "rcds+N2C" "" rcds.udp.gatech.edu 364 IN NAPTR 100 50 "s" "http+N2L+N2C+N2R" "" http.tcp.gatech.edu 366 Continuing with our example, we note that the values of the order and 367 preference fields are equal in all records, so the client is free to 368 pick any record. The flags field tells us that these are the last NAPTR 369 patterns we should see, and after the rewrite (a simple replacement in 370 this case) we should look up SRV records to get information on the 371 hosts that can provide the necessary service. 373 Assuming we prefer the Z39.50 protocol, our lookup might return: 375 ;; Pref Weight Port Target 376 z3950.tcp.gatech.edu IN SRV 0 0 1000 z3950.gatech.edu 377 IN SRV 0 0 1000 z3950.cc.gatech.edu 378 IN SRV 0 0 1000 z3950.uga.edu 380 telling us three hosts that could actually do the resolution, and 381 giving us the port we should use to talk to their Z39.50 server. 383 Recall that the regular expression used \2 to extract a domain name 384 from the CID, and \. for matching the literal '.' characters separating 385 the domain name components. Since '\' is the escape character, literal 386 occurrences of a backslash must be escaped by another backslash. For the 387 case of the cid.urn.net record above, the regular expression entered 388 into the zone file should be "/urn:cid:.+@([^\\.]+\\.)(.*)$/\\2/i". 389 When the client code actually receives the record, the pattern will 390 have been converted to "/urn:cid:.+@([^\.]+\.)(.*)$/\2/i". 392 Example 3 393 --------- 395 Even if URN systems were in place now, there would still be a 396 tremendous number of URLs.
It should be possible to develop a URN 397 resolution system that can also provide location independence for those 398 URLs. This is related to the requirement in [1] to be able to 399 grandfather in names from other naming systems, such as ISO Formal 400 Public Identifiers, Library of Congress Call Numbers, ISBNs, ISSNs, 401 etc. 403 The NAPTR RR could also be used for URLs that have already been assigned. 404 Assume we have the URL for a very popular piece of software that the 405 publisher wishes to mirror at multiple sites around the world: 407 http://www.foo.com/software/latest-beta.exe 409 We extract the prefix, "http", and lookup NAPTR records for 410 http.urn.net. This might return a record of the form 412 http.urn.net IN NAPTR 413 ;; order pref flags service regexp replacement 414 100 90 "" "" "!http://([^/:]+)!\1!i" . 416 This expression returns everything after the first double slash and 417 before the next slash or colon. (We use the '!' character to delimit the 418 parts of the substitution expression. Otherwise we would have to use 419 backslashes to escape the forward slashes, and would have a regexp in 420 the zone file that looked like "/http:\\/\\/([^\\/:]+)/\\1/i".). 422 Applying this pattern to the URL extracts "www.foo.com". Looking up NAPTR 423 records for that might return: 425 www.foo.com 426 ;; order pref flags service regexp replacement 427 IN NAPTR 100 100 "s" "http+L2R" "" http.tcp.foo.com 428 IN NAPTR 100 100 "s" "ftp+L2R" "" ftp.tcp.foo.com 430 Looking up SRV records for http.tcp.foo.com would return information 431 on the hosts that foo.com has designated to be its mirror sites. The 432 client can then pick one for the user. 434 NAPTR RR Format 435 =============== 437 The format of the NAPTR RR is given below. The DNS type code for 438 NAPTR is 35. 440 Domain TTL Class Order Preference Flags Service Regexp Replacement 442 where: 444 Domain 445 The domain name this resource record refers to. 
446 TTL 447 Standard DNS Time To Live field 448 Class 449 Standard DNS meaning 450 Order 451 A 16-bit integer specifying the order in which the NAPTR records 452 MUST be processed to ensure correct delegation of portions 453 of the namespace over time. Low numbers are processed before 454 high numbers, and once a NAPTR is found that "matches" a URN, 455 the client MUST NOT consider any NAPTRs with a higher value 456 for order. 458 Preference 459 A 16-bit integer which specifies the order in which NAPTR records 460 with equal "order" values SHOULD be processed, low numbers 461 being processed before high numbers. This is similar to the 462 preference field in an MX record, and is used so domain 463 administrators can direct clients towards more capable hosts 464 or lighter weight protocols. 466 Flags 467 A String giving flags to control aspects of the rewriting and 468 interpretation of the fields in the record. Flags are single 469 characters from the set [A-Z0-9]. The case of the alphabetic 470 characters is not significant. 472 At this time only three flags, "S", "A", and "P", are defined. 473 "S" means that the next lookup should be for SRV records instead 474 of NAPTR records. "A" means that the next lookup should be for A 475 records. The "P" flag says that the remainder of the resolution 476 shall be carried out in a Protocol-specific fashion, and we 477 should not do any more DNS queries. 479 The remaining alphabetic flags are reserved. The numeric flags 480 may be used for local experimentation. The S, A, and P flags are 481 all mutually exclusive, and resolution libraries MAY signal an 482 error if more than one is given. (Experimental code and code for 483 assisting in the creation of NAPTRs would be more likely to 484 signal such an error than a client such as a browser). We 485 anticipate that multiple flags will be allowed in the future, so 486 implementers MUST NOT assume that the flags field can only 487 contain 0 or 1 characters. 
Finally, if a client encounters a 488 record with an unknown flag, it MUST ignore it and move to the 489 next record. This test takes precedence even over the "order" 490 field. Since flags can control the interpretation placed on 491 fields, a novel flag might change the interpretation of the 492 regexp and/or replacement fields such that it is impossible to 493 determine if a record matched a URN. 495 Service 496 Specifies the resolution service(s) available down this rewrite 497 path. It may also specify the particular protocol that is used 498 to talk with a resolver. A protocol MUST be specified if the 499 flags field states that the NAPTR is terminal. If a protocol is 500 specified, but the flags field does not state that the NAPTR is 501 terminal, the next lookup MUST be for a NAPTR. The client MAY 502 choose not to perform the next lookup if the protocol is 503 unknown, but that behavior MUST NOT be relied upon. 505 The service field may take any of the values below (using the 506 Augmented BNF of RFC 822[9]): 508 service_field = [ [protocol] *("+" rs)] 509 protocol = ALPHA *31ALPHANUM 510 rs = ALPHA *31ALPHANUM 511 // The protocol and rs fields are limited to 32 512 // characters and must start with an alphabetic. 513 // The current set of "known" strings are: 514 // protocol = "rcds" / "thttp" / "hdl" / "rwhois" / "z3950" 515 // rs = "N2L" / "N2Ls" / "N2R" / "N2Rs" / "N2C" 516 // / "N2Ns" / "L2R" / "L2Ns" / "L2Ls" / "L2C" 518 i.e. an optional protocol specification followed by 0 or more 519 resolution services. Each resolution service is indicated by 520 an initial '+' character. 522 Note that the empty string is also a valid service field. This 523 will typically be seen at the top levels of a namespace, when it 524 is impossible to know what services and protocols will be offered 525 by a particular publisher within that name space. 
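The service field grammar above can be sketched as a small parser. This is only an illustrative sketch in Python, not part of the proposal: the function name parse_service_field and the choice to lowercase both the protocol and the resolution service labels (relying on the statement that case is not significant) are assumptions made for the example.

```python
import re

# One token of the grammar: ALPHA *31ALPHANUM (at most 32 characters,
# starting with an alphabetic character).
_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9]{0,31}\Z")


def parse_service_field(value):
    """Split a service field into (protocol, [resolution services]).

    Hypothetical helper: returns (None, []) for the empty string, which
    the grammar allows. Labels are lowercased because case is not
    significant; that normalization is a choice made for this sketch.
    """
    if value == "":
        return None, []
    protocol, *services = value.split("+")
    if protocol == "":
        protocol = None  # the protocol part is optional
    elif not _TOKEN.match(protocol):
        raise ValueError("bad protocol token: %r" % protocol)
    for rs in services:
        if not _TOKEN.match(rs):
            raise ValueError("bad resolution service token: %r" % rs)
    return (protocol.lower() if protocol else None,
            [rs.lower() for rs in services])


# Service field taken from the third record of Example 1.
print(parse_service_field("http+N2L+N2C+N2R"))
```

A client could compare the returned protocol against the protocols it speaks before deciding whether to follow a terminal record.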
527     At this time the known protocols are rcds[7], hdl[10] (binary,
528     UDP-based protocols), thttp[5] (a textual, TCP-based protocol),
529     rwhois[11] (textual, UDP or TCP based), and Z39.50[12] (binary,
530     TCP-based). More will be allowed later. The names of the
531     protocols must be formed from the characters [a-zA-Z0-9]. Case of
532     the characters is not significant.

534     The service requests currently allowed will be described in more
535     detail in [6], but in brief they are:
536         N2L  - Given a URN, return a URL
537         N2Ls - Given a URN, return a set of URLs
538         N2R  - Given a URN, return an instance of the resource.
539         N2Rs - Given a URN, return multiple instances of the resource,
540                typically encoded using multipart/alternative.
541         N2C  - Given a URN, return a collection of meta-information
542                on the named resource. The format of this response is
543                the subject of another document.
544         N2Ns - Given a URN, return all URNs that are also identifiers
545                for the resource.
546         L2R  - Given a URL, return the resource.
547         L2Ns - Given a URL, return all the URNs that are identifiers
548                for the resource.
549         L2Ls - Given a URL, return all the URLs for instances of
550                the same resource.
551         L2C  - Given a URL, return a description of the resource.

553     The actual format of the service request and response will be
554     determined by the resolution protocol, and is the subject for
555     other documents (e.g. [5]). Protocols need not offer all
556     services. The labels for service requests shall be formed from
557     the set of characters [A-Z0-9]. The case of the alphabetic
558     characters is not significant.

560  Regexp
561     A STRING containing a substitution expression that is applied to
562     the original URI in order to construct the next domain name to
563     lookup. The grammar of the substitution expression is given in
564     the next section.

566  Replacement
567     The next NAME to query for NAPTR, SRV, or A records depending on
568     the value of the flags field.
        As mentioned above, this may be
569     compressed.

571  Substitution Expression Grammar:
572  ================================

574  The content of the regexp field is a substitution expression. True sed(1)
575  substitution expressions are not appropriate for use in this application for a
576  variety of reasons, therefore the contents of the regexp field MUST follow the
577  grammar below:

579  subst_expr   = delim-char ere delim-char repl delim-char *flags
580  delim-char   = "/" / "!" / ... (Any non-digit or non-flag character other
581                 than backslash '\'. All occurrences of a delim-char in a
582                 subst_expr must be the same character.)
583  ere          = POSIX Extended Regular Expression (see [13], section 2.8.4)
584  repl         = dns_str / backref / repl dns_str / repl backref
585  dns_str      = 1*DNS_CHAR
586  backref      = "\" 1POS_DIGIT
587  flags        = "i"
588  DNS_CHAR     = "-" / "0" / ... / "9" / "a" / ... / "z" / "A" / ... / "Z"
589  POS_DIGIT    = "1" / "2" / ... / "9" ; 0 is not an allowed backref value
590  domain name (see RFC-1123 [14]).

592  The result of applying the substitution expression to the original URI MUST
593  result in a string that obeys the syntax for DNS host names [14]. Since it
594  is possible for the regexp field to be improperly specified, such that a
595  non-conforming host name can be constructed, client software SHOULD verify
596  that the result is a legal host name before making queries on it.

598  Backref expressions in the repl portion of the substitution expression
599  are replaced by the (possibly empty) string of characters enclosed by '('
600  and ')' in the ERE portion of the substitution expression. N is a single
601  digit from 1 through 9, inclusive. It specifies the N'th backref expression,
602  the one that begins with the N'th '(' and continues to the matching ')'.
603  For example, the ERE
604         (A(B(C)DE)(F)G)
605  has backref expressions:
606         \1 = ABCDEFG
607         \2 = BCDE
608         \3 = C
609         \4 = F
610         \5..\9 = error - no matching subexpression

612  The "i" flag indicates that the ERE matching SHALL be performed in a
613  case-insensitive fashion. Furthermore, any backref replacements MAY be
614  normalized to lower case when the "i" flag is given.

616  The first character in the substitution expression shall be used as the
617  character that delimits the components of the substitution expression.
618  There must be exactly three non-escaped occurrences of the delimiter
619  character in a substitution expression. Since escaped occurrences of
620  the delimiter character will be interpreted as occurrences of that
621  character, digits MUST NOT be used as delimiters. Backrefs would be
622  confused with literal digits were this allowed. Similarly, if flags are
623  specified in the substitution expression, the delimiter character must not
624  also be a flag character.

626  Advice to domain administrators:
627  ================================

629  Beware of regular expressions. Not only are they a pain to get
630  correct on their own, but there is the previously mentioned interaction
631  with DNS. Any backslashes in a regexp must be entered twice in a zone
632  file in order to appear once in a query response. More seriously, the
633  need for double backslashes has probably not been tested by all
634  implementors of DNS servers. We anticipate that urn.net will be the
635  heaviest user of regexps. Only when delegating portions of namespaces
636  should the typical domain administrator need to use regexps.

638  On a related note, beware of interactions with the shell when manipulating
639  regexps from the command line. Since '\' is a common escape character in
640  shells, there is a good chance that when you think you are saying "\\" you
641  are actually saying "\". Similar caveats apply to characters such as
642  '*', '(', etc.
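
Because regexps are easy to get wrong, it may help to test a
substitution expression offline before publishing it. The following is
a minimal sketch of the delimiter, backref, and "i" flag rules described
above, not a reference implementation. Assumptions: the function names
are hypothetical; Python's re module stands in for a POSIX ERE engine
(the two dialects differ in places, for example in how a backslash
inside a bracket expression is read); and only the matched portion of
the input is replaced, in the manner of sed(1). The caller is still
expected to check that the result is a legal host name.

```python
import re

# repl per the grammar above: DNS_CHARs and backrefs \1..\9 only.
_REPL = re.compile(r"(?:[-0-9A-Za-z]|\\[1-9])+")


def parse_subst_expr(expr):
    """Split 'delim ere delim repl delim flags' into (ere, repl, flags)."""
    if not expr:
        raise ValueError("empty substitution expression")
    delim = expr[0]
    # Digits, the backslash, and the flag character may not be delimiters.
    if delim.isdigit() or delim in ("\\", "i"):
        raise ValueError("illegal delimiter: %r" % delim)
    parts, current, pos = [], [], 1
    while pos < len(expr):
        char = expr[pos]
        if char == "\\" and pos + 1 < len(expr):
            # Escaped character (possibly the delimiter): keep it verbatim.
            current.append(expr[pos:pos + 2])
            pos += 2
        elif char == delim:
            parts.append("".join(current))
            current = []
            pos += 1
        else:
            current.append(char)
            pos += 1
    parts.append("".join(current))
    if len(parts) != 3:
        raise ValueError("expected exactly three unescaped delimiters")
    ere, repl, flags = parts
    if not _REPL.fullmatch(repl) or set(flags) - {"i"}:
        raise ValueError("repl or flags do not follow the grammar")
    return ere, repl, flags


def apply_subst_expr(expr, uri):
    """Return the rewritten string, or None if the ERE does not match."""
    ere, repl, flags = parse_subst_expr(expr)
    ignore_case = "i" in flags
    match = re.search(ere, uri, re.IGNORECASE if ignore_case else 0)
    if match is None:
        return None

    def expand(token):
        text = token.group(0)
        if not text.startswith("\\"):
            return text  # literal run of DNS_CHARs
        number = int(text[1])
        if number > match.re.groups:
            raise ValueError("no subexpression for backref %d" % number)
        value = match.group(number) or ""  # possibly empty
        # Lower-casing backrefs under "i" is permitted (MAY), not required.
        return value.lower() if ignore_case else value

    new_text = re.sub(r"\\[1-9]|[^\\]+", expand, repl)
    return uri[:match.start()] + new_text + uri[match.end():]
```

With the ERE from the example above, applying "/(A(B(C)DE)(F)G)/\3\4/"
to the string ABCDEFG gives CF. Note that this sketch takes the
expression as it appears in a query response, with single backslashes;
the zone file form would carry each backslash twice.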
644  The "A" flag allows the next lookup to be for A records rather than
645  SRV records. Since there is no place for a port specification in the
646  NAPTR record, when the "A" flag is used the specified protocol must
647  be running on its default port.

649  The URN Syntax draft defines a canonical form for each URN, which requires
650  %encoding characters outside a limited repertoire. The regular expressions
651  MUST be written to operate on that canonical form. Since international
652  character sets will end up with extensive use of %encoded characters,
653  regular expressions operating on them will be essentially impossible to
654  read or write by hand.

656  Usage
657  =====

659  For the edification of implementers, pseudocode for a client routine
660  using NAPTRs is given below. This code is provided merely as a
661  convenience; it does not have any weight as a standard way to process
662  NAPTR records. Also, as is the case with pseudocode, it has never been
663  executed and may contain logical errors. You have been warned.

665  //
666  // findResolver(URN)
667  // Given a URN, find a host that can resolve it.
668  //
669  findResolver(string URN) {
670    // prepend prefix to urn.net
671    sprintf(key, "%s.urn.net", extractNS(URN));
672    do {
673      rewrite_flag = false;
674      terminal = false;
675      if (key has been seen) {
676        quit with a loop detected error
677      }
678      add key to list of "seens"
679      records = lookup(type=NAPTR, key); // get all NAPTR RRs for 'key'

681      discard any records with an unknown value in the "flags" field.
682      sort NAPTR records by "order" field and "preference" field
683        (with "order" being more significant than "preference").
684      n_naptrs = number of NAPTR records in response.
685      curr_order = records[0].order;
686      max_order = records[n_naptrs-1].order;

688      // Process current batch of NAPTRs according to "order" field.
689      for (j=0; j < n_naptrs && records[j].order <= max_order; j++) {
690        if (unknown_flag) // skip this record and go to next one
691          continue;
692        newkey = rewrite(URN, records[j].replacement, records[j].regexp);
693        if (!newkey) // Skip to next record if the rewrite didn't match
694          continue;
695        // We did do a rewrite, shrink max_order to current value
696        // so that delegation works properly
697        max_order = records[j].order;
698        // Will we know what to do with the protocol and services
699        // specified in the NAPTR? If not, try next record.
700        if(!isKnownProto(records[j].services)) {
701          continue;
702        }
703        if(!isKnownService(records[j].services)) {
704          continue;
705        }

707        // At this point we have a successful rewrite and we will know
708        // how to speak the protocol and request a known resolution
709        // service. Before we do the next lookup, check some
710        // optimization possibilities.

712        flags = records[j].flags; // strcasecmp() returns 0 on a match
713        if (strcasecmp(flags, "S") == 0 || strcasecmp(flags, "P") == 0
714            || strcasecmp(flags, "A") == 0) {
715          terminal = true;
716          services = records[j].services;
717          addnl = any SRV and/or A records returned as additional info
718                  for records[j].
719        }
720        key = newkey;
721        rewrite_flag = true;
722        break;
723      }
724    } while (rewrite_flag && !terminal);

726    // Did we not find our way to a resolver?
727    if (!rewrite_flag) {
728      report an error
729      return NULL;
730    }

732    // Leave rest to another protocol?
733    if (strcasecmp(flags, "P") == 0) {
734      return key as host to talk to;
735    }

737    // If not, keep plugging
738    if (!addnl) { // No SRVs came in as additional info, look them up
739      srvs = lookup(type=SRV, key);
740    }

742    sort SRV records by preference, weight, ...
743    foreach (SRV record) { // in order of preference
744      try contacting srv[j].target using the protocol and one of the
745        resolution service requests from the "services" field of the
746        last NAPTR record.
747      if (successful)
748        return (target, protocol, service);
749        // Actually we would probably return a result, but this
750        // code was supposed to just tell us a good host to talk to.
751    }
752    die with an "unable to find a host" error;
753  }

755  Notes:
756  ======
757  - A client MUST process multiple NAPTR records in the order specified by
758    the "order" field; it MUST NOT simply use the first record that provides
759    a known protocol and service combination.
760  - If a record at a particular order matches the URI, but the client
761    doesn't know the specified protocol and service, the client SHOULD
762    continue to examine records that have the same order. The client
763    MUST NOT consider records with a higher value of order. This is
764    necessary to make delegation of portions of the namespace work.
765    The order field is what lets site administrators say "all requests for
766    URIs matching pattern x go to server 1, all others go to server 2".
767    (A match is defined as:
768      1) The NAPTR provides a replacement domain name
769      or
770      2) The regular expression matches the URN
771    )
772  - When multiple RRs have the same "order", the client should use
773    the value of the preference field to select the next NAPTR to
774    consider. However, because of preferred protocols or services,
775    estimates of network distance and bandwidth, etc., clients
776    may use different criteria to sort the records.
777  - If the lookup after a rewrite fails, clients are strongly encouraged
778    to report a failure, rather than backing up to pursue other rewrite
779    paths.
780  - When a namespace is to be delegated among a set of resolvers, regexps
781    must be used. Each regexp appears in a separate NAPTR RR. Administrators
782    should do as little delegation as possible, because of limitations on
783    the size of DNS responses.
784  - Note that SRV RRs impose additional requirements on clients.
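
The interplay of unknown flags, "order", and "preference" described in
the notes above can be sketched as follows. This is an illustration
under assumptions, not a normative algorithm: the Naptr class, the
function name usable_records, and the matches/usable callbacks are
hypothetical, and the sketch sorts strictly by preference within an
order even though clients may apply other criteria.

```python
from dataclasses import dataclass

# Flags this hypothetical client understands; all others are "unknown".
KNOWN_FLAGS = {"S", "A", "P"}


@dataclass
class Naptr:
    order: int
    preference: int
    flags: str
    service: str
    regexp: str
    replacement: str


def usable_records(records, matches, usable):
    """Return the records a client may act on, most preferred first.

    matches(record) -> bool: does the record match the URN?
    usable(record)  -> bool: does the client know its protocol/service?
    """
    # Unknown flags disqualify a record before "order" is even considered.
    known = [r for r in records if set(r.flags.upper()) <= KNOWN_FLAGS]
    known.sort(key=lambda r: (r.order, r.preference))
    matched_order = None
    result = []
    for record in known:
        # Once some record has matched, never look at a higher order.
        if matched_order is not None and record.order > matched_order:
            break
        if not matches(record):
            continue
        matched_order = record.order
        if usable(record):
            result.append(record)
    return result
```

A record that matches but names an unknown protocol still pins the
order: the loop keeps examining records of the same order and stops
before any record with a higher order value.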
786  Acknowledgments:
787  =================

789  The editors would like to thank Keith Moore for all his consultations
790  during the development of this draft. We would also like to thank Paul
791  Vixie for his assistance in debugging our implementation, and his answers
792  to our questions. Finally, we would like to acknowledge our enormous
793  intellectual debt to the participants in the Knoxville series of meetings,
794  as well as to the participants in the URI and URN working groups.

796  References:
797  ===========

799  [1] RFC-1737, "Functional Requirements for Uniform Resource Names", Karen
800      Sollins and Larry Masinter, Dec. 1994.

802  [2] The URN Implementors, Uniform Resource Names: A Progress Report,
803      http://www.dlib.org/dlib/february96/02arms.html, D-Lib Magazine,
804      February 1996.

806  [3] Ryan Moats, "URN Syntax", draft-ietf-urn-syntax-02.txt, Feb. 1997.

808  [4] RFC 2052, "A DNS RR for specifying the location of services (DNS SRV)",
809      A. Gulbrandsen and P. Vixie, October 1996.

811  [5] RFC-xxxx, "A Trivial Convention for using HTTP in URN Resolution",
812      Ron Daniel Jr., currently available as draft-ietf-urn-http-conv-01.txt,
813      Feb. 1997.

815  [6] RFC-xxxx, "URN Resolution Services", ???, draft-ietf-urn-???
816      (This document is on the URN-WG's list of documents to prepare, but
817      has not yet been written. It will get its start from the treatment of
818      resolution services in [4]).

820  [7] Keith Moore, Shirley Browne, Jason Cox, and Jonathan Gettler,
821      Resource Cataloging and Distribution System, Technical Report CS-97-346,
822      University of Tennessee, Knoxville, December 1996.

824  [8] Paul Vixie, personal communication.

826  [9] RFC-822, "Standard for the Format of ARPA Internet Text Messages",
827      Dave H. Crocker, August 1982.

829  [10] Charles Orth, Bill Arms; Handle Resolution Protocol Specification,
830      http://www.handle.net/docs/client_spec.html

832  [11] RFC-1714, "Referral Whois Protocol (RWhois)", S. Williamson and
833      M.
         Kosters, November 1994.

835  [12] Information Retrieval (Z39.50): Application Service Definition and
836      Protocol Specification, ANSI/NISO Z39.50-1995, July 1995.

838  [13] IEEE Standard for Information Technology - Portable Operating System
839      Interface (POSIX) - Part 2: Shell and Utilities (Vol. 1); IEEE Std
840      1003.2-1992; The Institute of Electrical and Electronics Engineers;
841      New York; 1993. ISBN:1-55937-255-9

843  [14] RFC-1123, "Requirements for Internet Hosts - Application and Support"
844      R. Braden, Oct. 1989.

846  [15] RFC-xxxx, "Requirements and a Framework for URN Resolution Systems",
847      Karen Sollins, draft-ietf-urn-req-frame-00.txt, November 1996.

849  Security Considerations
850  =======================

852  The use of "urn.net" as the registry for URN namespaces is subject to
853  denial of service attacks, as well as other DNS spoofing attacks. The
854  interactions with DNSSEC are currently being studied. It is expected
855  that NAPTR records will be signed with SIG records once the DNSSEC
856  work is deployed.

858  The rewrite rules make identifiers from other namespaces subject to
859  the same attacks as normal domain names. Since they have not been
860  easily resolvable before, this may or may not be considered a problem.

862  Regular expressions should be checked for sanity, not blindly passed
863  to something like PERL.

865  This document has discussed a way of locating a resolver, but has not
866  discussed any detail of how the communication with the resolver takes
867  place. There are significant security considerations attached to the
868  communication with a resolver. Those considerations are outside the
869  scope of this document, and must be addressed by the specifications
870  for particular resolver communication protocols.
872  Author Contact Information:
873  ===========================

875  Ron Daniel
876  Los Alamos National Laboratory
877  MS B287
878  Los Alamos, NM, USA, 87545
879  voice: +1 505 665 0597
880  fax: +1 505 665 4939
881  email: rdaniel@lanl.gov

883  Michael Mealling
884  Network Solutions
885  505 Huntmar Park Drive
886  Herndon, VA 22070
887  voice: (703) 742-0400
888  fax: (703) 742-9552
889  email: michaelm@internic.net
890  URL: http://www.netsol.com/

892  This draft expires 26 Sept., 1997.