idnits 2.17.1

draft-klensin-name-munging-03.txt:

Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------

** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now.
-- Found old boilerplate from RFC 3978, Section 5.1.a on line 15.
-- Found old boilerplate from RFC 3978, Section 5.5 on line 415.
-- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 392.
-- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 399.
-- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 405.
** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line 421), which is fine, but *also* found old RFC 2026, Section 10.4C, paragraph 1 text on line 37.
** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement.
** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748.
** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748.
** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced:

   This document is an Internet-Draft and is subject to all provisions of Section 3 of RFC 3667.

   By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------

== There is 1 instance of lines with non-ascii characters in the document.
== No 'Intended status' indicated for this document; assuming Proposed Standard

Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------

No issues found here.

Miscellaneous warnings:
----------------------------------------------------------------------------

== The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.)
-- The document date (July 12, 2004) is 7221 days in the past. Is this intentional?

Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------

(See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs)

== Unused Reference: 'RFC2831' is defined on line 343, but no explicit reference was found in the text
== Unused Reference: 'RFC2617' is defined on line 364, but no explicit reference was found in the text
** Obsolete normative reference: RFC 2831 (Obsoleted by RFC 6331)
-- Obsolete informational reference (is this intentional?): RFC 954 (Obsoleted by RFC 3912)
-- Obsolete informational reference (is this intentional?): RFC 1341 (Obsoleted by RFC 1521)
-- Obsolete informational reference (is this intentional?): RFC 2617 (Obsoleted by RFC 7235, RFC 7615, RFC 7616, RFC 7617)
-- Obsolete informational reference (is this intentional?): RFC 3490 (Obsoleted by RFC 5890, RFC 5891)

Summary: 7 errors (**), 0 flaws (~~), 5 warnings (==), 11 comments (--).

Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

1 Network Working Group J. Klensin
2 Internet-Draft July 12, 2004
3 Expires: January 10, 2005

5 A Name Munging Protocol
6 draft-klensin-name-munging-03.txt

8 Status of this Memo

10 This document is an Internet-Draft and is subject to all provisions
11 of section 3 of RFC 3667. By submitting this Internet-Draft, each
12 author represents that any applicable patent or other IPR claims of
13 which he or she is aware have been or will be disclosed, and any of
14 which he or she becomes aware will be disclosed, in accordance with
15 RFC 3668.

17 Internet-Drafts are working documents of the Internet Engineering
18 Task Force (IETF), its areas, and its working groups. Note that
19 other groups may also distribute working documents as
20 Internet-Drafts.

22 Internet-Drafts are draft documents valid for a maximum of six months
23 and may be updated, replaced, or obsoleted by other documents at any
24 time. It is inappropriate to use Internet-Drafts as reference
25 material or to cite them other than as "work in progress."

27 The list of current Internet-Drafts can be accessed at http://
28 www.ietf.org/ietf/1id-abstracts.txt.

30 The list of Internet-Draft Shadow Directories can be accessed at
31 http://www.ietf.org/shadow.html.

33 This Internet-Draft will expire on January 10, 2005.

35 Copyright Notice

37 Copyright (C) The Internet Society (2004). All Rights Reserved.

39 Abstract

41 As one works on internationalization issues for DNS, email, and other
42 protocols, it becomes clear that the various encodings and
43 transformations required, while not intrinsically difficult, can be
44 an impediment to rapid conversion of applications to international
45 form and to rapid prototyping of new applications. This document
46 proposes a new, lightweight, protocol that can be used to make such
47 conversions, rather than incorporating the needed tables and
48 algorithms into each application.

50 Table of Contents

52 1. Introduction
53 2. The Protocol
54 2.1 Inputs
55 2.2 Element definitions
56 2.3 Initial List of Encodings
57 2.4 Outputs
58 2.5 Reply codes
59 3. Examples
60 4. Signed Messages and Business Arrangements
61 5. Availability
62 6. IANA Considerations
63 7. Security Considerations
64 8. Acknowledgements
65 9. References
66 9.1 Normative References
67 9.2 Informative References
68 Author's Address
69 A. A Security-Enhanced Variation
70 A.1 Input
71 A.2 Output
72 Intellectual Property and Copyright Statements

74 1. Introduction

76 A variety of new and upcoming protocols, most, but not all, of them
77 associated with internationalization, require that data be presented
78 in, or mapped into, encoding forms that are specialized and largely
79 unique to the Internet or those protocols. The trend arguably
80 started with the introduction of quoted-printable into MIME [RFC1341]
81 and has continued to more recent DNS internationalization work
82 [RFC3490] and developing efforts in internationalization of electronic
83 mail [I-D.hoffman-imaa]. These encodings are at least complex enough
84 that testing for interoperability and accuracy is perceived to be
85 needed. Even though they are not, intrinsically, very hard, the
86 process of getting the needed code incorporated and tested may be
87 sufficient to discourage or delay internationalization of some
88 applications, including those that are built around short scripts.

90 This document describes a protocol -- designed for use over either
91 TCP or UDP -- that can be passed short strings for conversion from
92 one encoding to another. There are various samples, testbeds, and
93 web pages today that can do some of these conversions, but they are
94 not general (few of them handle more than one or two conversions),
95 and they are really not compatible with use in applications
96 implementation (regardless of whether they can be used in testing or
97 not). The core code in those samples and tests could presumably be
98 adapted to support this protocol.

100 2. The Protocol

102 The protocol is designed to be as simple as possible, following the
103 general "send packet containing one line, get another line back"
104 model used in finger [RFC1288] and whois [RFC0954]. That model is
105 traditional and well-proven in the Internet, but, by today's
106 standards, sacrifices a high degree of security for performance and
107 should be used with appropriate care. The appendix contains an
108 outline description of a possible variant on this protocol for
109 situations in which it is desired to have, within the protocol
110 itself, some degree of authentication that the intended server was
111 reached and the response received is from it, but, in general some
112 type of authenticated tunnel mechanism will be more satisfactory.
113 See Section 5, Section 4, and Section 7 for additional discussion of
114 these issues. For performance, the protocol is designed to be used
115 over either UDP or TCP, as meets the needs of the application. The
116 TCP variation on the above is, obviously, "open a connection, send a
117 line, remote system sends a line back and closes the connection".
118 The lines are defined as follows:

120 2.1 Inputs

122 The input line consists of
123 o A Version number, "1" for this variation on the protocol.
124 o An ASCII space (i.e., an octet containing hex 20)
125 o A source-indication string
126 o An ASCII space
127 o A target-indication string
128 o An ASCII space
129 o A bit count, expressed as an ASCII numeral
130 o An ASCII space
131 o The source bit string

133 2.2 Element definitions

135 The version number is a positive integer, defined as "1" in this
136 version of the protocol. Implementations of this version of the
137 protocol are required to check the version number and, if it is not
138 "1", to return a string consisting of "550 bad version number" (see
139 below). The indication strings are positive integers, registered
140 with IANA and described in Section 2.3, below.

142 The integers for the version number, indicator strings, and bit count
143 are expressed as decimal numbers using ASCII digits. They, and the
144 single ASCII space character that follows each one, are protocol
145 elements and are not intended to be internationalized.

147 The source string will be a simple string of bits, of length
148 specified by the bit count (with the first bit counted as one).
149 While it will normally be an integral number of octets, some special
150 encodings may not permit this, so any extra bits are ignored. For
151 convenience, the bit count may be specified as an ASCII asterisk
152 ("*", an octet containing hex 2A), in which case the server will
153 examine the string for the first pair of octets containing,
154 respectively, hex 0D and 0A (the usual CRLF convention) and consider
155 it to terminate immediately before those characters.

157 2.3 Initial List of Encodings

159 As discussed below, IANA is expected to set up a registry of encoding
160 codes for use in this protocol. That list is initially:

162 0 Information and debugging option. If 0 appears as the input
163 indicator, the rest of the input line is ignored and the server
164 returns a reply code of "000 " followed by a blank-separated list
165 of the indicator codes it recognizes. If 0 appears as the output
166 indication, the input is copied to the output, also with a reply
167 code of 000, and returned.

169 1 UCS-4
170 2 Unicode (UCS-2)
171 3 IDNA Punycode
172 4 The IMAA encoding scheme described in [I-D.hoffman-imaa]
173 5 UTF-8
174 6 ISO 8859-1
175 7 Unicode written as a blank-separated list of four or more
176 hexadecimal digit codes (written in ASCII), and with each set of
177 codes optionally preceded by "U+" or "u+". The hexadecimal codes
178 "A"..."F" may be written in either upper or lower case.
179 8 Nameprep (stringprep profile only, no punycode)
180 9 SASLprep (stringprep profile only, no punycode)
181 10 iSCSIprep (stringprep profile only, no punycode)

183 There is no requirement that every server support every encoding,
184 although it is expected that every server will support the "0"
185 encoding for test purposes. Issues of how a client locates an
186 appropriate server are outside the scope of this specification (see
187 Section 5).

189 2.4 Outputs

191 The version 1 output consists of
192 o a three-digit (ASCII) reply code (codes listed below)
193 o an ASCII space
194 o a bit count
195 o an ASCII space
196 o a string
197 The bit count, space, and string are as described above, but the "*"
198 convention will not be used.

200 2.5 Reply codes

202 The following reply codes are specified for use in this protocol.
203 If, for some reason (presumably due to a new version of the protocol
204 on the server), the three-digit code returned is not listed below,
205 only the first digit should be examined. A first digit of zero
206 indicates that the string returned contains either the original
207 string or a recoding of it; a first digit of 5 indicates that the
208 recoding failed and the string is either zero-length or contains an
209 explanation in ASCII characters.

211 000 String translated
212 001 String not translated
213 500 Service not available to you
214 501 Input encoding type not recognized
215 502 Output encoding type not recognized
216 503 Bit count exceeds length of line
217 504 No translation available, i.e., the server recognizes the input
218 encoding and the output encoding, but has no mapping between them.
219 505 Translation failed or input string invalid, e.g., the input
220 string was not a possible example of the input encoding specified.
221 506 Input string too long.
222 550 Wrong version number, i.e., version number specified is not
223 understood by this server.
224 6yz Authentication, authorization, or other security problem.
225 Reserved for future use.

227 3. Examples

229 1 6 0 10 teststring
230 000 10 teststring

232 1 6 3 9 Fältström
233 I.e., with the second and eighth characters as a-with-diaeresis
234 (U+00E4) and o-with-diaeresis (U+00F6) respectively.
235 000 12 xn--fltstrm-5wa1o

237 4. Signed Messages and Business Arrangements

239 In today's sometimes-hostile Internet environment, two questions
240 immediately arise about a protocol that is designed to be this
241 simple. One is how one tells that the returned string is the
242 intended one, i.e., that it came from the designated server and that
243 someone is taking responsibility for that server's results. The other
244 is how to get someone to provide this service, especially if it is to
245 be called from production-scale application protocols. Either or
246 both requirements might be satisfied by sending digitally-signed
247 strings. In the input (business model) case, we might imagine a
248 subscription service with registered users, with the digital
249 signature used to authenticate the query as coming from a subscriber
250 and/or authorize billing. In the output case, we might imagine a
251 family of certified servers (using a certification process that lies
252 outside this specification) able to sign the responses with a key the
253 user or application would trust. Both of these issues, and the
254 protocol changes that would be required, should be examined in depth
255 before this protocol is published.

257 At least for the TCP version of the protocol, both of these issues
258 could be dealt with independently of the protocol itself, e.g., by
259 running it over fully-authenticated IPSec or SSL.

261 This specification does not cover identification and location of
262 appropriate servers.

264 5. Availability

266 As suggested elsewhere in this document, it is expected that this
267 protocol will be used primarily within controlled environments, or
268 with servers accessed through tunnels that provide both client and
269 server authentication. Sample PERL source for client and server
270 implementations, contributed by Paul Hoffman, will be deposited with
271 the RFC Editor.

273 6. IANA Considerations

275 IANA has assigned reserved port number 3950 for both the UDP and TCP
276 variations of this protocol.

278 A registry of encoding type indicator strings is also required, with
279 a sequential integer to be assigned to each type of encoding
280 registered and the list in Section 2.3 used to initialize that
281 registry. IANA is requested to accept registrations only with
282 contact information and a reference that defines the encoding
283 involved, but, since there is no shortage of integers, checking and
284 evaluation of such requests is not required except to the degree
285 required to prevent denial of service attacks on IANA itself.

287 The conversions defined and supported are one-to-one mappings only.
288 This protocol, or at least this version of the protocol, does not
289 support any one-many, or otherwise ambiguous, mappings.

291 No IANA registry is required for version numbers: versions other than
292 the one described here will require a revised version of this
293 specification.

295 7. Security Considerations

297 As mentioned in Section 4, there is an attack on this protocol,
298 especially when it is used over UDP, in which a response is sent
299 to the client application that contains an encoding of a different
300 string than the one that was submitted. If that string is used
301 without inspection or review by the client, various bad things might
302 happen. Signed strings, as discussed above, might protect against
303 that problem, but only if keys are properly protected and verified.
304 If assurances are needed that the server is the intended one, it is
305 recommended that the protocol be operated over an appropriately
306 configured tunnel. An extension for SASL negotiation is possible in
307 principle, but would be incompatible with operation of the protocol
308 over UDP and would be likely to defeat the intent of a very high
309 performance protocol design.

311 For those situations in which authentication of the server (and
312 response source) to the client is useful, an alternative version of
313 the protocol is specified with a minimal digest challenge-response
314 mechanism. Since that mechanism depends on a secret shared between
315 the client and server, it is likely to be useful, if at all, in
316 restricted environments such as a small department or group that does
317 not consider whatever group-isolation firewalls or similar mechanisms
318 adequate to protect against server spoofing attacks. For any sort of
319 public use, the mechanism is subject to the well-known problems of a
320 secret known to hundreds of people and is hence likely to be useless.
321 As discussed elsewhere in this document, authenticity and integrity
322 protection when public servers and the public Internet are involved
323 are probably best dealt with by running this protocol within an
324 authenticated and cryptographically protected tunnel or, in
325 principle, by extending the protocol to utilize some sort of public
326 key message-signing mechanism.

328 8. Acknowledgements

330 The author would like to express appreciation to Patrik Faltstrom and
331 Leslie Daigle, who made some suggestions at an early formative stage
332 of this proposal and, in particular, pointed out the desirability of
333 digitally signing the strings. Paul Hoffman made a number of other
334 useful suggestions and contributed the first implementation. Simon
335 Josefsson suggested the addition of type codes for several additional
336 stringprep profiles. And the decision to modify the protocol to add
337 a version number emerged from a discussion with Harald Alvestrand.

339 9. References

341 9.1 Normative References

343 [RFC2831] Leach, P. and C. Newman, "Using Digest Authentication as a
344 SASL Mechanism", RFC 2831, May 2000.

346 9.2 Informative References

348 [I-D.hoffman-imaa]
349 Hoffman, P. and A. Costello, "Internationalizing Mail
350 Addresses in Applications (IMAA)", draft-hoffman-imaa-03
351 (work in progress), October 2003.

353 [RFC0954] Harrenstien, K., Stahl, M. and E. Feinler, "NICNAME/
354 WHOIS", RFC 954, October 1985.

356 [RFC1288] Zimmerman, D., "The Finger User Information Protocol", RFC
357 1288, December 1991.

359 [RFC1341] Borenstein, N. and N. Freed, "MIME (Multipurpose Internet
360 Mail Extensions): Mechanisms for Specifying and Describing
361 the Format of Internet Message Bodies", RFC 1341, June
362 1992.

364 [RFC2617] Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S.,
365 Leach, P., Luotonen, A. and L. Stewart, "HTTP
366 Authentication: Basic and Digest Access Authentication",
367 RFC 2617, June 1999.

369 [RFC3490] Faltstrom, P., Hoffman, P. and A. Costello,
370 "Internationalizing Domain Names in Applications (IDNA)",
371 RFC 3490, March 2003.

373 Author's Address

375 John C Klensin
376 1770 Massachusetts Ave, #322
377 Cambridge, MA 02140
378 USA

380 Phone: +1 617 491 5735
381 EMail: john-ietf@jck.com

383 Intellectual Property Statement

385 The IETF takes no position regarding the validity or scope of any
386 Intellectual Property Rights or other rights that might be claimed to
387 pertain to the implementation or use of the technology described in
388 this document or the extent to which any license under such rights
389 might or might not be available; nor does it represent that it has
390 made any independent effort to identify any such rights. Information
391 on the procedures with respect to rights in RFC documents can be
392 found in BCP 78 and BCP 79.

394 Copies of IPR disclosures made to the IETF Secretariat and any
395 assurances of licenses to be made available, or the result of an
396 attempt made to obtain a general license or permission for the use of
397 such proprietary rights by implementers or users of this
398 specification can be obtained from the IETF on-line IPR repository at
399 http://www.ietf.org/ipr.

401 The IETF invites any interested party to bring to its attention any
402 copyrights, patents or patent applications, or other proprietary
403 rights that may cover technology that may be required to implement
404 this standard. Please address the information to the IETF at
405 ietf-ipr@ietf.org.

407 Disclaimer of Validity

409 This document and the information contained herein are provided on an
410 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
411 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
412 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
413 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
414 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
415 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

417 Copyright Statement

419 Copyright (C) The Internet Society (2004). This document is subject
420 to the rights, licenses and restrictions contained in BCP 78, and
421 except as set forth therein, the authors retain all their rights.

423 Acknowledgment

425 Funding for the RFC Editor function is currently provided by the
426 Internet Society.
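
Illustrative client sketch (not part of the draft)

The version 1 exchange of Sections 2.1 through 2.5 can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the Perl implementation mentioned in Section 5. The names build_request, parse_reply, query_udp, and Reply are hypothetical. The request uses the "*" bit-count convention of Section 2.2, which avoids having to compute a count. The parser tolerates a reply with no count field, as in the "550 bad version number" string quoted in Section 2.2, and strips a trailing CRLF if one is present (an assumption, since the draft does not say how replies end). It does not handle the indicator-list reply to encoding 0. Port 3950 is the one named in Section 6.

```python
import socket
from typing import NamedTuple, Optional

PORT = 3950        # port named in Section 6
VERSION = b"1"     # the only version defined in Section 2.2


class Reply(NamedTuple):
    code: str              # three-digit reply code, e.g. "000"
    count: Optional[int]   # count field as sent, or None if absent
    data: bytes            # returned string or ASCII explanation
    ok: bool               # True when the first digit is "0" (Section 2.5)


def build_request(source: int, target: int, data: bytes) -> bytes:
    """Build a version 1 input line using the "*" bit-count convention."""
    if b"\r\n" in data:
        raise ValueError("CRLF would end the string early when '*' is used")
    fields = [VERSION, str(source).encode("ascii"),
              str(target).encode("ascii"), b"*", data]
    return b" ".join(fields) + b"\r\n"


def parse_reply(line: bytes) -> Reply:
    """Split a reply into code, optional count, and string."""
    if line.endswith(b"\r\n"):   # assumption: a trailing CRLF may be present
        line = line[:-2]
    code, _, rest = line.partition(b" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError("reply does not start with a three-digit code")
    count_field, _, data = rest.partition(b" ")
    if count_field.isdigit():
        count: Optional[int] = int(count_field)
    else:                        # e.g. "550 bad version number"
        count, data = None, rest
    # Section 2.5: only the first digit needs to be examined for success.
    return Reply(code.decode("ascii"), count, data, code.startswith(b"0"))


def query_udp(host: str, source: int, target: int, data: bytes,
              timeout: float = 2.0) -> Reply:
    """Send one request datagram and wait for one reply (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(source, target, data), (host, PORT))
        reply, _addr = sock.recvfrom(65535)
    return parse_reply(reply)
```

For instance, build_request(6, 0, b"teststring") yields the octets "1 6 0 * teststring" followed by CRLF, and parse_reply applied to the first reply in Section 3 gives code "000", count 10, and data "teststring". As Section 7 notes, a UDP reply obtained this way is unauthenticated, so a caller would still need to inspect the result or run the exchange inside a protected tunnel.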