FTPEXT2                                                       J. Klensin
Internet-Draft                                            March 12, 2012
Updates: 959 (if approved)
Intended status: Standards Track
Expires: September 13, 2012

             FTP TYPE Extension for Internationalized Text
                      draft-ietf-ftpext2-typeu-03

Abstract

   The traditional FTP protocol includes a TYPE command to specify the
   data representation. That command has values for ASCII and EBCDIC
   text, plus binary ("IMAGE") transmission. As the Internet becomes
   more international, there is a growing requirement to be able to
   transmit textual data, encoded in Unicode, in a way that is
   independent of the coding and line representation forms of
   particular operating systems. This memo updates RFC 959 by
   specifying a new FTP representation TYPE value for Unicode data.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 13, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction
     1.1. Context and Overview
     1.2. Summary of History of Internationalization of FTP
     1.3. History of the TYPE Command
     1.4. Terminology
     1.5. Discussion List
   2. Specification
     2.1. Existing TYPEs
     2.2. Unicode TYPE
     2.3. Data Structure
     2.4. Feature Negotiation
   3. Net-Unicode Format for FTP
   4. Acknowledgments
   5. IANA Considerations
   6. Security Considerations
   7. References
     7.1. Normative References
     7.2. Informative References
   Appendix A. Change Log
     A.1. New Version and File Name: draft-ietf-ftpext2-typeu-00
     A.2. Version -01
     A.3. Version -02
     A.4. Version -03
   Author's Address

1. Introduction

1.1. Context and Overview

   The traditional FTP protocol, as documented in RFC 959 [RFC0959],
   includes a TYPE command to specify the data representation. That
   command was originally specified as having values for ASCII and
   EBCDIC text, plus binary ("IMAGE") transmission. The Host
   Requirements specification [RFC1123] made other changes to FTP, but
   did not alter the TYPE command or the environment for which it
   provided.

   As the Internet becomes more international, there is a growing
   requirement to be able to transmit textual data, encoded in Unicode
   [Unicode], in a way that is independent of the coding and line
   representation forms of particular operating systems. This memo
   specifies a new FTP TYPE value for Unicode data.

1.2. Summary of History of Internationalization of FTP

   RFC 2640 [RFC2640] is described as providing internationalization of
   FTP, but only addresses the use of FTP in internationalized
   (non-ASCII or extended ASCII [ASCII]) file systems. Its facilities
   were slightly enhanced in a more general extensions specification
   [RFC3659], which builds on a more general FTP extension mechanism
   [RFC2389]. The specification in this document addresses the transfer
   of non-ASCII text files only, building on the TYPE command of the
   original FTP specification [RFC0959].

1.3. History of the TYPE Command

   [[Note in Draft: AppsAWG: please decide whether this subsection
   should be included in the final version as informative or dropped as
   surplus text that doesn't contribute to an implementer understanding
   of what should be done.]]

   When the FTP protocol was first defined in 1971 [RFC0114], hosts on
   the ARPANET were extremely diverse. ASCII and EBCDIC were both in
   active use, as were several completely different character encodings,
   and ASCII was encoded in a variety of different forms inside
   different systems (TENEX/TOPS-20, Multics, Unix on 16- and then
   32-bit architectures, and the original IBM ASCII all used different
   encodings). In mid-1972, the late John McCarthy described some
   aspects of the issues [RFC0373]. Within a relatively short period of
   time, it was understood that expecting every system to adapt to the
   formats of every other system -- a fairly large n-squared problem --
   was crazy. At least for text, the solution was to expect all
   FTP-supporting hosts to convert between their local formats and a
   network-standard ASCII encoding and, optionally, to also identify
   EBCDIC files and permit them to be transferred in canonical form.
   The TYPE command was incorporated into FTP so that the client could
   specify those forms for on-the-wire transfer, and also to provide a
   pair of TYPEs for transferring data in forms that were likely to be
   operating system and hardware specific (see Section 2.1 for more
   details).

   Because of the need to handle these different text character sets and
   encoding forms without that n-squared problem, TYPE was very commonly
   used unless it was known that the sending and receiving systems were
   homogeneous. Several arrangements for single-line FTP commands did
   not make explicit provision for TYPE specifications, but they tended
   to make exactly that homogeneity assumption.

   By the late 1980s, the ARPANET was converging toward a single basic
   host system architecture. Almost all significant computer systems
   used 32-bit architectures or felt an obligation to be able to
   simulate them. EBCDIC had fallen into disuse on the network. ASCII,
   encoded right-justified in eight bits with a leading zero, had become
   pervasive. An Image transfer among diverse systems might well
   encounter differences with line termination or, occasionally, record
   structures rather than stream ones (both of which TYPE A would have
   smoothed out), but the character encodings were almost certain to be
   the same. So, with allowances for those line termination problems --
   which have been a large issue in many cases -- Image ("binary") and
   ASCII transfers were almost equivalent and the TYPE command became
   less used. Some client FTP implementations also adopted an
   "automatic" mode in which they tried to determine heuristically,
   based on either file names or content inspection, whether the
   relevant file consisted of ASCII characters or binary information and
   to send the appropriate TYPE command without user intervention.
   Because there were usually only two choices in practice, they often
   (but not always) got it right.

   However, migration to Unicode has reintroduced many of the old
   issues. When Unicode is used inside a system, it can be used with
   several different encodings (e.g., UTF-8 and several variations on
   UTF-16, possibly with surrogate pairs), different assumptions about
   normalization (see "Terminology Used in Internationalization in the
   IETF" [i18n-terms] for more discussion), and even new variations on
   line termination conventions. When those files are transferred to
   another system with Image type, the result may be completely
   uninterpretable on the target system. This specification extends to
   non-ASCII character transfers the early concept of having a very
   small number of common, canonical network transfer formats for
   characters, with systems able to convert to or from them. By doing
   so, it avoids a Unicode version of the n-squared problems and the
   general confusion that led to the definition of TYPE.

1.4. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   This document assumes that the reader is familiar with the
   terminology of RFC 959. Those terms, especially reply, server-FTP
   process, user-FTP process, server-PI, user-PI, logical byte size, and
   user, if used here, are used in the same way. For the convenience of
   contemporary readers, the terms "client" and "server" are used
   interchangeably with the historic terms "user-FTP process" and
   "server-FTP process". The document also assumes the terminology and
   changes in the updates to FTP specified in RFC 1123 and RFC 2389
   [RFC2389].

1.5. Discussion List

   [[anchor5: RFC Editor: please remove this section before
   publication.]]

   This proposal is being discussed in the IETF FTPEXT2 Working Group.
   Its mailing list is at ftpext@ietf.org.

2. Specification

2.1. Existing TYPEs

   The FTP TYPE command, described in [RFC0959], accepts four possible
   first argument values, as described below. Note that the
   descriptions in this subsection are provided for the reader's
   convenience; the definitions in RFC 959 remain normative.

   A  The data are expected to be in, and are transformed by the server
      if needed to, an ASCII [ASCII] data stream conforming to the "NVT"
      specification (see RFC 959 [RFC0959] and Appendix B of RFC 5198
      [RFC5198] for more information).

   E  The data are expected to be in, and are transformed by the server
      if needed to, an EBCDIC data stream as specified in RFC 959.

   I  The data are transferred in "image" form, i.e., exactly as they
      appear in the server. Because it is the only TYPE form in which
      true binary data can be transferred, TYPE I is often referred to
      as "binary" or "binary transfer".

   L  The data are transmitted in logical bytes of a size specified in
      an additional argument. See RFC 959.

   Any of these four argument variations to TYPE except "TYPE A" (with
   non-print format) MAY be rejected by the server-FTP process with a
   504 response code if it does not support that type and the necessary
   conversions.

2.2. Unicode TYPE

   The client-PI MAY transmit TYPE U to the server-PI as an alternative
   to other TYPE commands and arguments. If it does, the server MAY
   return reply-code 504, indicating that the TYPE U feature is not
   supported (unchanged from RFC 959); otherwise, it MUST respond to any
   data retrieval request (e.g., RETR) by sending the data in a stream
   conformant to the Net-Unicode format specified in Section 3.
   Similarly, if the client-PI sends TYPE U and the server accepts it,
   the client MUST send any data streams in that format while the option
   is in effect. No second parameter is used or permitted for TYPE U.

2.3. Data Structure

   The default and only permitted data structure for TYPE U is "file
   structure". Use of the STRU command SHOULD be avoided. If it is
   used, its argument MUST be "F".

2.4. Feature Negotiation

   RFC 2389 [RFC2389] specifies a feature negotiation mechanism for new
   extensions to FTP. Since the TYPE command is a required part of the
   base FTP specification, the client-PI is not required to issue the
   FEAT command prior to issuing TYPE U. However, it MAY do so, and
   server-FTP implementations that include TYPE U SHOULD support FEAT as
   described below. If the FEAT command is transmitted from the
   client-PI to the server-PI, and this extension and FEAT are
   supported, the response MUST include a TYPE line that lists all TYPE
   values supported by the server (including the required ones). For
   example, if an FTP-server supports all of TYPEs A, E, I, and U, the
   FEAT response line would contain each of the supported arguments
   separated by semicolons:

      TYPE A;E;I;U

   This specification does not change either RFC 959 or RFC 2389. In
   particular, no FEAT response line is required for TYPE unless this,
   or some other, extension to TYPE is supported by the FTP-server.

3. Net-Unicode Format for FTP

   This section specifies a profile of Net-Unicode [RFC5198] for use
   with FTP TYPE U.

   Unicode characters must be transmitted in UTF-8 [RFC3629] as
   specified for Net-Unicode. Because FTP is used in data transmission,
   the characters and sequences that are discouraged in Section 2 of RFC
   5198 are permitted to be transported by FTP. However, line-ending
   sequences MUST conform to the CRLF convention specified there.
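
   As a rough illustration of the FEAT line from Section 2.4 and of the
   UTF-8 and CRLF rules above, the following Python sketch may help. It
   is not part of the specification. The function names feat_types and
   to_wire are hypothetical, the sample FEAT reply is made up for the
   example, and folding a bare CR into a line ending is an assumption of
   this sketch rather than something this document requires.

```python
def feat_types(feat_reply: str) -> set:
    """Return the TYPE values listed in a FEAT reply (hypothetical helper).

    Looks for a feature line of the form "TYPE A;E;I;U" as described in
    Section 2.4 and returns an empty set when no TYPE line is present.
    """
    for line in feat_reply.splitlines():
        name, _, params = line.strip().partition(" ")
        if name.upper() == "TYPE":
            return {p.strip().upper() for p in params.split(";") if p.strip()}
    return set()


def to_wire(text: str) -> bytes:
    """Encode local text as UTF-8 with CRLF line endings (hypothetical helper).

    Assumption of this sketch: CRLF, bare LF, and bare CR in the local
    text are all treated as line endings.
    """
    # Fold every local line ending to LF first so that existing CRLF
    # pairs are not doubled when LF is expanded to CRLF.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n").encode("utf-8")


# Made-up FEAT reply in the multi-line layout of RFC 2389.
sample_reply = (
    "211-Extensions supported:\r\n"
    " TYPE A;E;I;U\r\n"
    " MDTM\r\n"
    "211 End\r\n"
)

if "U" in feat_types(sample_reply):
    payload = to_wire("h\u00e9llo\nworld\r\n")
```

   A client could consult feat_types before deciding whether to send
   TYPE U, although Section 2.4 makes issuing FEAT first optional, and a
   504 reply to TYPE U would still need to be handled.
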
   Consistent with Paragraph 4 of Section 2 of RFC 5198, strings SHOULD
   be normalized before transmission if at all possible.

   The implicit logical byte size for this transmission type is eight
   bits.

4. Acknowledgments

   This document draws heavily on RFC 959; appreciation is expressed to
   its authors and to the authors of RFC 2389. The work of Mark P.
   Peterson and Douglas J. Papenthien on other FTP extensions finally
   motivated production of this document in 2008 after a long delay;
   that contribution is appreciated as well. Specific useful comments
   on this draft or its immediate predecessors were provided by the late
   and much-lamented Mike Padlipsky and by Mykyta Yevstifeyev.

5. IANA Considerations

   When this specification is approved, IANA is requested to add an
   additional table to the FTP Extensions Registry established by RFC
   5797 [RFC5797]. That table should be titled "TYPE command arguments"
   and should include "A (m) RFC 959", "E (o) RFC 959", "I (o) RFC 959",
   "L (o) RFC 959", and "U (o) RFCNNNN".

6. Security Considerations

   This specification makes no substantive change to the FTP command
   stream (only the argument to the standard TYPE command is changed).
   It only alters the presentation of data in the data stream.
   Consequently, it should have no negative security implications that
   are not already present in the earlier FTP specifications described
   in Section 1 and in the Net-Unicode specification [RFC5198]. By
   specifying an exact canonical form for the identification and
   transfer of Unicode strings, it may eliminate some problems that
   might be encountered when such strings are transmitted without
   identification or without restrictions (e.g., using TYPE I to obtain
   a "binary" transfer).

7. References

7.1. Normative References

   [ASCII]    American National Standards Institute (formerly United
              States of America Standards Institute), "USA Code for
              Information Interchange", ANSI X3.4-1968, 1968.

              ANSI X3.4-1968 has been replaced by newer versions with
              slight modifications, but the 1968 version remains
              definitive for the Internet.

   [RFC0959]  Postel, J. and J. Reynolds, "File Transfer Protocol",
              STD 9, RFC 959, October 1985.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2389]  Hethmon, P. and R. Elz, "Feature negotiation mechanism for
              the File Transfer Protocol", RFC 2389, August 1998.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [RFC5198]  Klensin, J. and M. Padlipsky, "Unicode Format for Network
              Interchange", RFC 5198, March 2008.

   [Unicode]  The Unicode Consortium, "The Unicode Standard, Version
              6.0.0", (Mountain View, CA: The Unicode Consortium,
              2011. ISBN 978-1-936213-01-6).

7.2. Informative References

   [RFC0114]  Bhushan, A., "File Transfer Protocol", RFC 114,
              April 1971.

   [RFC0373]  McCarthy, J., "Arbitrary Character Sets", RFC 373,
              July 1972.

   [RFC1123]  Braden, R., "Requirements for Internet Hosts - Application
              and Support", STD 3, RFC 1123, October 1989.

   [RFC2640]  Curtin, B., "Internationalization of the File Transfer
              Protocol", RFC 2640, July 1999.

   [RFC3659]  Hethmon, P., "Extensions to FTP", RFC 3659, March 2007.

   [RFC5797]  Klensin, J. and A. Hoenes, "FTP Command and Extension
              Registry", RFC 5797, March 2010.

   [i18n-terms]
              Hoffman, P. and J. Klensin, "Terminology Used in
              Internationalization in the IETF", June 2011.

Appendix A. Change Log

   [[anchor13: RFC Editor: Please remove this section]]

A.1. New Version and File Name: draft-ietf-ftpext2-typeu-00

   This version of the document is a slight update to
   draft-klensin-ftp-typeu-00 (posted in July 2008). It includes some
   updated references to work completed in the interim, information
   about the FTPEXT2 WG, a new Security Considerations section (omitted
   from the prior draft), and a few other minor corrections.

A.2. Version -01

   o  Corrected a typographical error in the -00 change log entry and
      made a cosmetic change to that section.

   o  Added additional metadata.

   o  Added a new introductory subsection (Section 1.3) to clarify the
      relationship of this spec to FTP's development and some other
      ongoing discussions in the IETF.

A.3. Version -02

   o  Changed title per suggestion from Mykyta Yevstifeyev.

   o  Removed reference to ABNF since it turned out to be possible to
      write the document without it.

   o  Rewrote the IANA Considerations to specify a table for TYPE
      argument values.

   o  Made a number of other relatively minor corrections and
      clarifications.

   o  Updated Unicode reference to 6.0.

   o  Moved this section to an appendix for easier handling later.

A.4. Version -03

   o  Draft reissued to reactivate it.

   o  Many small editorial changes and clarifications with no
      substantive change to the specification itself.

Author's Address

   John C Klensin
   1770 Massachusetts Ave, Ste 322
   Cambridge, MA 02140
   USA

   Phone: +1 617 245 1457
   Email: john+ietf@jck.com