idnits 2.17.1 

draft-newman-url-imap-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-20) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 18 instances of too long lines in the document, the longest
     one being 7 characters in excess of 72.

  ** The abstract seems to contain references ([IMAP4]), which it shouldn't. 
     Please replace those with straight textual mentions of the documents in
     question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 1997) is 9837 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '256' on line 639

  -- Looks like a reference, but probably isn't: '6' on line 547

  ** Obsolete normative reference: RFC 1738 (ref. 'BASIC-URL') (Obsoleted by
     RFC 4248, RFC 4266)

  ** Obsolete normative reference: RFC 2060 (ref. 'IMAP4') (Obsoleted by RFC
     3501)

  ** Obsolete normative reference: RFC 2068 (ref. 'HTTP') (Obsoleted by RFC
     2616)

  ** Obsolete normative reference: RFC  822 (ref. 'IMAIL') (Obsoleted by RFC
     2822)

  ** Obsolete normative reference: RFC 1808 (ref. 'REL-URL') (Obsoleted by
     RFC 3986)

  ** Obsolete normative reference: RFC 2044 (ref. 'UTF8') (Obsoleted by RFC
     2279)


     Summary: 17 errors (**), 0 flaws (~~), 2 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                          C. Newman
3	Internet Draft: IMAP URL Scheme                                 Innosoft
4	Document: draft-newman-url-imap-09.txt                          May 1997
5	                                                   Expires in six months

7	                            IMAP URL Scheme

9	Status of this memo

11	     This document is an Internet Draft.  Internet Drafts are working
12	     documents of the Internet Engineering Task Force (IETF), its Areas,
13	     and its Working Groups.  Note that other groups may also distribute
14	     working documents as Internet Drafts.

16	     Internet Drafts are draft documents valid for a maximum of six
17	     months.  Internet Drafts may be updated, replaced, or obsoleted by
18	     other documents at any time.  It is not appropriate to use Internet
19	     Drafts as reference material or to cite them other than as a
20	     ``working draft'' or ``work in progress''.

22	     To learn the current status of any Internet-Draft, please check the
23	     1id-abstracts.txt listing contained in the Internet-Drafts Shadow
24	     Directories on ds.internic.net, nic.nordu.net, ftp.isi.edu, or
25	     munnari.oz.au.

27	     A revised version of this draft document will be submitted to the
28	     RFC editor as a Proposed Standard for the Internet Community.
29	     Discussion and suggestions for improvement are requested.  This
30	     document will expire six months after publication.  Distribution of
31	     this draft is unlimited.

33	Abstract

35	     IMAP [IMAP4] is a rich protocol for accessing remote message
36	     stores.  It provides an ideal mechanism for accessing public
37	     mailing list archives as well as private and shared message stores.
38	     This document defines a URL scheme for referencing objects on an
39	     IMAP server.

41	1. Conventions used in this document

43	     The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
44	     in this document are to be interpreted as defined in "Key words for
45	     use in RFCs to Indicate Requirement Levels" [KEYWORDS].

47	2. IMAP scheme

49	     The IMAP URL scheme is used to designate IMAP servers, mailboxes,
50	     messages, MIME bodies [MIME], and search programs on Internet hosts
51	     accessible using the IMAP protocol.

53	     The IMAP URL follows the common Internet scheme syntax as defined
54	     in RFC 1738 [BASIC-URL] except that clear text passwords are not
55	     permitted.  If :<port> is omitted, the port defaults to 143.
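The port default can be sketched in C as follows.  This is only an
illustration: the helper name imap_url_port and the constant
IMAP_DEFAULT_PORT are hypothetical, the scheme is matched
case-sensitively for brevity, and no validation of the host or port
text is attempted.

```c
#include <stdlib.h>
#include <string.h>

#define IMAP_DEFAULT_PORT 143

/* Hypothetical helper, not part of the draft.
 * Return the port named in an IMAP URL, IMAP_DEFAULT_PORT when
 * ":<port>" is omitted, or -1 if the URL does not begin with "imap://".
 */
int imap_url_port(const char *url)
{
    const char *p, *end, *at, *colon;

    if (strncmp(url, "imap://", 7) != 0) return -1;
    p = url + 7;
    /* the server part ends at the first "/" or at end of string */
    end = strchr(p, '/');
    if (end == NULL) end = p + strlen(p);
    /* skip the optional user name and mechanism before "@" */
    at = memchr(p, '@', (size_t) (end - p));
    if (at != NULL) p = at + 1;
    /* look for ":<port>" after the host name */
    colon = memchr(p, ':', (size_t) (end - p));
    if (colon == NULL || colon + 1 == end) return IMAP_DEFAULT_PORT;
    return atoi(colon + 1);
}
```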

57	     An IMAP URL takes one of the following forms:

59	         imap://<iserver>/
60	         imap://<iserver>/<enc_list_mailbox>;TYPE=<list_type>
61	         imap://<iserver>/<enc_mailbox>[uidvalidity][?<enc_search>]
62	         imap://<iserver>/<enc_mailbox>[uidvalidity]<iuid>[isection]

64	     The first form is used to refer to an IMAP server, the second form
65	     refers to a list of mailboxes, the third form refers to the
66	     contents of a mailbox or a set of messages resulting from a search,
67	     and the final form refers to a specific message or message part.
68	     Note that the syntax here is informal.  The authoritative formal
69	     syntax for IMAP URLs is defined in section 12.

71	3. IMAP User Name and Authentication Mechanism

73	     A user name and/or authentication mechanism may be supplied.  They
74	     are used in the "LOGIN" or "AUTHENTICATE" commands after making the
75	     connection to the IMAP server.  If no user name or authentication
76	     mechanism is supplied, the user name "anonymous" is used with the
77	     "LOGIN" command and the password is supplied as the Internet e-mail
78	     address of the end user accessing the resource.  If the URL doesn't
79	     supply a user name, the program interpreting the IMAP URL SHOULD
80	     request one from the user if necessary.

82	     An authentication mechanism can be expressed by adding
83	     ";AUTH=<enc_auth_type>" to the end of the user name.  When such an
84	     <enc_auth_type> is indicated, the client SHOULD request appropriate
85	     credentials from that mechanism and use the "AUTHENTICATE" command
86	     instead of the "LOGIN" command.  If no user name is specified, one
87	     SHOULD be obtained from the mechanism or requested from the user as
88	     appropriate.

90	     The string ";AUTH=*" indicates that the client SHOULD select an
91	     appropriate authentication mechanism.  It MAY use any mechanism
92	     listed in the CAPABILITY command or use an out of band security
93	     service resulting in a PREAUTH connection.  If no user name is
94	     specified and no appropriate authentication mechanisms are
95	     available, the client SHOULD fall back to anonymous login as
96	     described above.  This allows a URL which grants read-write access
97	     to authorized users, and read-only anonymous access to other users.

99	     If a user name is included with no authentication mechanism, then
100	     ";AUTH=*" is assumed.
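The defaulting rules above might be applied as in the following C
sketch.  The function names are hypothetical, the input is assumed to
be the still-encoded text that precedes "@" in the URL, and an empty
mechanism on output is used here to signal the anonymous "LOGIN" case.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Case-insensitive test: does s begin with prefix? */
static int has_prefix_ci(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char) *s) !=
            tolower((unsigned char) *prefix)) return 0;
        ++s;
        ++prefix;
    }
    return 1;
}

/* Hypothetical helper, not part of the draft.
 * Split "<enc_user>;AUTH=<mechanism>" into its two parts.
 *  user is "" when no user name is given
 *  auth is "*" when a user name is given without a mechanism
 *  auth is "" when neither is given (anonymous LOGIN)
 */
void split_userauth(const char *userauth, char *user, size_t usersz,
                    char *auth, size_t authsz)
{
    const char *semi = strchr(userauth, ';');
    size_t ulen = (semi != NULL) ? (size_t) (semi - userauth)
                                 : strlen(userauth);

    snprintf(user, usersz, "%.*s", (int) ulen, userauth);
    if (semi != NULL && has_prefix_ci(semi, ";AUTH=")) {
        /* copy the mechanism that follows ";AUTH=" */
        snprintf(auth, authsz, "%s", semi + 6);
    } else {
        snprintf(auth, authsz, "%s", ulen > 0 ? "*" : "");
    }
}
```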

102	     Since URLs can easily come from untrusted sources, care must be
103	     taken when resolving a URL which requires or requests any sort of
104	     authentication.  If authentication credentials are supplied to the
105	     wrong server, it may compromise the security of the user's account.
106	     The program resolving the URL should make sure it meets at least
107	     one of the following criteria in this case:

109	     (1) The URL comes from a trusted source, such as a referral server
110	     which the client has validated and trusts according to site policy.
111	     Note that user entry of the URL may or may not count as a trusted
112	     source, depending on the experience level of the user and site
113	     policy.
114	     (2) Explicit local site policy permits the client to connect to the
115	     server in the URL.  For example, if the client knows the site
116	     domain name, site policy may dictate that any hostname ending in
117	     that domain is trusted.
118	     (3) The user confirms that connecting to that domain name with the
119	     specified credentials and/or mechanism is permitted.
120	     (4) A mechanism is used which validates the server before passing
121	     potentially compromising client credentials.
122	     (5) An authentication mechanism is used which will not reveal
123	     information to the server which could be used to compromise future
124	     connections.
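As an illustration of criterion (2), a client that knows its site
domain could test the host name as sketched below.  The function name
is hypothetical and the policy itself (trusting every host under the
site domain) is an assumption; a real deployment would follow its own
site policy.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not part of the draft.
 * Return 1 if host equals domain or ends in "." followed by domain,
 * comparing case-insensitively; return 0 otherwise.
 */
int host_in_site_domain(const char *host, const char *domain)
{
    size_t hlen = strlen(host), dlen = strlen(domain), i;

    if (dlen == 0 || hlen < dlen) return 0;
    /* require a label boundary so "evilminbari.org" does not match */
    if (hlen > dlen && host[hlen - dlen - 1] != '.') return 0;
    for (i = 0; i < dlen; ++i) {
        if (tolower((unsigned char) host[hlen - dlen + i]) !=
            tolower((unsigned char) domain[i])) return 0;
    }
    return 1;
}
```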

126	     URLs which do not include a user name must be treated with extra
127	     care, since they are more likely to compromise the user's primary
128	     account.  A URL containing ";AUTH=*" must also be treated with
129	     extra care since it might fall back on a weaker security mechanism.
130	     Finally, clients are discouraged from using a plain text password
131	     as a fallback with ";AUTH=*" unless the connection has strong
132	     encryption (e.g. a key length of greater than 56 bits).

134	     Note that if unsafe or reserved characters such as " " or ";" are
135	     present in the user name or authentication mechanism, they MUST be
136	     encoded as described in RFC 1738 [BASIC-URL].

138	4. IMAP server

140	     An IMAP URL referring to an IMAP server has the following form:

142	         imap://<iserver>/

144	     A program interpreting this URL would issue the standard set of
145	     commands it uses to present a view of the contents of an IMAP
146	     server.  This is likely to be semantically equivalent to one of the
147	     following URLs:

149	         imap://<iserver>/;TYPE=LIST
150	         imap://<iserver>/;TYPE=LSUB

152	     The program interpreting this URL SHOULD use the LSUB form if it
153	     supports mailbox subscriptions.

155	5. Lists of mailboxes

157	     An IMAP URL referring to a list of mailboxes has the following
158	     form:

160	         imap://<iserver>/<enc_list_mailbox>;TYPE=<list_type>

162	     The <list_type> may be either "LIST" or "LSUB", and is case
163	     insensitive.  The field ";TYPE=<list_type>" MUST be included.

165	     The <enc_list_mailbox> is any argument suitable for the
166	     list_mailbox field of the IMAP [IMAP4] LIST or LSUB commands.  The
167	     field <enc_list_mailbox> may be omitted, in which case the program
168	     interpreting the IMAP URL may use "*" or "%" as the
169	     <enc_list_mailbox>.  The program SHOULD use "%" if it supports a
170	     hierarchical view, otherwise it SHOULD use "*".
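The choice of default pattern could look like the C sketch below.  The
function name, the tag argument, and the two flags are hypothetical;
the pattern is assumed to be already decoded and to need no IMAP
quoting.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the draft.
 * Format a LIST or LSUB command with an empty reference argument.
 * When pattern is NULL or empty, "%" is used for clients with a
 * hierarchical view and "*" otherwise.
 * Returns the snprintf() result.
 */
int build_list_command(char *buf, size_t bufsz, const char *tag,
                       int use_lsub, const char *pattern,
                       int hierarchical)
{
    if (pattern == NULL || *pattern == '\0') {
        pattern = hierarchical ? "%" : "*";
    }
    return snprintf(buf, bufsz, "%s %s \"\" %s", tag,
                    use_lsub ? "LSUB" : "LIST", pattern);
}
```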

172	     Note that if unsafe or reserved characters such as " " or "%" are
173	     present in <enc_list_mailbox> they MUST be encoded as described in
174	     RFC 1738 [BASIC-URL].  If the character "/" is present in
175	     enc_list_mailbox, it SHOULD NOT be encoded.

177	6. Lists of messages

179	     An IMAP URL referring to a list of messages has the following form:

181	         imap://<iserver>/<enc_mailbox>[uidvalidity][?<enc_search>]

183	     The <enc_mailbox> field is used as the argument to the IMAP4
184	     "SELECT" command.  Note that if unsafe or reserved characters such
185	     as " ", ";", or "?" are present in <enc_mailbox> they MUST be
186	     encoded as described in RFC 1738 [BASIC-URL].  If the character "/"
187	     is present in enc_mailbox, it SHOULD NOT be encoded.

189	     The [uidvalidity] field is optional.  If it is present, it MUST be
190	     the argument to the IMAP4 UIDVALIDITY status response at the time
191	     the URL was created.  This SHOULD be used by the program
192	     interpreting the IMAP URL to determine if the URL is stale.
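A staleness test along these lines might be written as follows.  The
function name is hypothetical, and the value 0 is used here, as an
assumption of this sketch, to mean that the URL carried no
[uidvalidity] field (0 is not a valid nz_number).

```c
/* Hypothetical helper, not part of the draft.
 * Return 1 if the URL should be treated as stale, 0 otherwise.
 *  url_uidvalidity     value from the URL, or 0 if the field is absent
 *  server_uidvalidity  value from the server's UIDVALIDITY response
 */
int imap_url_is_stale(unsigned long url_uidvalidity,
                      unsigned long server_uidvalidity)
{
    /* without a value in the URL there is nothing to compare */
    if (url_uidvalidity == 0) return 0;
    return url_uidvalidity != server_uidvalidity;
}
```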

194	     The [?<enc_search>] field is optional.  If it is not present, the
195	     contents of the mailbox SHOULD be presented by the program
196	     interpreting the URL.  If it is present, it SHOULD be used as the
197	     arguments following an IMAP4 SEARCH command with unsafe characters
198	     such as " " (which are likely to be present in the <enc_search>)
199	     encoded as described in RFC 1738 [BASIC-URL].
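Before the search program can be sent to the server, the hex escapes
have to be removed.  A minimal decoder, assuming the destination
buffer is at least as large as the source string, might look like
this; the function names are hypothetical.

```c
#include <ctype.h>

/* Return the value of a hexadecimal digit, or -1 if c is not one */
static int hexval(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper, not part of the draft.
 * Decode "%XX" escapes from src into dst.
 *  dst needs room for strlen(src) + 1 octets
 * Returns 0 on success, or -1 on a malformed escape, in which case
 * dst is left unterminated.
 */
int url_decode(char *dst, const char *src)
{
    int hi, lo;

    while (*src != '\0') {
        if (*src == '%') {
            /* both digits must be present and valid */
            hi = hexval((unsigned char) src[1]);
            lo = (hi >= 0) ? hexval((unsigned char) src[2]) : -1;
            if (lo < 0) return -1;
            *dst++ = (char) ((hi << 4) | lo);
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return 0;
}
```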

201	7. A specific message or message part

203	     An IMAP URL referring to a specific message or message part has the
204	     following form:

206	         imap://<iserver>/<enc_mailbox>[uidvalidity]<iuid>[isection]

208	     The <enc_mailbox> and [uidvalidity] are as defined above.

210	     If [uidvalidity] is present in this form, it SHOULD be used by the
211	     program interpreting the URL to determine if the URL is stale.

213	     The <iuid> refers to an IMAP4 message UID, and SHOULD be used as
214	     the <set> argument to the IMAP4 "UID FETCH" command.

216	     The [isection] field is optional.  If not present, the URL refers
217	     to the entire Internet message as returned by the IMAP command "UID
218	     FETCH <uid> BODY.PEEK[]".  If present, the URL refers to the object
219	     returned by a "UID FETCH <uid> BODY.PEEK[<section>]" command.  The
220	     type of the object may be determined with a "UID FETCH <uid>
221	     BODYSTRUCTURE" command and locating the appropriate part in the
222	     resulting BODYSTRUCTURE.  Note that unsafe characters in [isection]
223	     MUST be encoded as described in [BASIC-URL].
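The mapping from <iuid> and [isection] to a command can be sketched as
below.  The function name and the tag argument are hypothetical, and
the section text is assumed to be already decoded.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the draft.
 * Format the UID FETCH command for a message or message part.
 *  section may be NULL when the URL has no [isection]
 * Returns the snprintf() result.
 */
int build_uid_fetch(char *buf, size_t bufsz, const char *tag,
                    unsigned long uid, const char *section)
{
    return snprintf(buf, bufsz, "%s UID FETCH %lu BODY.PEEK[%s]",
                    tag, uid, section != NULL ? section : "");
}
```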

225	8. Relative IMAP URLs

227	     Relative IMAP URLs are permitted and are resolved according to the
228	     rules defined in RFC 1808 [REL-URL] with one exception.  In IMAP
229	     URLs, parameters are treated as part of the normal path with
230	     respect to relative URL resolution.  This is believed to be the
231	     behavior of the installed base and is likely to be documented in a
232	     future revision of the relative URL specification.

234	     The following observations are also important:

236	     The <iauth> grammar element is considered part of the user name for
237	     purposes of resolving relative IMAP URLs.  This means that unless a
238	     new login/server specification is included in the relative URL, the
239	     authentication mechanism is inherited from a base IMAP URL.

241	     URLs always use "/" as the hierarchy delimiter for the purpose of
242	     resolving paths in relative URLs.  IMAP4 permits the use of any
243	     hierarchy delimiter in mailbox names.  For this reason, relative
244	     mailbox paths will only work if the mailbox uses "/" as the
245	     hierarchy delimiter.  Relative URLs may be used on mailboxes which
246	     use other delimiters, but in that case, the entire mailbox name
247	     MUST be specified in the relative URL or inherited as a whole from
248	     the base URL.

250	     The base URL for a list of mailboxes or messages which was referred
251	     to by an IMAP URL is always the referring IMAP URL itself.  The
252	     base URL for a message or message part which was referred to by an
253	     IMAP URL may be more complicated to determine.  The program
254	     interpreting the relative URL will have to check the headers of the
255	     MIME entity and any enclosing MIME entities in order to locate the
256	     "Content-Base" and "Content-Location" headers.  These headers are
257	     used to determine the base URL as defined in [HTTP].  For example,
258	     if the referring IMAP URL contains a "/;SECTION=1.2" parameter,
259	     then the MIME headers for section 1.2, for section 1, and for the
260	     enclosing message itself SHOULD be checked in that order for
261	     "Content-Base" or "Content-Location" headers.
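The lookup order in that example can be generated mechanically.  The
following C sketch builds the list of FETCH items from innermost part
to enclosing message; the function name is hypothetical and the
section is assumed to be a purely numeric part path such as "1.2".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the draft.
 * Write the FETCH items to check for "Content-Base" and
 * "Content-Location" headers, innermost part first.
 * Returns 0 on success or -1 if a buffer is too small.
 */
int build_base_lookup(char *buf, size_t bufsz, const char *section)
{
    char part[64], *dot;
    size_t used = 0;
    int n;

    if (strlen(section) >= sizeof (part)) return -1;
    strcpy(part, section);
    /* one MIME header request per enclosing body part */
    while (part[0] != '\0') {
        n = snprintf(buf + used, bufsz - used,
                     "BODY.PEEK[%s.MIME] ", part);
        if (n < 0 || (size_t) n >= bufsz - used) return -1;
        used += (size_t) n;
        /* drop the last component to move to the parent part */
        dot = strrchr(part, '.');
        if (dot != NULL) *dot = '\0'; else part[0] = '\0';
    }
    /* finally the headers of the enclosing message itself */
    n = snprintf(buf + used, bufsz - used, "BODY.PEEK[HEADER.FIELDS "
                 "(Content-Base Content-Location)]");
    if (n < 0 || (size_t) n >= bufsz - used) return -1;
    return 0;
}
```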

263	9. Multinational Considerations

265	     IMAP4 [IMAP4] section 5.1.3 includes a convention for encoding
266	     non-US-ASCII characters in IMAP mailbox names.  Because this
267	     convention is private to IMAP, it is necessary to convert IMAP's
268	     encoding to one that can be more easily interpreted by a URL
269	     display program.  For this reason, IMAP's modified UTF-7 encoding
270	     for mailboxes MUST be converted to UTF-8 [UTF8].  Since 8-bit
271	     characters are not permitted in URLs, the UTF-8 characters are
272	     encoded as required by the URL specification [BASIC-URL].  Sample
273	     code is included in Appendix A to demonstrate this conversion.

275	10. Examples

277	     The following examples demonstrate how an IMAP4 client program
278	     might translate various IMAP4 URLs into a series of IMAP4 commands.
279	     Commands sent from the client to the server are prefixed with "C:",
280	     and responses sent from the server to the client are prefixed with
281	     "S:".

283	     The URL:

285	      <imap://minbari.org/gray-council;UIDVALIDITY=385759045/;UID=20>

287	     Results in the following client commands:

289	         <connect to minbari.org, port 143>
290	         C: A001 LOGIN ANONYMOUS sheridan@babylon5.org
291	         C: A002 SELECT gray-council
292	         <client verifies the UIDVALIDITY matches>
293	         C: A003 UID FETCH 20 BODY.PEEK[]

295	     The URL:

297	      <imap://michael@minbari.org/users.*;type=list>

299	     Results in the following client commands:

301	         <client requests password from user>
302	         <connect to minbari.org imap server, activate strong encryption>
303	         C: A001 LOGIN MICHAEL zipper
304	         C: A002 LIST "" users.*

306	     The URL:

308	      <imap://psy.earth/~peter/%E6%97%A5%E6%9C%AC%E8%AA%9E/%E5%8F%B0%E5%8C%97>

310	     Results in the following client commands:

312	         <connect to psy.earth, port 143>
313	         C: A001 LOGIN ANONYMOUS bester@psycop.psy.earth
314	         C: A002 SELECT ~peter/&ZeVnLIqe-/&U,BTFw-
315	         <commands the client uses for viewing the contents of a mailbox>

317	     The URL:

319	      <imap://;AUTH=KERBEROS_V4@minbari.org/gray-council/;uid=20/;section=1.2>

321	     Results in the following client commands:

323	         <connect to minbari.org, port 143>
324	         C: A001 AUTHENTICATE KERBEROS_V4
325	         <authentication exchange>
326	         C: A002 SELECT gray-council
327	         C: A003 UID FETCH 20 BODY.PEEK[1.2]

329	     If the following relative URL is located in that body part:

331	      <;section=1.4>

333	     This could result in the following client commands:

335	         C: A004 UID FETCH 20 (BODY.PEEK[1.2.MIME]
336	                 BODY.PEEK[1.MIME]
337	                 BODY.PEEK[HEADER.FIELDS (Content-Base Content-Location)])
338	         <Client looks for Content-Base or Content-Location headers in
339	          result.  If no such headers, then it does the following>
340	         C: A005 UID FETCH 20 BODY.PEEK[1.4]

342	     The URL:

344	      <imap://;AUTH=*@minbari.org/gray%20council?SUBJECT%20shadows>

346	     Could result in the following:

348	         <connect to minbari.org, port 143>
349	         C: A001 CAPABILITY
350	         S: * CAPABILITY IMAP4rev1 AUTH=GSSAPI
351	         S: A001 OK
352	         C: A002 AUTHENTICATE GSSAPI
353	         <authentication exchange>
354	         S: A002 OK user lennier authenticated
355	         C: A003 SELECT "gray council"
356	         ...
357	         C: A004 SEARCH SUBJECT shadows
358	         S: * SEARCH 8 10 13 14 15 16
359	         S: A004 OK SEARCH completed
360	         C: A005 FETCH 8,10,13:16 ALL
361	         ...

363	     NOTE: In this final example, the client has implementation dependent
364	     choices.  The authentication mechanism could be anything, including
365	     PREAUTH.  And the final FETCH command could fetch more or less
366	     information about the messages, depending on what it wishes to display
367	     to the user.

369	11. Security Considerations

371	     Security considerations discussed in the IMAP specification [IMAP4]
372	     and the URL specification [BASIC-URL] are relevant.  Security
373	     considerations related to authenticated URLs are discussed in
374	     section 3 of this document.

376	     Many email clients store the plain text password for later use
377	     after logging into an IMAP server.  Such clients MUST NOT use a
378	     stored password in response to an IMAP URL without explicit
379	     permission from the user to supply that password to the specified
380	     host name.

382	12. ABNF for IMAP URL scheme

384	     This uses ABNF as defined in RFC 822 [IMAIL].  Terminals from the
385	     BNF for IMAP [IMAP4] and URLs [BASIC-URL] are also used.  Strings
386	     are not case sensitive and free insertion of linear-white-space is
387	     not permitted.

389	     achar            = uchar / "&" / "=" / "~"
390	                             ; see [BASIC-URL] for "uchar" definition

392	     bchar            = achar / ":" / "@" / "/"

394	     enc_auth_type    = 1*achar
395	                             ; encoded version of [IMAP-AUTH] "auth_type"

397	     enc_list_mailbox = 1*bchar
398	                             ; encoded version of [IMAP4] "list_mailbox"

400	     enc_mailbox      = 1*bchar
401	                             ; encoded version of [IMAP4] "mailbox"

403	     enc_search       = 1*bchar
404	                             ; encoded version of search_program below

406	     enc_section      = 1*bchar
407	                             ; encoded version of section below

409	     enc_user         = 1*achar
410	                             ; encoded version of [IMAP4] "userid"

412	     imapurl          = "imap://" iserver "/" [ icommand ]

414	     iauth            = ";AUTH=" ( "*" / enc_auth_type )

416	     icommand         = imailboxlist / ipath / isearch

418	     imailboxlist     = [enc_list_mailbox] ";TYPE=" list_type

420	     ipath            = enc_mailbox [uidvalidity] iuid [isection]

422	     isearch          = enc_mailbox [uidvalidity] [ "?" enc_search ]

424	     isection         = "/;SECTION=" enc_section

426	     iserver          = [iuserauth "@"] hostport
427	                             ; See [BASIC-URL] for "hostport" definition

429	     iuid             = "/;UID=" nz_number
430	                             ; See [IMAP4] for "nz_number" definition

432	     iuserauth        = enc_user [iauth] / [enc_user] iauth

434	     list_type        = "LIST" / "LSUB"

436	     search_program   = ["CHARSET" SPACE astring SPACE] 1#search_key
437	                             ; IMAP4 literals may not be used
438	                             ; See [IMAP4] for "astring" and "search_key"

440	     section          = section_text / (nz_number *["." nz_number]
441	                         ["." (section_text / "MIME")])
442	                             ; See [IMAP4] for "section_text" and "nz_number"

444	     uidvalidity      = ";UIDVALIDITY=" nz_number
445	                             ; See [IMAP4] for "nz_number" definition

447	13. References

449	     [BASIC-URL] Berners-Lee, Masinter, McCahill, "Uniform Resource
450	     Locators (URL)", RFC 1738, CERN, Xerox Corporation, University of
451	     Minnesota, December 1994.

453	         <ftp://ds.internic.net/rfc/rfc1738.txt>

455	     [IMAP4] Crispin, M., "Internet Message Access Protocol - Version
456	     4rev1", RFC 2060, University of Washington, December 1996.

458	         <ftp://ds.internic.net/rfc/rfc2060.txt>

460	     [IMAP-AUTH] Myers, J., "IMAP4 Authentication Mechanisms", RFC 1731,
461	     Carnegie-Mellon University, December 1994.

463	         <ftp://ds.internic.net/rfc/rfc1731.txt>

465	     [HTTP] Fielding, Gettys, Mogul, Frystyk, Berners-Lee, "Hypertext
466	     Transfer Protocol -- HTTP/1.1", RFC 2068, UC Irvine, DEC, MIT/LCS,
467	     January 1997.

469	         <ftp://ds.internic.net/rfc/rfc2068.txt>

471	     [IMAIL] Crocker, "Standard for the Format of ARPA Internet Text
472	     Messages", STD 11, RFC 822, University of Delaware, August 1982.

474	         <ftp://ds.internic.net/rfc/rfc822.txt>

476	     [KEYWORDS] Bradner, "Key words for use in RFCs to Indicate
477	     Requirement Levels", RFC 2119, Harvard University, March 1997.

479	         <ftp://ds.internic.net/rfc/rfc2119.txt>

481	     [MIME] Freed, N., Borenstein, N., "Multipurpose Internet Mail
482	     Extensions", RFC 2045, Innosoft, First Virtual, November 1996.

484	        <ftp://ds.internic.net/rfc/rfc2045.txt>

486	     [REL-URL] Fielding, "Relative Uniform Resource Locators", RFC 1808,
487	     UC Irvine, June 1995.

489	         <ftp://ds.internic.net/rfc/rfc1808.txt>

491	     [UTF8] Yergeau, F. "UTF-8, a transformation format of Unicode and
492	     ISO 10646", RFC 2044, Alis Technologies, October 1996.

494	         <ftp://ds.internic.net/rfc/rfc2044.txt>

496	14. Author's Address

498	     Chris Newman
499	     Innosoft International, Inc.
500	     1050 East Garvey Ave. South
501	     West Covina, CA 91790 USA

503	     Email: chris.newman@innosoft.com

505	Appendix A.  Sample code

507	Here is sample C source code to convert between URL paths and IMAP
508	mailbox names, taking into account mapping between IMAP's modified UTF-7
509	[IMAP4] and hex-encoded UTF-8 which is more appropriate for URLs.  This
510	code has not been rigorously tested nor does it necessarily behave
511	reasonably with invalid input, but it should serve as a useful example.
512	This code just converts the mailbox portion of the URL and does not deal
513	with parameters, query or server components of the URL.

515	#include <stdio.h>
516	#include <string.h>
	#include <ctype.h>

518	/* hexadecimal lookup table */
519	static char hex[] = "0123456789ABCDEF";

521	/* URL unsafe printable characters */
522	static char urlunsafe[] = " \"#%&+:;<=>?@[\\]^`{|}";

524	/* UTF7 modified base64 alphabet */
525	static char base64chars[] =
526	  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,";
527	#define UNDEFINED 64

529	/* UTF16 definitions */
530	#define UTF16MASK       0x03FFUL
531	#define UTF16SHIFT      10
532	#define UTF16HIGHSTART  0xD800UL
533	#define UTF16HIGHEND    0xDBFFUL
534	#define UTF16LOSTART    0xDC00UL
535	#define UTF16LOEND      0xDFFFUL

537	/* Convert an IMAP mailbox to a URL path
538	 *  dst needs to have roughly 4 times the storage space of src
539	 *    Hex encoding can triple the size of the input
540	 *    UTF-7 can be slightly denser than UTF-8
541	 *     (worst case: 8 octets UTF-7 becomes 9 octets UTF-8)
542	 */
543	void MailboxToURL(char *dst, char *src)
544	{
545	    unsigned char c, i, bitcount;
546	    unsigned long ucs4, utf16, bitbuf;
547	    unsigned char base64[256], utf8[6];

549	    /* initialize modified base64 decoding table */
550	    memset(base64, UNDEFINED, sizeof (base64));
551	    for (i = 0; i < sizeof (base64chars); ++i) {
552	        base64[base64chars[i]] = i;

554	    }

556	    /* loop until end of string */
557	    while (*src != '\0') {
558	        c = *src++;
559	        /* deal with literal characters and &- */
560	        if (c != '&' || *src == '-') {
561	            if (c < ' ' || c > '~' || strchr(urlunsafe, c) != NULL) {
562	                /* hex encode if necessary */
563	                dst[0] = '%';
564	                dst[1] = hex[c >> 4];
565	                dst[2] = hex[c & 0x0f];
566	                dst += 3;
567	            } else {
568	                /* encode literally */
569	                *dst++ = c;
570	            }
571	            /* skip over the '-' if this is an &- sequence */
572	            if (c == '&') ++src;
573	        } else {
574	            /* convert modified UTF-7 -> UTF-16 -> UCS-4 -> UTF-8 -> HEX */
575	            bitbuf = 0;
576	            bitcount = 0;
577	            ucs4 = 0;
578	            while ((c = base64[(unsigned char) *src]) != UNDEFINED) {
579	                ++src;
580	                bitbuf = (bitbuf << 6) | c;
581	                bitcount += 6;
582	                /* enough bits for a UTF-16 character? */
583	                if (bitcount >= 16) {
584	                    bitcount -= 16;
585	                    utf16 = (bitcount ? bitbuf >> bitcount : bitbuf) & 0xffff;
586	                    /* convert UTF16 to UCS4 */
587	                    if (utf16 >= UTF16HIGHSTART && utf16 <= UTF16HIGHEND) {
588	                        ucs4 = (utf16 & UTF16MASK) << UTF16SHIFT;
589	                        continue;
590	                    } else if (utf16 >= UTF16LOSTART && utf16 <= UTF16LOEND) {
591	                        ucs4 = (ucs4 | (utf16 & UTF16MASK)) + 0x10000UL;
592	                    } else {
593	                        ucs4 = utf16;
594	                    }
595	                    /* convert UTF-16 range of UCS4 to UTF-8 */
596	                    if (ucs4 <= 0x7fUL) {
597	                        utf8[0] = ucs4;
598	                        i = 1;
599	                    } else if (ucs4 <= 0x7ffUL) {
600	                        utf8[0] = 0xc0 | (ucs4 >> 6);
601	                        utf8[1] = 0x80 | (ucs4 & 0x3f);
602	                        i = 2;
603	                    } else if (ucs4 <= 0xffffUL) {
604	                        utf8[0] = 0xe0 | (ucs4 >> 12);
605	                        utf8[1] = 0x80 | ((ucs4 >> 6) & 0x3f);
606	                        utf8[2] = 0x80 | (ucs4 & 0x3f);
607	                        i = 3;
608	                    } else {
609	                        utf8[0] = 0xf0 | (ucs4 >> 18);
610	                        utf8[1] = 0x80 | ((ucs4 >> 12) & 0x3f);
611	                        utf8[2] = 0x80 | ((ucs4 >> 6) & 0x3f);
612	                        utf8[3] = 0x80 | (ucs4 & 0x3f);
613	                        i = 4;
614	                    }
615	                    /* convert utf8 to hex */
616	                    for (c = 0; c < i; ++c) {
617	                        dst[0] = '%';
618	                        dst[1] = hex[utf8[c] >> 4];
619	                        dst[2] = hex[utf8[c] & 0x0f];
620	                        dst += 3;
621	                    }
622	                }
623	            }
624	            /* skip over trailing '-' in modified UTF-7 encoding */
625	            if (*src == '-') ++src;
626	        }
627	    }
628	    /* terminate destination string */
629	    *dst = '\0';
630	}
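
As an illustration only, the surrogate-pair and UTF-8 steps of MailboxToURL can be sketched in isolation. The helper names below are hypothetical and do not appear in the draft, and the sketch assumes well-formed input with no validation of the surrogate ranges.

```c
#include <stddef.h>

/* Hypothetical helper (not in the draft): combine a UTF-16 surrogate
 * pair into a single code point.  Assumes hi is in 0xD800..0xDBFF and
 * lo is in 0xDC00..0xDFFF; no validation is done. */
unsigned long surrogates_to_ucs4(unsigned long hi, unsigned long lo)
{
    return 0x10000UL + ((hi & 0x3FFUL) << 10) + (lo & 0x3FFUL);
}

/* Hypothetical helper (not in the draft): write ucs4 as %XX-escaped
 * UTF-8 into dst, which must hold at least 13 bytes.  Returns the
 * number of characters written, not counting the terminating NUL. */
size_t ucs4_to_hex_utf8(char *dst, unsigned long ucs4)
{
    static const char digits[] = "0123456789ABCDEF";
    unsigned char utf8[4];
    size_t n, k;

    if (ucs4 <= 0x7fUL) {
        utf8[0] = (unsigned char) ucs4;
        n = 1;
    } else if (ucs4 <= 0x7ffUL) {
        utf8[0] = (unsigned char) (0xc0 | (ucs4 >> 6));
        utf8[1] = (unsigned char) (0x80 | (ucs4 & 0x3f));
        n = 2;
    } else if (ucs4 <= 0xffffUL) {
        utf8[0] = (unsigned char) (0xe0 | (ucs4 >> 12));
        utf8[1] = (unsigned char) (0x80 | ((ucs4 >> 6) & 0x3f));
        utf8[2] = (unsigned char) (0x80 | (ucs4 & 0x3f));
        n = 3;
    } else {
        utf8[0] = (unsigned char) (0xf0 | (ucs4 >> 18));
        utf8[1] = (unsigned char) (0x80 | ((ucs4 >> 12) & 0x3f));
        utf8[2] = (unsigned char) (0x80 | ((ucs4 >> 6) & 0x3f));
        utf8[3] = (unsigned char) (0x80 | (ucs4 & 0x3f));
        n = 4;
    }
    /* emit each UTF-8 octet as a %XX triplet */
    for (k = 0; k < n; ++k) {
        *dst++ = '%';
        *dst++ = digits[utf8[k] >> 4];
        *dst++ = digits[utf8[k] & 0x0f];
    }
    *dst = '\0';
    return 3 * n;
}
```

For instance, the surrogate pair 0xD83D 0xDE00 combines to the code point 0x1F600, which this sketch would emit as the four escaped octets %F0%9F%98%80. The 0x10000 offset is what separates supplementary-plane code points from the 16-bit range.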

632	/* Convert hex coded UTF-8 URL path to modified UTF-7 IMAP mailbox
633	 *  dst should be about twice the length of src to deal with non-hex coded URLs
634	 */
635	void URLtoMailbox(char *dst, char *src)
636	{
637	    unsigned int utf8pos, utf8total, i, c, utf7mode, bitstogo, utf16flag;
638	    unsigned long ucs4 = 0, bitbuf = 0;
639	    unsigned char hextab[256];

641	    /* initialize hex lookup table */
642	    memset(hextab, 0, sizeof (hextab));
643	    for (i = 0; i < sizeof (hex); ++i) {
644	        hextab[hex[i]] = i;
645	        if (isupper(hex[i])) hextab[tolower(hex[i])] = i;
646	    }

648	    utf7mode = 0;
649	    utf8total = 0;
650	    bitstogo = 0;
651	    while ((c = (unsigned char) *src) != '\0') {
652	        ++src;
653	        /* undo hex-encoding */
654	        if (c == '%' && src[0] != '\0' && src[1] != '\0') {
655	            c = (hextab[src[0] & 0xff] << 4) | hextab[src[1] & 0xff];
656	            src += 2;
657	        }
658	        /* normal character? */
659	        if (c >= ' ' && c <= '~') {
660	            /* switch out of UTF-7 mode */
661	            if (utf7mode) {
662	                if (bitstogo) {
663	                    *dst++ = base64chars[(bitbuf << (6 - bitstogo)) & 0x3F];
664	                }
665	                *dst++ = '-';
666	                utf7mode = bitstogo = 0; /* leftover bits were flushed */
667	            }
668	            *dst++ = c;
669	            /* encode '&' as '&-' */
670	            if (c == '&') {
671	                *dst++ = '-';
672	            }
673	            continue;
674	        }
675	        /* switch to UTF-7 mode */
676	        if (!utf7mode) {
677	            *dst++ = '&';
678	            utf7mode = 1;
679	        }
680	        /* Encode US-ASCII characters as themselves */
681	        if (c < 0x80) {
682	            ucs4 = c;
683	            utf8total = 1;
684	        } else if (utf8total) {
685	            /* save UTF8 bits into UCS4 */
686	            ucs4 = (ucs4 << 6) | (c & 0x3FUL);
687	            if (++utf8pos < utf8total) {
688	                continue;
689	            }
690	        } else {
691	            utf8pos = 1;
692	            if (c < 0xE0) {
693	                utf8total = 2;
694	                ucs4 = c & 0x1F;
695	            } else if (c < 0xF0) {
696	                utf8total = 3;
697	                ucs4 = c & 0x0F;

699	            } else {
700	                /* NOTE: can't convert UTF8 sequences longer than 4 */
701	                utf8total = 4;
702	                ucs4 = c & 0x07;
703	            }
704	            continue;
705	        }
706	        /* loop to split ucs4 into two utf16 chars if necessary */
707	        utf8total = 0;
708	        do {
709	            if (ucs4 > 0xffffUL) {
710	                bitbuf = (bitbuf << 16) | (UTF16HIGHSTART +
711	                          ((ucs4 - 0x10000UL) >> UTF16SHIFT));
712	                ucs4 = (ucs4 & UTF16MASK) + UTF16LOSTART;
713	                utf16flag = 1;
714	            } else {
715	                bitbuf = (bitbuf << 16) | ucs4;
716	                utf16flag = 0;
717	            }
718	            bitstogo += 16;
719	            /* spew out base64 */
720	            while (bitstogo >= 6) {
721	                bitstogo -= 6;
722	                *dst++ = base64chars[(bitstogo ? (bitbuf >> bitstogo) : bitbuf)
723	                                     & 0x3F];
724	            }
725	        } while (utf16flag);
726	    }
727	    /* if in UTF-7 mode, finish in ASCII */
728	    if (utf7mode) {
729	        if (bitstogo) {
730	            *dst++ = base64chars[(bitbuf << (6 - bitstogo)) & 0x3F];
731	        }
732	        *dst++ = '-';
733	    }
734	    /* tie off string */
735	    *dst = '\0';
736	}
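
Two steps of URLtoMailbox can be sketched on their own: undoing a single %XX escape, and splitting a code point above 0xFFFF into a UTF-16 surrogate pair. This is a minimal sketch rather than the draft's code. The helper names are hypothetical, an ASCII execution character set is assumed, and, unlike the table lookup above, a malformed escape is reported to the caller instead of being decoded as zero.

```c
/* Hypothetical helper (not in the draft): value of one hex digit,
 * or -1 if ch is not a hex digit.  Assumes ASCII letter ordering. */
int hex_value(int ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

/* Hypothetical helper (not in the draft): decode the "%XX" escape
 * that starts at s.  Returns the octet value 0..255, or -1 if s does
 * not start with '%' followed by two hex digits.  Never reads past
 * the terminating NUL. */
int decode_percent(const char *s)
{
    int high, low;

    if (s[0] != '%') return -1;
    high = hex_value((unsigned char) s[1]);
    if (high < 0) return -1;
    low = hex_value((unsigned char) s[2]);
    if (low < 0) return -1;
    return (high << 4) | low;
}

/* Hypothetical helper (not in the draft): split a code point in the
 * range 0x10000..0x10FFFF into a UTF-16 surrogate pair.  The caller
 * is assumed to have checked the range. */
void ucs4_to_surrogates(unsigned long ucs4,
                        unsigned long *hi, unsigned long *lo)
{
    ucs4 -= 0x10000UL;                 /* remove the plane offset */
    *hi = 0xD800UL + (ucs4 >> 10);     /* top 10 bits */
    *lo = 0xDC00UL + (ucs4 & 0x3FFUL); /* bottom 10 bits */
}
```

With these helpers, decode_percent("%C3") yields 0xC3, and ucs4_to_surrogates applied to 0x1F600 yields the pair 0xD83D, 0xDE00. Returning -1 for a bad escape is a design choice of this sketch; a caller could equally copy the '%' through unchanged.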