idnits 2.17.1

draft-resnick-text-enriched-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

        This Internet-Draft is submitted in full conformance with the
        provisions of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

        Copyright (c) 2024 IETF Trust and the persons identified as the
        document authors. All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document. Please review these documents
        carefully, as they describe your rights and restrictions with respect
        to this document. Code Components extracted from this document must
        include Simplified BSD License text as described in Section 4.e of
        the Trust Legal Provisions and are provided without warranty as
        described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 959 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** There are 4 instances of too long lines in the document, the longest one
     being 8 characters in excess of 72.

  ** The abstract seems to contain references ([RFC-1866], [RFC-1563],
     [RFC-1521], [RFC-1523]), which it shouldn't. Please replace those with
     straight textual mentions of the documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (December 1995) is 10360 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'ISO-639' is mentioned on line 465, but not defined

  -- Looks like a reference, but probably isn't: '62' on line 886

  ** Obsolete normative reference: RFC 1341 (Obsoleted by RFC 1521)

  ** Obsolete normative reference: RFC 1521 (Obsoleted by RFC 2045, RFC 2046,
     RFC 2047, RFC 2048, RFC 2049)

  ** Obsolete normative reference: RFC 1523 (Obsoleted by RFC 1563, RFC 1896)

  ** Obsolete normative reference: RFC 1563 (Obsoleted by RFC 1896)

  ** Obsolete normative reference: RFC 1642 (Obsoleted by RFC 2152)

  ** Obsolete normative reference: RFC 1766 (Obsoleted by RFC 3066, RFC 3282)

  ** Obsolete normative reference: RFC 1866 (Obsoleted by RFC 2854)


     Summary: 17 errors (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group               P. Resnick
3	INTERNET-DRAFT                      A. Walker
4	To-obsolete RFCs: 1523, 1563        December 1995
5	Category: Informational             

7	                 The text/enriched MIME Content-type

9	Status of this Memo

11	This document is an Internet-Draft. Internet-Drafts are working
12	documents of the Internet Engineering Task Force (IETF), its areas,
13	and its working groups. Note that other groups may also distribute
14	working documents as Internet-Drafts.

16	Internet-Drafts are draft documents valid for a maximum of six
17	months and may be updated, replaced, or obsoleted by other documents
18	at any time. It is inappropriate to use Internet-Drafts as reference
19	material or to cite them other than as "work in progress."

21	To learn the current status of any Internet-Draft, please check the
22	"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
23	Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
24	munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
25	ftp.isi.edu (US West Coast).

27	Abstract

29	MIME [RFC-1521] defines a format and general framework for the
30	representation of a wide variety of data types in Internet mail.
31	This document defines one particular type of MIME data, the
32	text/enriched MIME type. The text/enriched MIME type is intended to
33	facilitate the wider interoperation of simple enriched text across a
34	wide variety of hardware and software platforms. This document is
35	only a minor revision to the text/enriched MIME type that was first
36	described in [RFC-1523] and [RFC-1563], and is only intended to be
37	used in the short term until other MIME types for text formatting in
38	Internet mail are developed and deployed.

40	The text/enriched MIME type

42	In order to promote the wider interoperability of simple formatted
43	text, this document defines an extremely simple subtype of the MIME
44	content-type "text", the "text/enriched" subtype. The content-type
45	line for this type may have one optional parameter, the "charset"
46	parameter, with the same values permitted for the "text/plain" MIME
47	content-type.

49	The text/enriched subtype was designed to meet the following
50	criteria:

52	  1. The syntax must be extremely simple to parse, so that even
53	     teletype-oriented mail systems can easily strip away the
54	     formatting information and leave only the readable text.

56	  2. The syntax must be extensible to allow for new formatting
57	     commands that are deemed essential for some application.

59	  3. If the character set in use is ASCII or an 8-bit ASCII
60	     superset, then the raw form of the data must be readable enough
61	     to be largely unobjectionable in the event that it is displayed
62	     on the screen of the user of a non-MIME-conformant mail reader.

64	  4. The capabilities must be extremely limited, to ensure that it
65	     can represent no more than is likely to be representable by the
66	     user's primary word processor. While this limits what can be
67	     sent, it increases the likelihood that what is sent can be
68	     properly displayed.

70	There are other text formatting standards which meet some of these
71	criteria. In particular, HTML and SGML have come into widespread use
72	on the Internet. However, there are two important reasons that this
73	document further promotes the use of text/enriched in Internet mail
74	over other such standards:

76	  1. Most MIME-aware Internet mail applications are already able to
77	     either properly format text/enriched mail or, at the very
78	     least, are able to strip out the formatting commands and
79	     display the readable text. The same is not true for HTML or
80	     SGML.

82	  2. The current RFC on HTML [RFC-1866] and Internet Drafts on SGML
83	     have many features which are not necessary for Internet mail,
84	     and are missing a few capabilities that text/enriched already
85	     has.

87	For these reasons, this document is promoting the use of
88	text/enriched until other Internet standards come into more
89	widespread use. For those who want to use HTML, Appendix B of
90	this document contains a very simple C program that converts
91	text/enriched to HTML 2.0 as described in [RFC-1866].

93	Syntax

95	The syntax of "text/enriched" is very simple. It represents text in
96	a single character set--US-ASCII by default, although a different
97	character set can be specified by the use of the "charset"
98	parameter. (The semantics of text/enriched in non-ASCII character
99	sets are discussed later in this document.) All characters represent
100	themselves, with the exception of the "<" character (ASCII 60),
101	which is used to mark the beginning of a formatting command. A
102	literal less-than sign ("<") can be represented by a sequence of two
103	such characters, "<<".

105	Formatting instructions consist of formatting commands surrounded by
106	angle brackets ("<>", ASCII 60 and 62). Each formatting command may
107	be no more than 60 characters in length, all in US-ASCII, restricted
108	to the alphanumeric and hyphen ("-") characters. Formatting commands
109	may be preceded by a solidus ("/", ASCII 47), making them negations,
110	and such negations must always exist to balance the initial opening
111	commands. Thus, if the formatting command "<bold>" appears at some
112	point, there must later be a "</bold>" to balance it. (NOTE: The 60
113	character limit on formatting commands does NOT include the "<",
114	">", or "/" characters that might be attached to such commands.)
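
The syntax rules above can be sketched as a small tokenizer. This is
only an illustration, not part of the specification: the function
name "tokenize", the token tuples, the lowercasing of command names,
and the lenient handling of a malformed "<" are all assumptions made
for this example.

```python
import re

# A command name is 1 to 60 US-ASCII alphanumeric or hyphen characters.
# The "<", ">", and "/" characters do not count toward the 60 limit.
COMMAND_RE = re.compile(r"</?([A-Za-z0-9-]{1,60})>")


def tokenize(data):
    """Split text/enriched data into ("text", s), ("open", name), and
    ("close", name) tokens. Names are lowercased (an assumption)."""
    tokens = []
    buf = []
    i = 0
    while i < len(data):
        if data.startswith("<<", i):
            buf.append("<")  # "<<" stands for one literal less-than sign
            i += 2
            continue
        if data[i] == "<":
            m = COMMAND_RE.match(data, i)
            if m:
                if buf:
                    tokens.append(("text", "".join(buf)))
                    buf = []
                kind = "close" if data[i + 1] == "/" else "open"
                tokens.append((kind, m.group(1).lower()))
                i = m.end()
                continue
            # Not a well-formed command: keep the "<" as ordinary text
            # (a lenient choice made for this sketch).
        buf.append(data[i])
        i += 1
    if buf:
        tokens.append(("text", "".join(buf)))
    return tokens
```

For instance, tokenize("a <<b <bold>c</bold>") yields the text
"a <b ", an opening "bold", the text "c", and a closing "bold".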

116	Line break rules

118	Line breaks (CRLF pairs in standard network representation) are
119	handled specially. In particular, isolated CRLF pairs are translated
120	into a single SPACE character. Sequences of N consecutive CRLF
121	pairs, however, are translated into N-1 actual line breaks. This
122	permits long lines of data to be represented in a natural looking
123	manner despite the frequency of line-wrapping in Internet mailers.
124	When preparing the data for mail transport, isolated line breaks
125	should be inserted wherever necessary to keep each line shorter than
126	80 characters. When preparing such data for presentation to the
127	user, isolated line breaks should be replaced by a single SPACE
128	character, and N consecutive CRLF pairs should be presented to the
129	user as N-1 line breaks.

131	Thus text/enriched data that looks like this:

133	     This is
134	     a single
135	     line

137	     This is the
138	     next line.

140	     This is the
141	     next section.

143	should be displayed by a text/enriched interpreter as follows:

145	     This is a single line
146	     This is the next line.

148	     This is the next section.
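
The presentation rule for line breaks can be sketched as follows.
This is an illustration under stated assumptions: the function name
"unfold_line_breaks" is hypothetical, a presented line break is
modeled as a single "\n", and text inside a "nofill" environment
(described later) is not handled.

```python
import re


def unfold_line_breaks(data):
    """Replace an isolated CRLF with one SPACE and a run of N
    consecutive CRLF pairs (N >= 2) with N-1 line breaks."""

    def replace_run(match):
        n = len(match.group(0)) // 2  # number of CRLF pairs in this run
        return " " if n == 1 else "\n" * (n - 1)

    return re.sub(r"(?:\r\n)+", replace_run, data)
```

Under this sketch, "a\r\nb" becomes "a b", "a\r\n\r\nb" becomes "a"
and "b" on consecutive lines, and "a\r\n\r\n\r\nb" leaves one empty
line between them.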

150	The formatting commands, not all of which will be implemented by all
151	implementations, are described in the following sections.

153	Formatting Commands

155	The text/enriched formatting commands all begin with <commandname>
156	and end with </commandname>, affecting the formatting of the text
157	between those two tokens. The commands are described here, grouped
158	according to type.

160	Parameter Command

162	Some of the formatting commands may require one or more associated
163	parameters. The "param" command is a special formatting command used
164	to include these parameters.

166	     Param
167	          Marks the affected text as command parameters, to be
168	          interpreted or ignored by the text/enriched
169	          interpreter, but not to be shown to the reader. The
170	          "param" command always immediately follows some other
171	          formatting command, and the parameter data indicates
172	          some additional information about the formatting that
173	          is to be done. The syntax of the parameter data
174	          (whatever appears between the initial "<param>" and
175	          the terminating "</param>") is defined for each
176	          command that uses it. However, it is always required
177	          that the format of such data must not contain nested
178	          "param" commands, and either must not use the "<"
179	          character or must use it in a way that is compatible
180	          with text/enriched parsing. That is, the end of the
181	          parameter data should be recognizable with either of
182	          two algorithms: simply searching for the first
183	          occurrence of "</param>" or parsing until a balanced
184	          "</param>" command is found. In either case, however,
185	          the parameter data should not be shown to the human
186	          reader.
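
The first of the two algorithms mentioned above, searching for the
first closing "param" command, might look like this. It is a minimal
sketch: the function name "split_param" and its (data, start)
calling convention are assumptions, as is matching the closing
command without regard to case.

```python
import re

# Closing "param" command, matched case-insensitively (an assumption).
PARAM_END_RE = re.compile(r"</param>", re.IGNORECASE)


def split_param(data, start):
    """Given the index just past an opening "param" command, return
    (parameter_data, index_after_closing_command)."""
    m = PARAM_END_RE.search(data, start)
    if m is None:
        raise ValueError("unterminated param command")
    # The parameter data is for the interpreter only and is never
    # shown to the reader.
    return data[start:m.start()], m.end()
```

A caller would use the returned index to resume ordinary parsing
immediately after the parameter data.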

188	Font-Alteration Commands

190	The following formatting commands are intended to alter the font in
191	which text is displayed, but not to alter the indentation or
192	justification state of the text:

194	     Bold
195	          causes the affected text to be in a bold font. Nested
196	          bold commands have the same effect as a single bold
197	          command.

199	     Italic
200	          causes the affected text to be in an italic font.
201	          Nested italic commands have the same effect as a
202	          single italic command.

204	     Underline
205	          causes the affected text to be underlined. Nested
206	          underline commands have the same effect as a single
207	          underline command.

209	     Fixed
210	          causes the affected text to be in a fixed width font.
211	          Nested fixed commands have the same effect as a
212	          single fixed command.

214	     FontFamily
215	          causes the affected text to be displayed in a
216	          specified typeface. The "fontfamily" command requires
217	          a parameter that is specified by using the "param"
218	          command. The parameter data is a case-insensitive
219	          string containing the name of a font family. Any
220	          currently available font family name (e.g. Times,
221	          Palatino, Courier, etc.) may be used. This includes
222	          font families defined by commercial type foundries
223	          such as Adobe, BitStream, or any other such foundry.
224	          Note that implementations should only use the general
225	          font family name, not the specific font name (e.g.
226	          use "Times", not "TimesRoman" nor "TimesBoldItalic").
227	          When nested, the inner "fontfamily" command takes
228	          precedence. Also note that the "fontfamily" command
229	          is advisory only; it should not be expected that
230	          other implementations will honor the typeface
231	          information in this command since the font
232	          capabilities of systems vary drastically.

234	     Color
235	          causes the affected text to be displayed in a
236	          specified color. The "color" command requires a
237	          parameter that is specified by using the "param"
238	          command. The parameter data can be one of the
239	          following:

241	               red
242	               blue
243	               green
244	               yellow
245	               cyan
246	               magenta
247	               black
248	               white

250	          or an RGB color value in the form:

252	               ####,####,####

254	          where '#' is a hexadecimal digit '0' through '9', 'A'
255	          through 'F', or 'a' through 'f'. The three 4-digit
256	          hexadecimal values are the RGB values for red, green,
257	          and blue respectively, where each component is
258	          expressed as an unsigned value between 0 (0000) and
259	          65535 (FFFF). The default color for the message is
260	          unspecified, though black is a common choice in many
261	          environments. When nested, the inner "color" command
262	          takes precedence.

264	     Smaller
265	          causes the affected text to be in a smaller font. It
266	          is recommended that the font size be changed by two
267	          points, but other amounts may be more appropriate in
268	          some environments. Nested smaller commands produce
269	          ever smaller fonts, to the limits of the
270	          implementation's capacity to reasonably display them,
271	          after which further smaller commands have no
272	          incremental effect.

274	     Bigger
275	          causes the affected text to be in a bigger font. It
276	          is recommended that the font size be changed by two
277	          points, but other amounts may be more appropriate in
278	          some environments. Nested bigger commands produce
279	          ever bigger fonts, to the limits of the
280	          implementation's capacity to reasonably display them,
281	          after which further bigger commands have no
282	          incremental effect.

284	While the "bigger" and "smaller" operators are effectively inverses,
285	it is not recommended, for example, that "<smaller>" be used to end
286	the effect of "<bigger>". This is properly done with "</bigger>".

288	Since the capabilities of implementations will vary, it is to be
289	expected that some implementations will not be able to act on some
290	of the font-alteration commands. However, an implementation should
291	still display the text to the user in a reasonable fashion. In
292	particular, the lack of capability to display a particular font
293	family, color, or other text attribute does not mean that an
294	implementation should fail to display text.
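
The parameter syntax of the "color" command described above can be
parsed along the following lines. This is a sketch, not a normative
algorithm: the function name "parse_color" is hypothetical, the RGB
values assigned to the eight color names are assumptions (this
document does not define them), and so is the tolerance for case and
surrounding white space. Returning None for an unusable value follows
the advice above that a missing capability should not prevent the
text from being displayed.

```python
import re

# Assumed 16-bit RGB values for the named colors; the names come from
# the list above, the numeric values are illustrative only.
NAMED_COLORS = {
    "red": (0xFFFF, 0x0000, 0x0000),
    "blue": (0x0000, 0x0000, 0xFFFF),
    "green": (0x0000, 0xFFFF, 0x0000),
    "yellow": (0xFFFF, 0xFFFF, 0x0000),
    "cyan": (0x0000, 0xFFFF, 0xFFFF),
    "magenta": (0xFFFF, 0x0000, 0xFFFF),
    "black": (0x0000, 0x0000, 0x0000),
    "white": (0xFFFF, 0xFFFF, 0xFFFF),
}

# Three 4-digit hexadecimal values separated by commas: ####,####,####
RGB_RE = re.compile(r"([0-9A-Fa-f]{4}),([0-9A-Fa-f]{4}),([0-9A-Fa-f]{4})")


def parse_color(param):
    """Return (red, green, blue), each 0..65535, or None if the
    parameter data is not a recognized color."""
    value = param.strip()
    named = NAMED_COLORS.get(value.lower())
    if named is not None:
        return named
    m = RGB_RE.fullmatch(value)
    if m:
        return tuple(int(part, 16) for part in m.groups())
    return None  # unrecognized: the caller keeps the current color
```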

296	Fill/Justification/Indentation Commands

298	Initially, text/enriched text is intended to be displayed fully
299	filled (that is, using the rules specified for replacing CRLF pairs
300	with spaces or removing them as appropriate) with appropriate
301	kerning and letter-tracking, and using the maximum available margins
302	as suits the capabilities of the receiving user agent software.

304	The following commands alter that state. Each of these commands
305	forces a line break before and after the formatting environment if
306	there is not otherwise a line break. For example, if one of these
307	commands occurs anywhere other than the beginning of a line of text
308	as presented, a new line is begun.

310	     Center
311	          causes the affected text to be centered.

313	     FlushLeft
314	          causes the affected text to be left-justified with a
315	          ragged right margin.

317	     FlushRight
318	          causes the affected text to be right-justified with a
319	          ragged left margin.

321	     FlushBoth
322	          causes the affected text to be filled and padded so
323	          as to create smooth left and right margins, i.e., to
324	          be fully justified.

326	     ParaIndent
327	          causes the running margins of the affected text to be
328	          moved in. The recommended indentation change is the
329	          width of four characters, but this may differ among
330	          implementations. The "paraindent" command requires a
331	          parameter that is specified by using the "param"
332	          command. The parameter data is a comma-separated list
333	          of one or more of the following:

335	          Left
336	               causes the running left margin to be moved to
337	               the right.

339	          Right
340	               causes the running right margin to be moved to
341	               the left.

343	          In
344	               causes the first line of the affected paragraph
345	               to be indented in addition to the running
346	               margin. The remaining lines remain flush to the
347	               running margin.

349	          Out
350	               causes all lines except for the first line of
351	               the affected paragraph to be indented in
352	               addition to the running margin. The first line
353	               remains flush to the running margin.

355	     Nofill
356	          causes the affected text to be displayed without
357	          filling. That is, the text is displayed without using
358	          the rules for replacing CRLF pairs with spaces or
359	          removing consecutive sequences of CRLF pairs.
360	          However, the current state of the margins and
361	          justification is honored; any indentation or
362	          justification commands are still applied to the text
363	          within the scope of the "nofill".

365	The "center", "flushleft", "flushright", and "flushboth" commands
366	are mutually exclusive, and, when nested, the inner command takes
367	precedence.

369	The "nofill" command is mutually exclusive with the "in" and "out"
370	parameters of the "paraindent" command; when they occur in the same
371	scope, their behavior is undefined.

373	The parameter data for the "paraindent" command may contain multiple
374	occurrences of the same parameter (i.e. "left", "right", "in", or
375	"out"). Each occurrence causes the text to be further indented in the
376	manner indicated by that parameter. Nested "paraindent" commands
377	cause the affected text to be further indented according to the
378	parameters. Note that the "in" and "out" parameters for "paraindent"
379	are mutually exclusive; when they appear together or when nested
380	"paraindent" commands contain both of them, their behavior is
381	undefined.
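
The counting of repeated "paraindent" parameters can be sketched as
follows. The function name "parse_paraindent" is hypothetical.
Skipping unknown keywords, ignoring case, and rejecting the "in"
plus "out" combination with an exception are choices made for this
example only, since that combination is left undefined above.

```python
from collections import Counter

VALID_KEYWORDS = ("left", "right", "in", "out")


def parse_paraindent(param):
    """Count how often each keyword occurs in the comma-separated
    parameter data of a "paraindent" command."""
    counts = Counter()
    for item in param.split(","):
        keyword = item.strip().lower()
        if keyword in VALID_KEYWORDS:
            counts[keyword] += 1
        # Unknown keywords are skipped (a lenient assumption).
    if counts["in"] and counts["out"]:
        # Undefined behavior in the text; this sketch rejects it.
        raise ValueError('"in" and "out" must not be combined')
    return counts
```

With the recommended step of four characters, parameter data of
"left,left,right" would move the left margin in by eight characters
and the right margin in by four.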

383	For purposes of the "in" and "out" parameters, a paragraph is
384	defined as text that is delimited by line breaks after applying the
385	rules for replacing CRLF pairs with spaces or removing consecutive
386	sequences of CRLF pairs. For example, within the scope of an "out",
387	the line following each CRLF is made flush with the running margin,
388	and subsequent lines are indented. Within the scope of an "in", the
389	first line following each CRLF is indented, and subsequent lines
390	remain flush to the running margin.

392	Whether or not text is justified by default (that is, whether the
393	default environment is "flushleft", "flushright", or "flushboth") is
394	unspecified, and depends on the preferences of the user, the
395	capabilities of the local software and hardware, and the nature of
396	the character set in use. On systems where full justification is
397	considered undesirable, the "flushboth" environment may be identical
398	to the default environment. Note that full justification should
399	never be performed inside of "center", "flushleft", "flushright", or
400	"nofill" environments. Note also that for some non-ASCII character
401	sets, full justification may be fundamentally inappropriate.

403	Note that [RFC-1563] defined two additional indentation commands,
404	"Indent" and "IndentRight". These commands did not force a line
405	break, and therefore their behavior was unpredictable since they
406	depended on the margins and character sizes that a particular
407	implementation used. Therefore, their use is deprecated and they
408	should be ignored, just like other unrecognized commands.

410	Markup Commands

412	Commands in this section, unlike the other text/enriched commands,
413	are declarative markup commands. Text/enriched is not intended as a
414	full markup language, but instead as a simple way to represent
415	common formatting commands. Therefore, markup commands are purposely
416	kept to a minimum. It is only because each was deemed so prevalent
417	or necessary in an e-mail environment that these particular commands
418	have been included at all.

420	     Excerpt
421	          causes the affected text to be interpreted as a
422	          textual excerpt from another source, probably a
423	          message being responded to. Typically this will be
424	          displayed using indentation and an alternate font, or
425	          by indenting lines and preceding them with "> ", but
426	          such decisions are up to the implementation. Note
427	          that as with the justification commands, the excerpt
428	          command implicitly begins and ends with a line break
429	          if one is not already there. Nested "excerpt"
430	          commands are acceptable and should be interpreted as
431	          meaning that the excerpted text was excerpted from
432	          yet another source. Again, this can be displayed
433	          using additional indentation, different colors, etc.

435	          Optionally, the "excerpt" command can take a
436	          parameter by using the "param" command. The format of
437	          the data is unspecified, but it is intended to
438	          uniquely identify the text from which the excerpt is
439	          taken. With this information, an implementation
440	          should be able to uniquely identify the source of any
441	          particular excerpt, especially if two or more
442	          excerpts in the message are from the same source, and
443	          display it in some way that makes this apparent to
444	          the user.

446	     Lang
447	          causes the affected text to be interpreted as
448	          belonging to a particular language. This is most
449	          useful when two different languages use the same
450	          character set, but may require a different font or
451	          formatting depending on the language. For instance,
452	          Chinese and Japanese share similar character glyphs,
453	          and in some character sets like UNICODE share common
454	          code points, but it is considered very important that
455	          different fonts be used for the two languages,
456	          especially if they appear together, so that meaning
457	          is not lost. Also, language information can be used
458	          to allow for fancier text handling, like spell
459	          checking or hyphenation.

461	          The "lang" command requires a parameter using the
462	          "param" command. The parameter data can be any of the
463	          language tags specified in [RFC-1766], "Tags for the
464	          Identification of Languages". These tags are the two
465	          letter language codes taken from [ISO-639] or can be
466	          other language codes that are registered according to
467	          the instructions in the Language Tags RFC. Consult
468	          that memo for further information.

470	Balancing and Nesting of Formatting Commands

472	Pairs of formatting commands must be properly balanced and nested.
473	Thus, a proper way to describe text in bold italics is:

475	     <bold><italic>the-text</italic></bold>

477	or, alternately,

479	     <italic><bold>the-text</bold></italic>

481	but, in particular, the following is illegal text/enriched:

483	     <bold><italic>the-text</bold></italic>

485	The nesting requirement for formatting commands imposes a slightly
486	higher burden upon the composers of text/enriched bodies, but
487	potentially simplifies text/enriched displayers by allowing them to
488	be stack-based. The main goal of text/enriched is to be simple
489	enough to make multifont, formatted email widely readable, so that
490	those with the capability of sending it will be able to do so with
491	confidence. Thus slightly increased complexity in the composing
492	software was deemed a reasonable tradeoff for simplified reading
493	software. Nonetheless, implementors of text/enriched readers are
494	encouraged to follow the general Internet guidelines of being
495	conservative in what you send and liberal in what you accept. Those
496	implementations that can do so are encouraged to deal reasonably
497	with improperly nested text/enriched data.
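
The stack-based approach mentioned above can be sketched as a
checker that reports improperly nested commands. The function name
"check_nesting", the wording of the messages, and the lowercasing of
command names are assumptions made for this example. A real reader
might recover from such problems rather than merely report them.

```python
import re

# Either the literal "<<" or an opening/closing formatting command.
TOKEN_RE = re.compile(r"<<|<(/?)([A-Za-z0-9-]{1,60})>")


def check_nesting(data):
    """Return a list of nesting problems; an empty list means every
    command is properly balanced and nested."""
    stack = []
    problems = []
    for m in TOKEN_RE.finditer(data):
        if m.group(0) == "<<":
            continue  # literal less-than sign, not a command
        name = m.group(2).lower()
        if not m.group(1):
            stack.append(name)  # opening command
        elif stack and stack[-1] == name:
            stack.pop()  # closing command matches the innermost open one
        else:
            problems.append("unexpected </%s>" % name)
    problems.extend("unclosed <%s>" % name for name in reversed(stack))
    return problems
```

Bold text nested inside italic text (or the reverse) produces an
empty list, while closing "bold" before an inner "italic" has been
closed is reported as a problem.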

499	Unrecognized formatting commands

501	Implementations must regard any unrecognized formatting command as
502	a "no-op" command, that is, as a command having no effect, thus
503	facilitating future extensions to "text/enriched". Private
504	extensions may be defined using formatting commands that begin with
505	"X-", by analogy to Internet mail header field names.

507	In order to formally define extended commands, a new Internet
508	document should be published.

510	White Space in Text/enriched Data

512	No special behavior is required for the SPACE or TAB (HT) character.
513	It is recommended, however, that, at least when fixed-width fonts
514	are in use, the common semantics of the TAB (HT) character should be
515	observed, namely that it moves to the next column position that is a
516	multiple of 8. (In other words, if a TAB (HT) occurs in column n,
517	where the leftmost column is column 0, then that TAB (HT) should be
518	replaced by 8-(n mod 8) SPACE characters.) It should also be noted
519	that some mail gateways are notorious for losing (or, less commonly,
520	adding) white space at the end of lines, so reliance on SPACE or TAB
521	characters at the end of a line is not recommended.
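
The recommended TAB semantics can be sketched as follows, assuming a
fixed-width font in which every other character occupies exactly one
column. The function name "expand_tabs" and the "tab_width" argument
are hypothetical.

```python
def expand_tabs(line, tab_width=8):
    """Replace each TAB in column n (leftmost column is 0) with
    tab_width - (n mod tab_width) SPACE characters."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            pad = tab_width - (column % tab_width)
            out.append(" " * pad)
            column += pad
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

A TAB in column 2 is therefore replaced by six SPACE characters, and
a TAB in column 8 by eight.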

523	Initial State of a text/enriched interpreter

525	Text/enriched is assumed to begin with filled text in a
526	variable-width font in a normal typeface and a size that is average
527	for the current display and user. The left and right margins are
528	assumed to be maximal, that is, at the leftmost and rightmost
529	acceptable positions.

531	Non-ASCII character sets

533	One of the great benefits of MIME is the ability to use different
534	varieties of non-ASCII text in messages. To use non-ASCII text in a
535	message, normally a charset parameter is specified in the
536	Content-type line that indicates the character set being used. For
537	purposes of this RFC, any legal MIME charset parameter can be used
538	with the text/enriched Content-type. However, there are two
539	difficulties that arise with regard to the text/enriched
540	Content-type when non-ASCII text is desired. The first problem
541	involves difficulties that occur when the user wishes to create text
542	which would normally require multiple non-ASCII character sets in
543	the same text/enriched message. The second problem is an ambiguity
544	that arises because of the text/enriched use of the "<" character in
545	formatting commands.

547	Using multiple non-ASCII character sets

549	Normally, if a user wishes to produce text which contains characters
550	from entirely different character sets within the same MIME message
551	(for example, using Russian Cyrillic characters from ISO 8859-5 and
552	Hebrew characters from ISO 8859-8), a multipart message is used.
553	Every time a new character set is desired, a new MIME body part is
554	started with different character sets specified in the charset
555	parameter of the Content-type line. However, using multiple
556	character sets this way in text/enriched messages introduces
557	problems. Since a change in the charset parameter requires a new
558	part, text/enriched formatting commands used in the first part would
559	not be able to apply to text that occurs in subsequent parts. It is
560	not possible for text/enriched formatting commands to apply across
561	MIME body part boundaries.

563	[RFC-1341] attempted to get around this problem in the now obsolete
564	text/richtext format by introducing different character set
565	formatting commands like "iso-8859-5" and "us-ascii". But this, or
566	even a more general solution along the same lines, is still
567	undesirable: It is common for a MIME application to decide, for
568	example, what character font resources or character lookup tables it
569	will require based on the information provided by the charset
570	parameter of the Content-type line, before it even begins to
571	interpret or display the data in that body part. By allowing the
572	text/enriched interpreter to subsequently change the character set,
573	perhaps to one completely different from the charset specified in
574	the Content-type line (with potentially much different resource
575	requirements), too much burden would be placed on the text/enriched
576	interpreter itself.

578	Therefore, if multiple types of non-ASCII characters are desired in
579	a text/enriched document, one of the following two methods must be
580	used:

582	  1. For cases where the different types of non-ASCII text can be
583	     limited to their own paragraphs with distinct formatting, a
584	     multipart message can be used with each part having a
585	     Content-Type of text/enriched and a different charset
586	     parameter. The one caveat to using this method is that each new
587	     part must start in the initial state for a text/enriched
588	     document. That means that all of the text/enriched commands in
589	     the preceding part must be properly balanced with ending
590	     commands before the next text/enriched part begins. Also, each
591	     text/enriched part must begin a new paragraph.

593	  2. If different types of non-ASCII text are to appear in the same
594	     line or paragraph, or if text/enriched formatting (e.g.
595	     margins, typeface, justification) is required across several
596	     different types of non-ASCII text, a single text/enriched body
597	     part should be used with a character set specified that
598	     contains all of the required characters. For example, a charset
599	     parameter of "UNICODE-1-1-UTF-7" as specified in [RFC-1642]
600	     could be used for such purposes. Not only does UNICODE contain
601	     all of the characters that can be represented in all of the
602	     other registered ISO 8859 MIME character sets, but UTF-7 is
603	     fully compatible with other aspects of the text/enriched
604	     standard, including the use of the "<" character referred to
605	     below. Any other character sets that are specified for use in
606	     MIME which contain different types of non-ASCII text can also
607	     be used in these instances.

609	Use of the "<" character in formatting commands

611	If the character set specified by the charset parameter on the
612	Content-type line is anything other than "US-ASCII", this means
613	that the text being described by text/enriched formatting commands
614	is in a non-ASCII character set. However, the commands themselves
615	are still the same ASCII commands that are defined in this document.
616	This creates an ambiguity only with reference to the "<" character,
617	the octet with numeric value 60. In single byte character sets, such
618	as the ISO-8859 family, this is not a problem; the octet 60 can be
619	quoted by including it twice, just as for ASCII. The problem is more
620	complicated, however, in the case of multi-byte character sets,
621	where the octet 60 might appear at any point in the byte sequence
622	for any of several characters.

624	In practice, however, most multi-byte character sets address this
625	problem internally. For example, the UNICODE character sets can use
626	the UTF-7 encoding which preserves all of the important ASCII
627	characters in their single byte form. The ISO-2022 family of
628	character sets can use certain character sequences to switch back
629	into ASCII at any moment. Therefore it is specified that, before
630	text/enriched formatting commands, the prevailing character set
631	should be "switched back" into ASCII, and that only those characters
632	which would be interpreted as "<" in plain text should be
633	interpreted as token delimiters in text/enriched.

635	The question of what to do for hypothetical future character sets
636	that do not subsume ASCII is not addressed in this memo.
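
As an illustration of the quoting rule for single-byte character sets
described above, the following is a minimal sketch in C. The function
name enriched_quote_lt and its buffer interface are hypothetical and
are not defined by this memo. The sketch assumes an ASCII-compatible
single-byte character set, such as a member of the ISO-8859 family,
and does not attempt to handle multi-byte character sets.

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption, not part of this memo):
 * copy src to dst, doubling every '<' (octet 60) so that a
 * text/enriched reader treats it as a literal character and not
 * as the start of a formatting command.
 *
 * Returns the number of octets written, not counting the
 * terminating NUL, or (size_t)-1 if dst is too small.
 */
size_t enriched_quote_lt(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    if (dstsize == 0)
        return (size_t)-1;
    for (; *src != '\0'; src++) {
        size_t need = (*src == '<') ? 2 : 1;

        /* Leave room for the octet(s) and the final NUL */
        if (n + need >= dstsize)
            return (size_t)-1;
        if (*src == '<')
            dst[n++] = '<';
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}
```

Under these assumptions, calling enriched_quote_lt on the text
"a < b" yields "a << b", which a minimal reader converts back to
"a < b" on display.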

638	Minimal text/enriched conformance

640	A minimal text/enriched implementation is one that converts "<<" to
641	"<", removes everything between a <param> command and the next
642	balancing </param> command, removes all other formatting commands
643	(all text enclosed in angle brackets), and, outside of <nofill>
644	environments, converts any series of n CRLFs to n-1 CRLFs, and
645	converts any lone CRLF pairs to SPACE.

647	Notes for Implementors

649	It is recognized that implementors of future mail systems will want
650	rich text functionality far beyond that currently defined for
651	text/enriched. The intent of text/enriched is to provide a common
652	format for expressing that functionality in a form in which much of
653	it, at least, will be understood by interoperating software. Thus,
654	in particular, software with a richer notion of formatted text than
655	text/enriched can still use text/enriched as its basic
656	representation, but can extend it with new formatting commands and
657	by hiding information specific to that software system in
658	text/enriched <param> constructs. As such systems evolve, it is
659	expected that the definition of text/enriched will be further
660	refined by future published specifications, but text/enriched as
661	defined here provides a platform on which evolutionary refinements
662	can be based.

664	An expected common way that sophisticated mail programs will
665	generate text/enriched data is as part of a multipart/alternative
666	construct. For example, a mail agent that can generate enriched mail
667	in ODA format can generate that mail in a more widely interoperable
668	form by generating both text/enriched and ODA versions of the same
669	data, e.g.:

671	     Content-type: multipart/alternative; boundary=foo

673	     --foo
674	     Content-type: text/enriched

676	     [text/enriched version of data]
677	     --foo
	     Content-type: application/oda

679	     [ODA version of data]
680	     --foo--

682	If such a message is read using a MIME-conformant mail reader that
683	understands ODA, the ODA version will be displayed; otherwise, the
684	text/enriched version will be shown.

686	In some environments, it might be impossible to combine certain
687	text/enriched formatting commands, whereas in others they might be
688	combined easily. For example, the combination of <bold> and <italic>
689	might produce bold italics on systems that support such fonts, but
690	there exist systems that can make text bold or italicized, but not
691	both. In such cases, the most recently issued (innermost) recognized
692	formatting command should be preferred.
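
The "innermost recognized command wins" rule can be sketched in C as
a small stack of open commands. This is only an illustration for a
hypothetical display that can show one font style at a time; the
names style_stack, style_push, style_pop and style_effective, and
the choice of "bold", "italic" and "fixed" as the recognized set,
are assumptions made for the example and are not defined by this
memo.

```c
#include <stddef.h>
#include <string.h>

#define STYLE_MAX_DEPTH 32

/* Hypothetical stack of currently open formatting commands */
struct style_stack {
    const char *cmd[STYLE_MAX_DEPTH];
    int depth;
};

/* Assumption: this display recognizes only these three styles */
static int style_recognized(const char *cmd)
{
    return strcmp(cmd, "bold") == 0 ||
           strcmp(cmd, "italic") == 0 ||
           strcmp(cmd, "fixed") == 0;
}

/* Record an opening command; nesting beyond the limit is counted
   but not stored, so that pops stay balanced. */
void style_push(struct style_stack *s, const char *cmd)
{
    if (s->depth < STYLE_MAX_DEPTH)
        s->cmd[s->depth] = cmd;
    s->depth++;
}

/* Record a closing command */
void style_pop(struct style_stack *s)
{
    if (s->depth > 0)
        s->depth--;
}

/* Return the innermost recognized open command, or NULL if none */
const char *style_effective(const struct style_stack *s)
{
    int i = s->depth < STYLE_MAX_DEPTH ? s->depth : STYLE_MAX_DEPTH;

    while (--i >= 0)
        if (style_recognized(s->cmd[i]))
            return s->cmd[i];
    return NULL;
}
```

With "bold" pushed first and "italic" pushed second, style_effective
returns "italic"; an unrecognized command pushed on top of those does
not change the result.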

694	One of the major goals in the design of text/enriched was to make it
695	so simple that even text-only mailers will implement enriched-to-
696	plain-text translators, thus increasing the likelihood that enriched
697	text will become "safe" to use very widely. To demonstrate this
698	simplicity, an extremely simple C program that converts
699	text/enriched input into plain text output is included in Appendix
700	A.

702	Extensions to text/enriched

704	It is expected that various mail system authors will desire
705	extensions to text/enriched. The simple syntax of text/enriched, and
706	the specification that unrecognized formatting commands should
707	simply be ignored, are intended to promote such extensions.

709	An Example

711	Putting all this together, the following "text/enriched" body
712	fragment:

714	     From: Nathaniel Borenstein 
715	     To: Ned Freed 
716	     Content-type: text/enriched

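718	     <bold>Now</bold> is the time for <italic>all</italic>
719	     good men
720	     <smaller>(and <<women>)</smaller> to
721	     <ignoreme>come</ignoreme>

723	     to the aid of their

725	     <color><param>red</param>beloved</color>
726	     country.

728	     By the way,
729	     I think that <paraindent><param>left</param><<smaller>

731	     </paraindent>should REALLY be called

733	     <paraindent><param>left</param><<tinier></paraindent>
734	     and that I am always right.

736	     -- the end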

738	represents the following formatted text (which will, no doubt, look
739	somewhat cryptic in the text-only version of this document):

741	     Now is the time for all good men (and <women>) to come
742	     to the aid of their

744	     beloved country.
745	     By the way, I think that
746	          <smaller>
747	     should REALLY be called
748	          <tinier>
749	     and that I am always right.
750	     -- the end

752	where the word "beloved" would be in red on a color display.

754	Security Considerations

756	Security issues are not discussed in this memo, as the mechanism
757	raises no security issues.

759	Authors' Addresses

761	For more information, the authors of this document may be contacted
762	via Internet mail:

764	                          Peter W. Resnick
765	                        QUALCOMM Incorporated
766	                       1009 North Busey Avenue
767	                        Urbana, IL 61801-1607
768	                       Phone: +1 217 337 1905
769	                        FAX: +1 217 337 1905
770	                    e-mail: presnick@qualcomm.com

772	                            Amanda Walker
773	                    InterCon Systems Corporation
774	                         950 Herndon Parkway
775	                          Herndon, VA 22070
776	                       Phone: +1 703 709 5500
777	                        FAX: +1 703 709 5555
778	                     e-mail: amanda@intercon.com

780	Acknowledgements

782	References

784	[RFC-1341]
785	[RFC-1521]
786	[RFC-1523]
787	[RFC-1563]
788	[RFC-1642]
789	[RFC-1766]
790	[RFC-1866]

792	Appendix A--A Simple enriched-to-plain Translator in C

794	One of the major goals in the design of the text/enriched subtype of
795	the text Content-Type is to make formatted text so simple that even
796	text-only mailers will implement enriched-to-plain-text translators,
797	thus increasing the likelihood that multifont text will become
798	"safe" to use very widely. To demonstrate this simplicity, what
799	follows is a simple C program that converts text/enriched input into
800	plain text output. Note that the local newline convention (the
801	single character represented by "\n") is assumed by this program,
802	but that special CRLF handling might be necessary on some systems.

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    int c, i, paramct=0, newlinect=0, nofill=0;
    char token[62], *p;

    while ((c=getc(stdin)) != EOF) {
        if (c == '<') {
            /* A lone pending newline is emitted as a space */
            if (newlinect == 1) putc(' ', stdout);
            newlinect = 0;
            c = getc(stdin);
            if (c == '<') {
                /* "<<" is a quoted literal "<" */
                if (paramct <= 0) putc(c, stdout);
            } else {
                /* Read the command name, folded to lower case */
                ungetc(c, stdin);
                for (i=0, p=token;
                     (c=getc(stdin)) != EOF && c != '>'; i++) {
                    if (i < (int)sizeof(token)-1)
                        *p++ = isupper(c) ? tolower(c) : c;
                }
                *p = '\0';
                if (c == EOF) break;
                if (strcmp(token, "param") == 0)
                    paramct++;
                else if (strcmp(token, "nofill") == 0)
                    nofill++;
                else if (strcmp(token, "/param") == 0)
                    paramct--;
                else if (strcmp(token, "/nofill") == 0)
                    nofill--;
                /* All other commands are silently dropped */
            }
        } else {
            if (paramct > 0)
                ; /* ignore params */
            else if (c == '\n' && nofill <= 0) {
                /* A series of n newlines becomes n-1 newlines */
                if (++newlinect > 1) putc(c, stdout);
            } else {
                if (newlinect == 1) putc(' ', stdout);
                newlinect = 0;
                putc(c, stdout);
            }
        }
    }
    /* The following line is only needed with line-buffering */
    putc('\n', stdout);
    exit(0);
}

853	It should be noted that one can do considerably better than this in
854	displaying text/enriched data on a dumb terminal. In particular, one
855	can replace font information such as "bold" with textual emphasis
856	(like *this* or _T_H_I_S_). One can also properly handle the
857	text/enriched formatting commands regarding indentation,
858	justification, and others. However, the above program is all that is
859	necessary in order to present text/enriched on a dumb terminal
860	without showing the user any formatting artifacts.
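
As one way of picturing the textual emphasis mentioned above, the
following is a minimal sketch in C that renders a word in the
"_T_H_I_S_" style. The function name emphasize_underscore and its
buffer interface are hypothetical, and folding to upper case is an
assumption taken from the appearance of the sample in the text; a
real translator would call something like this for the text found
between an opening command and its balancing closing command.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption, not part of this memo):
 * write src to dst in upper case with an underscore before each
 * character and one after the last, e.g. "this" -> "_T_H_I_S_".
 * An empty src produces an empty dst.
 *
 * Returns 0 on success, or -1 if dst is too small.
 */
int emphasize_underscore(const char *src, char *dst, size_t dstsize)
{
    size_t len = strlen(src), n = 0, i;

    if (len == 0) {
        if (dstsize < 1)
            return -1;
        dst[0] = '\0';
        return 0;
    }
    /* len characters, len+1 underscores, and the final NUL */
    if (dstsize < 2 * len + 2)
        return -1;
    for (i = 0; i < len; i++) {
        dst[n++] = '_';
        dst[n++] = (char)toupper((unsigned char)src[i]);
    }
    dst[n++] = '_';
    dst[n] = '\0';
    return 0;
}
```

The "*this*" style is simpler still: the translator writes "*" in
place of both the opening and the closing command.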

862	Appendix B--A Simple enriched-to-HTML Translator in C

864	It is fully expected that other text formatting standards like HTML
865	and SGML will supplant text/enriched in Internet mail. It is also
866	likely that as this happens, recipients of text/enriched mail will
867	wish to view such mail with an HTML viewer. To this end, the
868	following is a simple example of a C program to convert
869	text/enriched to HTML. Since the current version of HTML at the time
870	of this document's publication is HTML 2.0 defined in [RFC-1866],
871	this program converts to that standard. There are several
872	text/enriched commands that have no HTML 2.0 equivalent. In those
873	cases, this program simply puts those commands into processing
874	instructions; that is, surrounded by "<?" and ">". As in Appendix A,
875	the local newline convention (the single character represented by
876	"\n") is assumed by this program, but special CRLF handling might be
877	necessary on some systems.

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    int c, i, paramct=0, nofill=0;
    char token[62], *p;

    while((c=getc(stdin)) != EOF) {
        if(c == '<') {
            c = getc(stdin);
            if(c == '<') {
                /* "<<" is a literal "<"; escape it for HTML */
                fputs("&lt;", stdout);
            } else {
                /* Read the command name, folded to lower case */
                ungetc(c, stdin);
                for (i=0, p=token;
                     (c=getc(stdin)) != EOF && c != '>'; i++) {
                    if (i < (int)sizeof(token)-1)
                        *p++ = isupper(c) ? tolower(c) : c;
                }
                *p = '\0';
                if(c == EOF) break;
                if(strcmp(token, "/param") == 0) {
                    /* Close the "<?param " processing instruction */
                    paramct--;
                    putc('>', stdout);
                } else if(paramct > 0) {
                    /* Commands inside a param are passed as text */
                    fputs("&lt;", stdout);
                    fputs(token, stdout);
                    fputs("&gt;", stdout);
                } else {
                    putc('<', stdout);
                    if(strcmp(token, "nofill") == 0) {
                        nofill++;
                        fputs("pre", stdout);
                    } else if(strcmp(token, "/nofill") == 0) {
                        nofill--;
                        fputs("/pre", stdout);
                    } else if(strcmp(token, "bold") == 0) {
                        fputs("b", stdout);
                    } else if(strcmp(token, "/bold") == 0) {
                        fputs("/b", stdout);
                    } else if(strcmp(token, "italic") == 0) {
                        fputs("i", stdout);
                    } else if(strcmp(token, "/italic") == 0) {
                        fputs("/i", stdout);
                    } else if(strcmp(token, "fixed") == 0) {
                        fputs("tt", stdout);
                    } else if(strcmp(token, "/fixed") == 0) {
                        fputs("/tt", stdout);
                    } else if(strcmp(token, "excerpt") == 0) {
                        fputs("blockquote", stdout);
                    } else if(strcmp(token, "/excerpt") == 0) {
                        fputs("/blockquote", stdout);
                    } else {
                        /* No HTML 2.0 equivalent: emit "<?token>" */
                        putc('?', stdout);
                        fputs(token, stdout);
                        if(strcmp(token, "param") == 0) {
                            /* Left open until "/param" is seen */
                            paramct++;
                            putc(' ', stdout);
                            continue;
                        }
                    }
                    putc('>', stdout);
                }
            }
        } else if(c == '>') {
            fputs("&gt;", stdout);
        } else if(c == '&') {
            /* A literal "&" must also be escaped in HTML output */
            fputs("&amp;", stdout);
        } else {
            if(c == '\n' && nofill <= 0 && paramct <= 0) {
                /* Each additional newline becomes a line break */
                while((i=getc(stdin)) == '\n') fputs("<br>", stdout);
                ungetc(i, stdin);
            }
            putc(c, stdout);
        }
    }
    /* The following line is only needed with line-buffering */
    putc('\n', stdout);
    exit(0);
}