| < draft-resnick-text-enriched-01.txt | draft-resnick-text-enriched-02.txt > | |||
|---|---|---|---|---|
| Network Working Group P. Resnick | Network Working Group P. Resnick | |||
| INTERNET-DRAFT A. Walker | INTERNET-DRAFT QUALCOMM | |||
| To-obsolete RFCs: 1523, 1563 December 1995 | To-obsolete RFCs: 1523, 1563 A. Walker | |||
| Category: Informational <draft-resnick-text-enriched-01.txt> | Category: Informational InterCon | |||
| January 1996 | ||||
| <draft-resnick-text-enriched-02.txt> | ||||
| The text/enriched MIME Content-type | The text/enriched MIME Content-type | |||
| Status of this Memo | Status of this Memo | |||
| This document is an Internet-Draft. Internet-Drafts are working | This document is an Internet-Draft. Internet-Drafts are working | |||
| documents of the Internet Engineering Task Force (IETF), its areas, | documents of the Internet Engineering Task Force (IETF), its areas, and | |||
| and its working groups. Note that other groups may also distribute | its working groups. Note that other groups may also distribute working | |||
| working documents as Internet-Drafts. | documents as Internet-Drafts. | |||
| Internet-Drafts are draft documents valid for a maximum of six | Internet-Drafts are draft documents valid for a maximum of six months | |||
| months and may be updated, replaced, or obsoleted by other documents | and may be updated, replaced, or obsoleted by other documents at any | |||
| at any time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference material | |||
| material or to cite them other than as "work in progress." | or to cite them other than as "work in progress." | |||
| To learn the current status of any Internet-Draft, please check the | To learn the current status of any Internet-Draft, please check the | |||
| "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow | "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow | |||
| Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), | Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), | |||
| munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or | munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or | |||
| ftp.isi.edu (US West Coast). | ftp.isi.edu (US West Coast). | |||
| Abstract | Abstract | |||
| MIME [RFC-1521] defines a format and general framework for the | MIME [RFC-1521] defines a format and general framework for the | |||
| representation of a wide variety of data types in Internet mail. | representation of a wide variety of data types in Internet mail. This | |||
| This document defines one particular type of MIME data, the | document defines one particular type of MIME data, the text/enriched | |||
| text/enriched MIME type. The text/enriched MIME type is intended to | MIME type. The text/enriched MIME type is intended to facilitate the | |||
| facilitate the wider interoperation of simple enriched text across a | wider interoperation of simple enriched text across a wide variety of | |||
| wide variety of hardware and software platforms. This document is | hardware and software platforms. This document is only a minor revision | |||
| only a minor revision to the text/enriched MIME type that was first | to the text/enriched MIME type that was first described in [RFC-1523] | |||
| described in [RFC-1523] and [RFC-1563], and is only intended to be | and [RFC-1563], and is only intended to be used in the short term until | |||
| used in the short term until other MIME types for text formatting in | other MIME types for text formatting in Internet mail are developed and | |||
| Internet mail are developed and deployed. | deployed. | |||
| The text/enriched MIME type | The text/enriched MIME type | |||
| In order to promote the wider interoperability of simple formatted | In order to promote the wider interoperability of simple formatted text, | |||
| text, this document defines an extremely simple subtype of the MIME | this document defines an extremely simple subtype of the MIME | |||
| content-type "text", the "text/enriched" subtype. The content-type | content-type "text", the "text/enriched" subtype. The content-type line | |||
| line for this type may have one optional parameter, the "charset" | for this type may have one optional parameter, the "charset" parameter, | |||
| parameter, with the same values permitted for the "text/plain" MIME | with the same values permitted for the "text/plain" MIME content-type. | |||
| content-type. | ||||
| The text/enriched subtype was designed to meet the following | The text/enriched subtype was designed to meet the following criteria: | |||
| criteria: | ||||
| 1. The syntax must be extremely simple to parse, so that even | 1. The syntax must be extremely simple to parse, so that even | |||
| teletype-oriented mail systems can easily strip away the | teletype-oriented mail systems can easily strip away the formatting | |||
| formatting information and leave only the readable text. | information and leave only the readable text. | |||
| 2. The syntax must be extensible to allow for new formatting | 2. The syntax must be extensible to allow for new formatting commands | |||
| commands that are deemed essential for some application. | that are deemed essential for some application. | |||
| 3. If the character set in use is ASCII or an 8-bit ASCII | 3. If the character set in use is ASCII or an 8-bit ASCII superset, | |||
| superset, then the raw form of the data must be readable enough | then the raw form of the data must be readable enough to be largely | |||
| to be largely unobjectionable in the event that it is displayed | unobjectionable in the event that it is displayed on the screen of | |||
| on the screen of the user of a non-MIME-conformant mail reader. | the user of a non-MIME-conformant mail reader. | |||
| 4. The capabilities must be extremely limited, to ensure that it | 4. The capabilities must be extremely limited, to ensure that it can | |||
| can represent no more than is likely to be representable by the | represent no more than is likely to be representable by the user's | |||
| user's primary word processor. While this limits what can be | primary word processor. While this limits what can be sent, it | |||
| sent, it increases the likelihood that what is sent can be | increases the likelihood that what is sent can be properly | |||
| properly displayed. | displayed. | |||
| There are other text formatting standards which meet some of these | There are other text formatting standards which meet some of these | |||
| criteria. In particular, HTML and SGML have come into widespread use | criteria. In particular, HTML and SGML have come into widespread use on | |||
| on the Internet. However, there are two important reasons that this | the Internet. However, there are two important reasons that this | |||
| document further promotes the use of text/enriched in Internet mail | document further promotes the use of text/enriched in Internet mail over | |||
| over other such standards: | other such standards: | |||
| 1. Most MIME-aware Internet mail applications are already able to | 1. Most MIME-aware Internet mail applications are already able to | |||
| either properly format text/enriched mail or, at the very | either properly format text/enriched mail or, at the very least, | |||
| least, are able to strip out the formatting commands and | are able to strip out the formatting commands and display the | |||
| display the readable text. The same is not true for HTML or | readable text. The same is not true for HTML or SGML. | |||
| SGML. | ||||
| 2. The current RFC on HTML [RFC-1866] and Internet Drafts on SGML | 2. The current RFC on HTML [RFC-1866] and Internet Drafts on SGML have | |||
| have many features which are not necessary for Internet mail, | many features which are not necessary for Internet mail, and are | |||
| and are missing a few capabilities that text/enriched already | missing a few capabilities that text/enriched already has. | |||
| has. | ||||
| For these reasons, this document is promoting the use of | For these reasons, this document is promoting the use of text/enriched | |||
| text/enriched until other Internet standards come into more | until other Internet standards come into more widespread use. For those | |||
| widespread use. For those who will want to use HTML, Appendix B of | who will want to use HTML, Appendix B of this document contains a very | |||
| this document contains a very simple C program that converts | simple C program that converts text/enriched to HTML 2.0 described in | |||
| text/enriched to HTML 2.0 described in [RFC-1866]. | [RFC-1866]. | |||
| Syntax | Syntax | |||
| The syntax of "text/enriched" is very simple. It represents text in | The syntax of "text/enriched" is very simple. It represents text in a | |||
| a single character set--US-ASCII by default, although a different | single character set--US-ASCII by default, although a different | |||
| character set can be specified by the use of the "charset" | character set can be specified by the use of the "charset" parameter. | |||
| parameter. (The semantics of text/enriched in non-ASCII character | (The semantics of text/enriched in non-ASCII character sets are | |||
| sets are discussed later in this document.) All characters represent | discussed later in this document.) All characters represent themselves, | |||
| themselves, with the exception of the "<" character (ASCII 60), | with the exception of the "<" character (ASCII 60), which is used to | |||
| which is used to mark the beginning of a formatting command. A | mark the beginning of a formatting command. A literal less-than sign | |||
| literal less-than sign ("<") can be represented by a sequence of two | ("<") can be represented by a sequence of two such characters, "<<". | |||
| such characters, "<<". | ||||
| Formatting instructions consist of formatting commands surrounded by | Formatting instructions consist of formatting commands surrounded by | |||
| angle brackets ("<>", ASCII 60 and 62). Each formatting command may | angle brackets ("<>", ASCII 60 and 62). Each formatting command may be | |||
| be no more than 60 characters in length, all in US-ASCII, restricted | no more than 60 characters in length, all in US-ASCII, restricted to the | |||
| to the alphanumeric and hyphen ("-") characters. Formatting commands | alphanumeric and hyphen ("-") characters. Formatting commands may be | |||
| may be preceded by a solidus ("/", ASCII 47), making them negations, | preceded by a solidus ("/", ASCII 47), making them negations, and such | |||
| and such negations must always exist to balance the initial opening | negations must always exist to balance the initial opening commands. | |||
| commands. Thus, if the formatting command "<bold>" appears at some | Thus, if the formatting command "<bold>" appears at some point, there | |||
| point, there must later be a "</bold>" to balance it. (NOTE: The 60 | must later be a "</bold>" to balance it. (NOTE: The 60 character limit | |||
| character limit on formatting commands does NOT include the "<", | on formatting commands does NOT include the "<", ">", or "/" characters | |||
| ">", or "/" characters that might be attached to such commands.) | that might be attached to such commands.) | |||
| Line break rules | Line break rules | |||
| Line breaks (CRLF pairs in standard network representation) are | Line breaks (CRLF pairs in standard network representation) are handled | |||
| handled specially. In particular, isolated CRLF pairs are translated | specially. In particular, isolated CRLF pairs are translated into a | |||
| into a single SPACE character. Sequences of N consecutive CRLF | single SPACE character. Sequences of N consecutive CRLF pairs, however, | |||
| pairs, however, are translated into N-1 actual line breaks. This | are translated into N-1 actual line breaks. This permits long lines of | |||
| permits long lines of data to be represented in a natural looking | data to be represented in a natural looking manner despite the frequency | |||
| manner despite the frequency of line-wrapping in Internet mailers. | of line-wrapping in Internet mailers. When preparing the data for mail | |||
| When preparing the data for mail transport, isolated line breaks | transport, isolated line breaks should be inserted wherever necessary to | |||
| should be inserted wherever necessary to keep each line shorter than | keep each line shorter than 80 characters. When preparing such data for | |||
| 80 characters. When preparing such data for presentation to the | presentation to the user, isolated line breaks should be replaced by a | |||
| user, isolated line breaks should be replaced by a single SPACE | single SPACE character, and N consecutive CRLF pairs should be presented | |||
| character, and N consecutive CRLF pairs should be presented to the | to the user as N-1 line breaks. | |||
| user as N-1 line breaks. | ||||
| Thus text/enriched data that looks like this: | Thus text/enriched data that looks like this: | |||
| This is | This is | |||
| a single | a single | |||
| line | line | |||
| This is the | This is the | |||
| next line. | next line. | |||
| skipping to change at line 155 ¶ | skipping to change at line 151 ¶ | |||
| This is a single line | This is a single line | |||
| This is the next line. | This is the next line. | |||
| This is the next section. | This is the next section. | |||
| The formatting commands, not all of which will be implemented by all | The formatting commands, not all of which will be implemented by all | |||
| implementations, are described in the following sections. | implementations, are described in the following sections. | |||
| Formatting Commands | Formatting Commands | |||
| The text/enriched formatting commands all begin with <commandname> | The text/enriched formatting commands all begin with <commandname> and | |||
| and end with </commandname>, affecting the formatting of the text | end with </commandname>, affecting the formatting of the text between | |||
| between those two tokens. The commands are described here, grouped | those two tokens. The commands are described here, grouped according to | |||
| according to type. | type. | |||
| Parameter Command | Parameter Command | |||
| Some of the formatting commands may require one or more associated | Some of the formatting commands may require one or more associated | |||
| parameters. The "param" command is a special formatting command used | parameters. The "param" command is a special formatting command used to | |||
| to include these parameters. | include these parameters. | |||
| Param | Param | |||
| Marks the affected text as command parameters, to be | Marks the affected text as command parameters, to be | |||
| interpreted or ignored by the text/enriched | interpreted or ignored by the text/enriched interpreter, | |||
| interpreter, but not to be shown to the reader. The | but not to be shown to the reader. The "param" command | |||
| "param" command always immediately follows some other | always immediately follows some other formatting command, | |||
| formatting command, and the parameter data indicates | and the parameter data indicates some additional | |||
| some additional information about the formatting that | information about the formatting that is to be done. The | |||
| is to be done. The syntax of the parameter data | syntax of the parameter data (whatever appears between | |||
| (whatever appears between the initial "<param>" and | the initial "<param>" and the terminating "</param>") is | |||
| the terminating "</param>") is defined for each | defined for each command that uses it. However, it is | |||
| command that uses it. However, it is always required | always required that the format of such data must not | |||
| that the format of such data must not contain nested | contain nested "param" commands, and either must not use | |||
| "param" commands, and either must not use the "<" | the "<" character or must use it in a way that is | |||
| character or must use it in a way that is compatible | compatible with text/enriched parsing. That is, the end | |||
| with text/enriched parsing. That is, the end of the | of the parameter data should be recognizable with either | |||
| parameter data should be recognizable with either of | of two algorithms: simply searching for the first | |||
| two algorithms: simply searching for the first | ||||
| occurrence of "</param>" or parsing until a balanced | occurrence of "</param>" or parsing until a balanced | |||
| "</param>" command is found. In either case, however, | "</param>" command is found. In either case, however, the | |||
| the parameter data should not be shown to the human | parameter data should not be shown to the human reader. | |||
| reader. | ||||
| Font-Alteration Commands | Font-Alteration Commands | |||
| The following formatting commands are intended to alter the font in | The following formatting commands are intended to alter the font in | |||
| which text is displayed, but not to alter the indentation or | which text is displayed, but not to alter the indentation or | |||
| justification state of the text: | justification state of the text: | |||
| Bold | Bold | |||
| causes the affected text to be in a bold font. Nested | causes the affected text to be in a bold font. Nested | |||
| bold commands have the same effect as a single bold | bold commands have the same effect as a single bold | |||
| command. | command. | |||
| Italic | Italic | |||
| causes the affected text to be in an italic font. | causes the affected text to be in an italic font. Nested | |||
| Nested italic commands have the same effect as a | italic commands have the same effect as a single italic | |||
| single italic command. | command. | |||
| Underline | Underline | |||
| causes the affected text to be underlined. Nested | causes the affected text to be underlined. Nested | |||
| underline commands have the same effect as a single | underline commands have the same effect as a single | |||
| underline command. | underline command. | |||
| Fixed | Fixed | |||
| causes the affected text to be in a fixed width font. | causes the affected text to be in a fixed width font. | |||
| Nested fixed commands have the same effect as a | Nested fixed commands have the same effect as a single | |||
| single fixed command. | fixed command. | |||
| FontFamily | FontFamily | |||
| causes the affected text to be displayed in a | causes the affected text to be displayed in a specified | |||
| specified typeface. The "fontfamily" command requires | typeface. The "fontfamily" command requires a parameter | |||
| a parameter that is specified by using the "param" | that is specified by using the "param" command. The | |||
| command. The parameter data is a case-insensitive | parameter data is a case-insensitive string containing | |||
| string containing the name of a font family. Any | the name of a font family. Any currently available font | |||
| currently available font family name (e.g. Times, | family name (e.g. Times, Palatino, Courier, etc.) may be | |||
| Palatino, Courier, etc.) may be used. This includes | used. This includes font families defined by commercial | |||
| font families defined by commercial type foundries | type foundries such as Adobe, BitStream, or any other | |||
| such as Adobe, BitStream, or any other such foundry. | such foundry. Note that implementations should only use | |||
| Note that implementations should only use the general | the general font family name, not the specific font name | |||
| font family name, not the specific font name (e.g. | (e.g. use "Times", not "TimesRoman" nor | |||
| use "Times", not "TimesRoman" nor "TimesBoldItalic"). | "TimesBoldItalic"). When nested, the inner "fontfamily" | |||
| When nested, the inner "fontfamily" command takes | command takes precedence. Also note that the "fontfamily" | |||
| precedence. Also note that the "fontfamily" command | command is advisory only; it should not be expected that | |||
| is advisory only; it should not be expected that | other implementations will honor the typeface information | |||
| other implementations will honor the typeface | in this command since the font capabilities of systems | |||
| information in this command since the font | vary drastically. | |||
| capabilities of systems vary drastically. | ||||
| Color | Color | |||
| causes the affected text to be displayed in a | causes the affected text to be displayed in a specified | |||
| specified color. The "color" command requires a | color. The "color" command requires a parameter that is | |||
| parameter that is specified by using the "param" | specified by using the "param" command. The parameter | |||
| command. The parameter data can be one of the | data can be one of the following: | |||
| following: | ||||
| red | red | |||
| blue | blue | |||
| green | green | |||
| yellow | yellow | |||
| cyan | cyan | |||
| magenta | magenta | |||
| black | black | |||
| white | white | |||
| or an RGB color value in the form: | or an RGB color value in the form: | |||
| ####,####,#### | ####,####,#### | |||
| where '#' is a hexadecimal digit '0' through '9', 'A' | where '#' is a hexadecimal digit '0' through '9', 'A' | |||
| through 'F', or 'a' through 'f'. The three 4-digit | through 'F', or 'a' through 'f'. The three 4-digit | |||
| hexadecimal values are the RGB values for red, green, | hexadecimal values are the RGB values for red, green, and | |||
| and blue respectively, where each component is | blue respectively, where each component is expressed as | |||
| expressed as an unsigned value between 0 (0000) and | an unsigned value between 0 (0000) and 65535 (FFFF). The | |||
| 65535 (FFFF). The default color for the message is | default color for the message is unspecified, though | |||
| unspecified, though black is a common choice in many | black is a common choice in many environments. When | |||
| environments. When nested, the inner "color" command | nested, the inner "color" command takes precedence. | |||
| takes precedence. | ||||
| Smaller | Smaller | |||
| causes the affected text to be in a smaller font. It | causes the affected text to be in a smaller font. It is | |||
| is recommended that the font size be changed by two | recommended that the font size be changed by two points, | |||
| points, but other amounts may be more appropriate in | but other amounts may be more appropriate in some | |||
| some environments. Nested smaller commands produce | environments. Nested smaller commands produce ever | |||
| ever smaller fonts, to the limits of the | smaller fonts, to the limits of the implementation's | |||
| implementation's capacity to reasonably display them, | capacity to reasonably display them, after which further | |||
| after which further smaller commands have no | smaller commands have no incremental effect. | |||
| incremental effect. | ||||
| Bigger | Bigger | |||
| causes the affected text to be in a bigger font. It | causes the affected text to be in a bigger font. It is | |||
| is recommended that the font size be changed by two | recommended that the font size be changed by two points, | |||
| points, but other amounts may be more appropriate in | but other amounts may be more appropriate in some | |||
| some environments. Nested bigger commands produce | environments. Nested bigger commands produce ever bigger | |||
| ever bigger fonts, to the limits of the | fonts, to the limits of the implementation's capacity to | |||
| implementation's capacity to reasonably display them, | reasonably display them, after which further bigger | |||
| after which further bigger commands have no | commands have no incremental effect. | |||
| incremental effect. | ||||
| While the "bigger" and "smaller" operators are effectively inverses, | While the "bigger" and "smaller" operators are effectively inverses, it | |||
| it is not recommended, for example, that "<smaller>" be used to end | is not recommended, for example, that "<smaller>" be used to end the | |||
| the effect of "<bigger>". This is properly done with "</bigger>". | effect of "<bigger>". This is properly done with "</bigger>". | |||
| Since the capabilities of implementations will vary, it is to be | Since the capabilities of implementations will vary, it is to be | |||
| expected that some implementations will not be able to act on some | expected that some implementations will not be able to act on some of | |||
| of the font-alteration commands. However, an implementation should | the font-alteration commands. However, an implementation should still | |||
| still display the text to the user in a reasonable fashion. In | display the text to the user in a reasonable fashion. In particular, the | |||
| particular, the lack of capability to display a particular font | lack of capability to display a particular font family, color, or other | |||
| family, color, or other text attribute does not mean that an | text attribute does not mean that an implementation should fail to | |||
| implementation should fail to display text. | display text. | |||
| Fill/Justification/Indentation Commands | Fill/Justification/Indentation Commands | |||
| Initially, text/enriched text is intended to be displayed fully | Initially, text/enriched text is intended to be displayed fully filled | |||
| filled (that is, using the rules specified for replacing CRLF pairs | (that is, using the rules specified for replacing CRLF pairs with spaces | |||
| with spaces or removing them as appropriate) with appropriate | or removing them as appropriate) with appropriate kerning and | |||
| kerning and letter-tracking, and using the maximum available margins | letter-tracking, and using the maximum available margins as suits the | |||
| as suits the capabilities of the receiving user agent software. | capabilities of the receiving user agent software. | |||
| The following commands alter that state. Each of these commands | The following commands alter that state. Each of these commands forces a | |||
| forces a line break before and after the formatting environment if | line break before and after the formatting environment if there is not | |||
| there is not otherwise a line break. For example, if one of these | otherwise a line break. For example, if one of these commands occurs | |||
| commands occurs anywhere other than the beginning of a line of text | anywhere other than the beginning of a line of text as presented, a new | |||
| as presented, a new line is begun. | line is begun. | |||
| Center | Center | |||
| causes the affected text to be centered. | causes the affected text to be centered. | |||
| FlushLeft | FlushLeft | |||
| causes the affected text to be left-justified with a | causes the affected text to be left-justified with a | |||
| ragged right margin. | ragged right margin. | |||
| FlushRight | FlushRight | |||
| causes the affected text to be right-justified with a | causes the affected text to be right-justified with a | |||
| ragged left margin. | ragged left margin. | |||
| FlushBoth | FlushBoth | |||
| causes the affected text to be filled and padded so | causes the affected text to be filled and padded so as to | |||
| as to create smooth left and right margins, i.e., to | create smooth left and right margins, i.e., to be fully | |||
| be fully justified. | justified. | |||
| ParaIndent | ParaIndent | |||
| causes the running margins of the affected text to be | causes the running margins of the affected text to be | |||
| moved in. The recommended indentation change is the | moved in. The recommended indentation change is the width | |||
| width of four characters, but this may differ among | of four characters, but this may differ among | |||
| implementations. The "paraindent" command requires a | implementations. The "paraindent" command requires a | |||
| parameter that is specified by using the "param" | parameter that is specified by using the "param" command. | |||
| command. The parameter data is a comma-separated list | The parameter data is a comma-separated list of one or | |||
| of one or more of the following: | more of the following: | |||
| Left | Left | |||
| causes the running left margin to be moved to | causes the running left margin to be moved to the | |||
| the right. | right. | |||
| Right | Right | |||
| causes the running right margin to be moved to | causes the running right margin to be moved to the | |||
| the left. | left. | |||
| In | In | |||
| causes the first line of the affected paragraph | causes the first line of the affected paragraph to | |||
| to be indented in addition to the running | be indented in addition to the running margin. The | |||
| margin. The remaining lines remain flush to the | remaining lines remain flush to the running margin. | |||
| running margin. | ||||
| Out | Out | |||
| causes all lines except for the first line of | causes all lines except for the first line of the | |||
| the affected paragraph to be indented in | affected paragraph to be indented in addition to the | |||
| addition to the running margin. The first line | running margin. The first line remains flush to the | |||
| remains flush to the running margin. | running margin. | |||
| Nofill | Nofill | |||
| causes the affected text to be displayed without | causes the affected text to be displayed without filling. | |||
| filling. That is, the text is displayed without using | That is, the text is displayed without using the rules | |||
| the rules for replacing CRLF pairs with spaces or | for replacing CRLF pairs with spaces or removing | |||
| removing consecutive sequences of CRLF pairs. | consecutive sequences of CRLF pairs. However, the current | |||
| However, the current state of the margins and | state of the margins and justification is honored; any | |||
| justification is honored; any indentation or | indentation or justification commands are still applied | |||
| justification commands are still applied to the text | to the text within the scope of the "nofill". | |||
| within the scope of the "nofill". | ||||
| The "center", "flushleft", "flushright", and "flushboth" commands | The "center", "flushleft", "flushright", and "flushboth" commands are | |||
| are mutually exclusive, and, when nested, the inner command takes | mutually exclusive, and, when nested, the inner command takes | |||
| precedence. | precedence. | |||
| The "nofill" command is mutually exclusive with the "in" and "out" | The "nofill" command is mutually exclusive with the "in" and "out" | |||
| parameters of the "paraindent" command; when they occur in the same | parameters of the "paraindent" command; when they occur in the same | |||
| scope, their behavior is undefined. | scope, their behavior is undefined. | |||
| The parameter data for the "paraindent" command may contain multiple | The parameter data for the "paraindent" command may contain multiple | |||
| occurrences of the same parameter (i.e. "left", "right", "in", or | occurrences of the same parameter (i.e. "left", "right", "in", or "out"). | |||
| "out"). Each occurrence causes the text to be further indented in the | Each occurrence causes the text to be further indented in the manner | |||
| manner indicated by that parameter. Nested "paraindent" commands | indicated by that parameter. Nested "paraindent" commands cause the | |||
| cause the affected text to be further indented according to the | affected text to be further indented according to the parameters. Note | |||
| parameters. Note that the "in" and "out" parameters for "paraindent" | that the "in" and "out" parameters for "paraindent" are mutually | |||
| are mutually exclusive; when they appear together or when nested | exclusive; when they appear together or when nested "paraindent" | |||
| "paraindent" commands contain both of them, their behavior is | commands contain both of them, their behavior is undefined. | |||
| undefined. | ||||
| For purposes of the "in" and "out" parameters, a paragraph is | For purposes of the "in" and "out" parameters, a paragraph is defined as | |||
| defined as text that is delimited by line breaks after applying the | text that is delimited by line breaks after applying the rules for | |||
| rules for replacing CRLF pairs with spaces or removing consecutive | replacing CRLF pairs with spaces or removing consecutive sequences of | |||
| sequences of CRLF pairs. For example, within the scope of an "out", | CRLF pairs. For example, within the scope of an "out", the line | |||
| the line following each CRLF is made flush with the running margin, | following each CRLF is made flush with the running margin, and | |||
| and subsequent lines are indented. Within the scope of an "in", the | subsequent lines are indented. Within the scope of an "in", the first | |||
| first line following each CRLF is indented, and subsequent lines | line following each CRLF is indented, and subsequent lines remain flush | |||
| remain flush to the running margin. | to the running margin. | |||
| Whether or not text is justified by default (that is, whether the | Whether or not text is justified by default (that is, whether the | |||
| default environment is "flushleft", "flushright", or "flushboth") is | default environment is "flushleft", "flushright", or "flushboth") is | |||
| unspecified, and depends on the preferences of the user, the | unspecified, and depends on the preferences of the user, the | |||
| capabilities of the local software and hardware, and the nature of | capabilities of the local software and hardware, and the nature of the | |||
| the character set in use. On systems where full justification is | character set in use. On systems where full justification is considered | |||
| considered undesirable, the "flushboth" environment may be identical | undesirable, the "flushboth" environment may be identical to the default | |||
| to the default environment. Note that full justification should | environment. Note that full justification should never be performed | |||
| never be performed inside of "center", "flushleft", "flushright", or | inside of "center", "flushleft", "flushright", or "nofill" environments. | |||
| "nofill" environments. Note also that for some non-ASCII character | Note also that for some non-ASCII character sets, full justification may | |||
| sets, full justification may be fundamentally inappropriate. | be fundamentally inappropriate. | |||
| Note that [RFC-1563] defined two additional indentation commands, | Note that [RFC-1563] defined two additional indentation commands, | |||
| "Indent" and "IndentRight". These commands did not force a line | "Indent" and "IndentRight". These commands did not force a line break, | |||
| break, and therefore their behavior was unpredictable since they | and therefore their behavior was unpredictable since they depended on | |||
| depended on the margins and character sizes that a particular | the margins and character sizes that a particular implementation used. | |||
| implementation used. Therefore, their use is deprecated and they | Therefore, their use is deprecated and they should be ignored, just as | |||
| should be ignored, just as other unrecognized commands are. | other unrecognized commands are. | |||
| Markup Commands | Markup Commands | |||
| Commands in this section, unlike the other text/enriched commands, | Commands in this section, unlike the other text/enriched commands, are | |||
| are declarative markup commands. Text/enriched is not intended as a | declarative markup commands. Text/enriched is not intended as a full | |||
| full markup language, but instead as a simple way to represent | markup language, but instead as a simple way to represent common | |||
| common formatting commands. Therefore, markup commands are purposely | formatting commands. Therefore, markup commands are purposely kept to a | |||
| kept to a minimum. It is only because each was deemed so prevalent | minimum. It is only because each was deemed so prevalent or necessary in | |||
| or necessary in an e-mail environment that these particular commands | an e-mail environment that these particular commands have been included | |||
| have been included at all. | at all. | |||
| Excerpt | Excerpt | |||
| causes the affected text to be interpreted as a | causes the affected text to be interpreted as a textual | |||
| textual excerpt from another source, probably a | excerpt from another source, probably a message being | |||
| message being responded to. Typically this will be | responded to. Typically this will be displayed using | |||
| displayed using indentation and an alternate font, or | indentation and an alternate font, or by indenting lines | |||
| by indenting lines and preceding them with "> ", but | and preceding them with "> ", but such decisions are up | |||
| such decisions are up to the implementation. Note | to the implementation. Note that as with the | |||
| that as with the justification commands, the excerpt | justification commands, the excerpt command implicitly | |||
| command implicitly begins and ends with a line break | begins and ends with a line break if one is not already | |||
| if one is not already there. Nested "excerpt" | there. Nested "excerpt" commands are acceptable and | |||
| commands are acceptable and should be interpreted as | should be interpreted as meaning that the excerpted text | |||
| meaning that the excerpted text was excerpted from | was excerpted from yet another source. Again, this can be | |||
| yet another source. Again, this can be displayed | displayed using additional indentation, different colors, | |||
| using additional indentation, different colors, etc. | etc. | |||
| Optionally, the "excerpt" command can take a | Optionally, the "excerpt" command can take a parameter by | |||
| parameter by using the "param" command. The format of | using the "param" command. The format of the data is | |||
| the data is unspecified, but it is intended to | unspecified, but it is intended to uniquely identify the | |||
| uniquely identify the text from which the excerpt is | text from which the excerpt is taken. With this | |||
| taken. With this information, an implementation | information, an implementation should be able to uniquely | |||
| should be able to uniquely identify the source of any | identify the source of any particular excerpt, especially | |||
| particular excerpt, especially if two or more | if two or more excerpts in the message are from the same | |||
| excerpts in the message are from the same source, and | source, and display it in some way that makes this | |||
| display it in some way that makes this apparent to | apparent to the user. | |||
| the user. | ||||
| Lang | Lang | |||
| causes the affected text to be interpreted as | causes the affected text to be interpreted as belonging | |||
| belonging to a particular language. This is most | to a particular language. This is most useful when two | |||
| useful when two different languages use the same | different languages use the same character set, but may | |||
| character set, but may require a different font or | require a different font or formatting depending on the | |||
| formatting depending on the language. For instance, | language. For instance, Chinese and Japanese share | |||
| Chinese and Japanese share similar character glyphs, | similar character glyphs, and in some character sets like | |||
| and in some character sets like UNICODE share common | UNICODE share common code points, but it is considered | |||
| code points, but it is considered very important that | very important that different fonts be used for the two | |||
| different fonts be used for the two languages, | languages, especially if they appear together, so that | |||
| especially if they appear together, so that meaning | meaning is not lost. Also, language information can be | |||
| is not lost. Also, language information can be used | used to allow for fancier text handling, like spell | |||
| to allow for fancier text handling, like spell | ||||
| checking or hyphenation. | checking or hyphenation. | |||
| The "lang" command requires a parameter using the | The "lang" command requires a parameter using the "param" | |||
| "param" command. The parameter data can be any of the | command. The parameter data can be any of the language | |||
| language tags specified in [RFC-1766], "Tags for the | tags specified in [RFC-1766], "Tags for the | |||
| Identification of Languages". These tags are the two | Identification of Languages". These tags are the two | |||
| letter language codes taken from [ISO-639] or can be | letter language codes taken from [ISO-639] or can be | |||
| other language codes that are registered according to | other language codes that are registered according to the | |||
| the instructions in the Language Tags RFC. Consult | instructions in the Language Tags RFC. Consult that memo | |||
| that memo for further information. | for further information. | |||
| Balancing and Nesting of Formatting Commands | Balancing and Nesting of Formatting Commands | |||
| Pairs of formatting commands must be properly balanced and nested. | Pairs of formatting commands must be properly balanced and nested. Thus, | |||
| Thus, a proper way to describe text in bold italics is: | a proper way to describe text in bold italics is: | |||
| <bold><italic>the-text</italic></bold> | <bold><italic>the-text</italic></bold> | |||
| or, alternately, | or, alternately, | |||
| <italic><bold>the-text</bold></italic> | <italic><bold>the-text</bold></italic> | |||
| but, in particular, the following is illegal text/enriched: | but, in particular, the following is illegal text/enriched: | |||
| <bold><italic>the-text</bold></italic> | <bold><italic>the-text</bold></italic> | |||
| The nesting requirement for formatting commands imposes a slightly | The nesting requirement for formatting commands imposes a slightly | |||
| higher burden upon the composers of text/enriched bodies, but | higher burden upon the composers of text/enriched bodies, but | |||
| potentially simplifies text/enriched displayers by allowing them to | potentially simplifies text/enriched displayers by allowing them to be | |||
| be stack-based. The main goal of text/enriched is to be simple | stack-based. The main goal of text/enriched is to be simple enough to | |||
| enough to make multifont, formatted email widely readable, so that | make multifont, formatted email widely readable, so that those with the | |||
| those with the capability of sending it will be able to do so with | capability of sending it will be able to do so with confidence. Thus | |||
| confidence. Thus slightly increased complexity in the composing | slightly increased complexity in the composing software was deemed a | |||
| software was deemed a reasonable tradeoff for simplified reading | reasonable tradeoff for simplified reading software. Nonetheless, | |||
| software. Nonetheless, implementors of text/enriched readers are | implementors of text/enriched readers are encouraged to follow the | |||
| encouraged to follow the general Internet guidelines of being | general Internet guidelines of being conservative in what you send and | |||
| conservative in what you send and liberal in what you accept. Those | liberal in what you accept. Those implementations that can do so are | |||
| implementations that can do so are encouraged to deal reasonably | encouraged to deal reasonably with improperly nested text/enriched data. | |||
| with improperly nested text/enriched data. | ||||
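
The stack-based approach mentioned above can be sketched as follows. This is a minimal illustration, not part of either draft: the function name `is_properly_nested`, the character set accepted in command names, and the case-insensitive comparison of names are assumptions made for the example.

```python
import re

# "<<" is a quoted literal "<"; otherwise match <name> or </name>.
# The set of characters allowed in a command name is an assumption here.
TOKEN = re.compile(r"<<|<(/?)([A-Za-z0-9-]+)>")


def is_properly_nested(text):
    """Return True if every command is closed in reverse order of opening."""
    stack = []
    for match in TOKEN.finditer(text):
        if match.group(0) == "<<":
            continue  # literal "<", not a formatting command
        closing, name = match.group(1), match.group(2).lower()
        if not closing:
            stack.append(name)          # opening command: push
        elif not stack or stack.pop() != name:
            return False                # closing command does not match the top
    return not stack                    # anything left open is unbalanced
```

A liberal reader could use the same stack to recover from bad input, for example by treating a mismatched closing command as closing every command opened after its partner, rather than rejecting the message.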
| Unrecognized formatting commands | Unrecognized formatting commands | |||
| Implementations must regard any unrecognized formatting command as a | Implementations must regard any unrecognized formatting command as a | |||
| "no-op" command, that is, as a command having no effect, thus | "no-op" command, that is, as a command having no effect, thus | |||
| facilitating future extensions to "text/enriched". Private | facilitating future extensions to "text/enriched". Private extensions | |||
| extensions may be defined using formatting commands that begin with | may be defined using formatting commands that begin with "X-", by | |||
| "X-", by analogy to Internet mail header field names. | analogy to Internet mail header field names. | |||
| In order to formally define extended commands, a new Internet | In order to formally define extended commands, a new Internet document | |||
| document should be published. | should be published. | |||
| White Space in Text/enriched Data | White Space in Text/enriched Data | |||
| No special behavior is required for the SPACE or TAB (HT) character. | No special behavior is required for the SPACE or TAB (HT) character. It | |||
| It is recommended, however, that, at least when fixed-width fonts | is recommended, however, that, at least when fixed-width fonts are in | |||
| are in use, the common semantics of the TAB (HT) character should be | use, the common semantics of the TAB (HT) character should be observed, | |||
| observed, namely that it moves to the next column position that is a | namely that it moves to the next column position that is a multiple of | |||
| multiple of 8. (In other words, if a TAB (HT) occurs in column n, | 8. (In other words, if a TAB (HT) occurs in column n, where the leftmost | |||
| where the leftmost column is column 0, then that TAB (HT) should be | column is column 0, then that TAB (HT) should be replaced by 8-(n mod 8) | |||
| replaced by 8-(n mod 8) SPACE characters.) It should also be noted | SPACE characters.) It should also be noted that some mail gateways are | |||
| that some mail gateways are notorious for losing (or, less commonly, | notorious for losing (or, less commonly, adding) white space at the end | |||
| adding) white space at the end of lines, so reliance on SPACE or TAB | of lines, so reliance on SPACE or TAB characters at the end of a line is | |||
| characters at the end of a line is not recommended. | not recommended. | |||
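
The tab rule above, replacing a TAB in column n with 8-(n mod 8) SPACE characters, can be illustrated with a short sketch. The function name `expand_tabs` and its `tab_width` parameter are hypothetical, and the sketch assumes a fixed-width font and a single line of input with no line breaks.

```python
def expand_tabs(line, tab_width=8):
    """Replace each TAB with spaces up to the next multiple of tab_width."""
    out = []
    column = 0  # leftmost column is column 0
    for ch in line:
        if ch == "\t":
            spaces = tab_width - (column % tab_width)  # 8-(n mod 8)
            out.append(" " * spaces)
            column += spaces
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

For a single line this gives the same result as Python's built-in `str.expandtabs(8)`; the explicit loop is shown only to make the column arithmetic visible.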
| Initial State of a text/enriched interpreter | Initial State of a text/enriched interpreter | |||
| Text/enriched is assumed to begin with filled text in a | Text/enriched is assumed to begin with filled text in a variable-width | |||
| variable-width font in a normal typeface and a size that is average | font in a normal typeface and a size that is average for the current | |||
| for the current display and user. The left and right margins are | display and user. The left and right margins are assumed to be maximal, | |||
| assumed to be maximal, that is, at the leftmost and rightmost | that is, at the leftmost and rightmost acceptable positions. | |||
| acceptable positions. | ||||
| Non-ASCII character sets | Non-ASCII character sets | |||
| One of the great benefits of MIME is the ability to use different | One of the great benefits of MIME is the ability to use different | |||
| varieties of non-ASCII text in messages. To use non-ASCII text in a | varieties of non-ASCII text in messages. To use non-ASCII text in a | |||
| message, normally a charset parameter is specified in the | message, normally a charset parameter is specified in the Content-type | |||
| Content-type line that indicates the character set being used. For | line that indicates the character set being used. For purposes of this | |||
| purposes of this RFC, any legal MIME charset parameter can be used | RFC, any legal MIME charset parameter can be used with the text/enriched | |||
| with the text/enriched Content-type. However, there are two | Content-type. However, there are two difficulties that arise with regard | |||
| difficulties that arise with regard to the text/enriched | to the text/enriched Content-type when non-ASCII text is desired. The | |||
| Content-type when non-ASCII text is desired. The first problem | first problem involves difficulties that occur when the user wishes to | |||
| involves difficulties that occur when the user wishes to create text | create text which would normally require multiple non-ASCII character | |||
| which would normally require multiple non-ASCII character sets in | sets in the same text/enriched message. The second problem is an | |||
| the same text/enriched message. The second problem is an ambiguity | ambiguity that arises because of the text/enriched use of the "<" | |||
| that arises because of the text/enriched use of the "<" character in | character in formatting commands. | |||
| formatting commands. | ||||
| Using multiple non-ASCII character sets | Using multiple non-ASCII character sets | |||
| Normally, if a user wishes to produce text which contains characters | Normally, if a user wishes to produce text which contains characters | |||
| from entirely different character sets within the same MIME message | from entirely different character sets within the same MIME message (for | |||
| (for example, using Russian Cyrillic characters from ISO 8859-5 and | example, using Russian Cyrillic characters from ISO 8859-5 and Hebrew | |||
| Hebrew characters from ISO 8859-8), a multipart message is used. | characters from ISO 8859-8), a multipart message is used. Every time a | |||
| Every time a new character set is desired, a new MIME body part is | new character set is desired, a new MIME body part is started with | |||
| started with different character sets specified in the charset | different character sets specified in the charset parameter of the | |||
| parameter of the Content-type line. However, using multiple | Content-type line. However, using multiple character sets this way in | |||
| character sets this way in text/enriched messages introduces | text/enriched messages introduces problems. Since a change in the | |||
| problems. Since a change in the charset parameter requires a new | charset parameter requires a new part, text/enriched formatting commands | |||
| part, text/enriched formatting commands used in the first part would | used in the first part would not be able to apply to text that occurs in | |||
| not be able to apply to text that occurs in subsequent parts. It is | subsequent parts. It is not possible for text/enriched formatting | |||
| not possible for text/enriched formatting commands to apply across | commands to apply across MIME body part boundaries. | |||
| MIME body part boundaries. | ||||
| [RFC-1341] attempted to get around this problem in the now obsolete | [RFC-1341] attempted to get around this problem in the now obsolete | |||
| text/richtext format by introducing different character set | text/richtext format by introducing different character set formatting | |||
| formatting commands like "iso-8859-5" and "us-ascii". But this, or | commands like "iso-8859-5" and "us-ascii". But this, or even a more | |||
| even a more general solution along the same lines, is still | general solution along the same lines, is still undesirable: It is | |||
| undesirable: It is common for a MIME application to decide, for | common for a MIME application to decide, for example, what character | |||
| example, what character font resources or character lookup tables it | font resources or character lookup tables it will require based on the | |||
| will require based on the information provided by the charset | information provided by the charset parameter of the Content-type line, | |||
| parameter of the Content-type line, before it even begins to | before it even begins to interpret or display the data in that body | |||
| interpret or display the data in that body part. By allowing the | part. By allowing the text/enriched interpreter to subsequently change | |||
| text/enriched interpreter to subsequently change the character set, | the character set, perhaps to one completely different from the charset | |||
| perhaps to one completely different from the charset specified in | specified in the Content-type line (with potentially much different | |||
| the Content-type line (with potentially much different resource | resource requirements), too much burden would be placed on the | |||
| requirements), too much burden would be placed on the text/enriched | text/enriched interpreter itself. | |||
| interpreter itself. | ||||
| Therefore, if multiple types of non-ASCII characters are desired in | Therefore, if multiple types of non-ASCII characters are desired in a | |||
| a text/enriched document, one of the following two methods must be | text/enriched document, one of the following two methods must be used: | |||
| used: | ||||
| 1. For cases where the different types of non-ASCII text can be | 1. For cases where the different types of non-ASCII text can be | |||
| limited to their own paragraphs with distinct formatting, a | limited to their own paragraphs with distinct formatting, a | |||
| multipart message can be used with each part having a | multipart message can be used with each part having a Content-Type | |||
| Content-Type of text/enriched and a different charset | of text/enriched and a different charset parameter. The one caveat | |||
| parameter. The one caveat to using this method is that each new | to using this method is that each new part must start in the | |||
| part must start in the initial state for a text/enriched | initial state for a text/enriched document. That means that all of | |||
| document. That means that all of the text/enriched commands in | the text/enriched commands in the preceding part must be properly | |||
| the preceding part must be properly balanced with ending | balanced with ending commands before the next text/enriched part | |||
| commands before the next text/enriched part begins. Also, each | begins. Also, each text/enriched part must begin a new paragraph. | |||
| text/enriched part must begin a new paragraph. | ||||
| 2. If different types of non-ASCII text are to appear in the same | 2. If different types of non-ASCII text are to appear in the same line | |||
| line or paragraph, or if text/enriched formatting (e.g. | or paragraph, or if text/enriched formatting (e.g. margins, | |||
| margins, typeface, justification) is required across several | typeface, justification) is required across several different types | |||
| different types of non-ASCII text, a single text/enriched body | of non-ASCII text, a single text/enriched body part should be used | |||
| part should be used with a character set specified that | with a character set specified that contains all of the required | |||
| contains all of the required characters. For example, a charset | characters. For example, a charset parameter of "UNICODE-1-1-UTF-7" | |||
| parameter of "UNICODE-1-1-UTF-7" as specified in [RFC-1642] | as specified in [RFC-1642] could be used for such purposes. Not | |||
| could be used for such purposes. Not only does UNICODE contain | only does UNICODE contain all of the characters that can be | |||
| all of the characters that can be represented in all of the | represented in all of the other registered ISO 8859 MIME character | |||
| other registered ISO 8859 MIME character sets, but UTF-7 is | sets, but UTF-7 is fully compatible with other aspects of the | |||
| fully compatible with other aspects of the text/enriched | text/enriched standard, including the use of the "<" character | |||
| standard, including the use of the "<" character referred to | referred to below. Any other character sets that are specified for | |||
| below. Any other character sets that are specified for use in | use in MIME which contain different types of non-ASCII text can | |||
| MIME which contain different types of non-ASCII text can also | also be used in these instances. | |||
| be used in these instances. | ||||
| Use of the "<" character in formatting commands | Use of the "<" character in formatting commands | |||
| If the character set specified by the charset parameter on the | If the character set specified by the charset parameter on the | |||
| Content-type line is anything other than "US-ASCII", this means | Content-type line is anything other than "US-ASCII", this means that | |||
| that the text being described by text/enriched formatting commands | the text being described by text/enriched formatting commands is in a | |||
| is in a non-ASCII character set. However, the commands themselves | non-ASCII character set. However, the commands themselves are still the | |||
| are still the same ASCII commands that are defined in this document. | same ASCII commands that are defined in this document. This creates an | |||
| This creates an ambiguity only with reference to the "<" character, | ambiguity only with reference to the "<" character, the octet with | |||
| the octet with numeric value 60. In single byte character sets, such | numeric value 60. In single byte character sets, such as the ISO-8859 | |||
| as the ISO-8859 family, this is not a problem; the octet 60 can be | family, this is not a problem; the octet 60 can be quoted by including | |||
| quoted by including it twice, just as for ASCII. The problem is more | it twice, just as for ASCII. The problem is more complicated, however, | |||
| complicated, however, in the case of multi-byte character sets, | in the case of multi-byte character sets, where the octet 60 might | |||
| where the octet 60 might appear at any point in the byte sequence | appear at any point in the byte sequence for any of several characters. | |||
| for any of several characters. | ||||
| In practice, however, most multi-byte character sets address this | In practice, however, most multi-byte character sets address this | |||
| problem internally. For example, the UNICODE character sets can use | problem internally. For example, the UNICODE character sets can use the | |||
| the UTF-7 encoding which preserves all of the important ASCII | UTF-7 encoding which preserves all of the important ASCII characters in | |||
| characters in their single byte form. The ISO-2022 family of | their single byte form. The ISO-2022 family of character sets can use | |||
| character sets can use certain character sequences to switch back | certain character sequences to switch back into ASCII at any moment. | |||
| into ASCII at any moment. Therefore it is specified that, before | Therefore it is specified that, before text/enriched formatting | |||
| text/enriched formatting commands, the prevailing character set | commands, the prevailing character set should be "switched back" into | |||
| should be "switched back" into ASCII, and that only those characters | ASCII, and that only those characters which would be interpreted as "<" | |||
| which would be interpreted as "<" in plain text should be | in plain text should be interpreted as token delimiters in | |||
| interpreted as token delimiters in text/enriched. | text/enriched. | |||
| The question of what to do for hypothetical future character sets | The question of what to do for hypothetical future character sets that | |||
| that do not subsume ASCII is not addressed in this memo. | do not subsume ASCII is not addressed in this memo. | |||
| Minimal text/enriched conformance | Minimal text/enriched conformance | |||
| A minimal text/enriched implementation is one that converts "<<" to | A minimal text/enriched implementation is one that converts "<<" to "<", | |||
| "<", removes everything between a <param> command and the next | removes everything between a <param> command and the next balancing | |||
| balancing </param> command, removes all other formatting commands | </param> command, removes all other formatting commands (all text | |||
| (all text enclosed in angle brackets), and, outside of <nofill> | enclosed in angle brackets), and, outside of <nofill> environments, | |||
| environments, converts any series of n CRLFs to n-1 CRLFs, and | converts any series of n CRLFs to n-1 CRLFs, and converts any lone CRLF | |||
| converts any lone CRLF pairs to SPACE. | pairs to SPACE. | |||
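
The minimal conformance rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the program from Appendix A: the name `enriched_to_plain` is hypothetical, only strings that look like command names between angle brackets are treated as commands, command names are compared case-insensitively, and a stray "<" that does not begin a command is passed through unchanged.

```python
import re

# "<<" (quoted "<"), a formatting command, or a run of one or more CRLFs.
TOKEN = re.compile(r"<<|</?[A-Za-z0-9-]+>|(?:\r\n)+")


def enriched_to_plain(text):
    """Strip text/enriched markup following the minimal conformance rules."""
    out = []
    nofill = 0  # nesting depth of <nofill>
    param = 0   # nesting depth of <param>; content inside is discarded
    pos = 0
    for m in TOKEN.finditer(text):
        if not param:
            out.append(text[pos:m.start()])  # plain text before this token
        pos = m.end()
        tok = m.group(0)
        if tok == "<<":
            if not param:
                out.append("<")
        elif tok.startswith("<"):
            closing = tok.startswith("</")
            name = tok.strip("</>").lower()
            if name == "param":
                param = max(param - 1, 0) if closing else param + 1
            elif name == "nofill" and not param:
                nofill = max(nofill - 1, 0) if closing else nofill + 1
            # every other command is simply dropped
        elif not param:
            n = len(tok) // 2  # number of CRLF pairs in this run
            if nofill:
                out.append(tok)  # line breaks are kept as-is inside <nofill>
            else:
                out.append(" " if n == 1 else "\r\n" * (n - 1))
    if not param:
        out.append(text[pos:])
    return "".join(out)
```

For instance, `enriched_to_plain("<bold>Hi</bold>\r\nthere")` yields `"Hi there"`, since the commands are removed and the lone CRLF becomes a SPACE.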
| Notes for Implementors | Notes for Implementors | |||
| It is recognized that implementors of future mail systems will want | It is recognized that implementors of future mail systems will want rich | |||
| rich text functionality far beyond that currently defined for | text functionality far beyond that currently defined for text/enriched. | |||
| text/enriched. The intent of text/enriched is to provide a common | The intent of text/enriched is to provide a common format for expressing | |||
| format for expressing that functionality in a form in which much of | that functionality in a form in which much of it, at least, will be | |||
| it, at least, will be understood by interoperating software. Thus, | understood by interoperating software. Thus, in particular, software | |||
| in particular, software with a richer notion of formatted text than | with a richer notion of formatted text than text/enriched can still use | |||
| text/enriched can still use text/enriched as its basic | text/enriched as its basic representation, but can extend it with new | |||
| representation, but can extend it with new formatting commands and | formatting commands and by hiding information specific to that software | |||
| by hiding information specific to that software system in | system in text/enriched <param> constructs. As such systems evolve, it | |||
| text/enriched <param> constructs. As such systems evolve, it is | is expected that the definition of text/enriched will be further refined | |||
| expected that the definition of text/enriched will be further | by future published specifications, but text/enriched as defined here | |||
| refined by future published specifications, but text/enriched as | provides a platform on which evolutionary refinements can be based. | |||
| defined here provides a platform on which evolutionary refinements | ||||
| can be based. | ||||
| An expected common way that sophisticated mail programs will | An expected common way that sophisticated mail programs will generate | |||
| generate text/enriched data is as part of a multipart/alternative | text/enriched data is as part of a multipart/alternative construct. For | |||
| construct. For example, a mail agent that can generate enriched mail | example, a mail agent that can generate enriched mail in ODA format can | |||
| in ODA format can generate that mail in a more widely interoperable | generate that mail in a more widely interoperable form by generating | |||
| form by generating both text/enriched and ODA versions of the same | both text/enriched and ODA versions of the same data, e.g.: | |||
| data, e.g.: | ||||
| Content-type: multipart/alternative; boundary=foo | Content-type: multipart/alternative; boundary=foo | |||
| --foo | --foo | |||
| Content-type: text/enriched | Content-type: text/enriched | |||
| [text/enriched version of data] | [text/enriched version of data] | |||
| --foo | --foo | |||
| Content-type: application/oda | Content-type: application/oda | |||
| [ODA version of data] | [ODA version of data] | |||
| --foo-- | --foo-- | |||
| If such a message is read using a MIME-conformant mail reader that | If such a message is read using a MIME-conformant mail reader that | |||
| understands ODA, the ODA version will be displayed; otherwise, the | understands ODA, the ODA version will be displayed; otherwise, the | |||
| text/enriched version will be shown. | text/enriched version will be shown. | |||
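
A message of the shape shown above could be built and read with Python's standard `email` package, as in the following sketch. The helper names `build_alternative` and `pick_part`, and the placeholder ODA bytes in the usage note, are hypothetical; the selection rule (prefer the last alternative the reader supports) is the usual multipart/alternative convention and is assumed here rather than taken from this memo.

```python
from email.message import EmailMessage


def build_alternative(enriched_text, oda_bytes):
    """Build a multipart/alternative message: text/enriched first, then ODA."""
    msg = EmailMessage()
    # The simpler, more widely readable form goes first.
    msg.set_content(enriched_text, subtype="enriched")
    # Adding an alternative converts the message to multipart/alternative.
    msg.add_alternative(oda_bytes, maintype="application", subtype="oda")
    return msg


def pick_part(msg, supported_types):
    """Return the last part whose content type the reader supports, or None."""
    chosen = None
    for part in msg.iter_parts():
        if part.get_content_type() in supported_types:
            chosen = part
    return chosen
```

With `msg = build_alternative("<bold>Hello</bold>", b"placeholder")`, a reader that supports only `{"text/enriched"}` gets the enriched part from `pick_part`, while one that also supports `"application/oda"` gets the ODA part.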
| In some environments, it might be impossible to combine certain | In some environments, it might be impossible to combine certain | |||
| text/enriched formatting commands, whereas in others they might be | text/enriched formatting commands, whereas in others they might be | |||
| combined easily. For example, the combination of <bold> and <italic> | combined easily. For example, the combination of <bold> and <italic> | |||
| might produce bold italics on systems that support such fonts, but | might produce bold italics on systems that support such fonts, but there | |||
| there exist systems that can make text bold or italicized, but not | exist systems that can make text bold or italicized, but not both. In | |||
| both. In such cases, the most recently issued (innermost) recognized | such cases, the most recently issued (innermost) recognized formatting | |||
| formatting command should be preferred. | command should be preferred. | |||
| One of the major goals in the design of text/enriched was to make it | One of the major goals in the design of text/enriched was to make it so | |||
| so simple that even text-only mailers will implement enriched-to- | simple that even text-only mailers will implement enriched-to- | |||
| plain-text translators, thus increasing the likelihood that enriched | plain-text translators, thus increasing the likelihood that enriched | |||
| text will become "safe" to use very widely. To demonstrate this | text will become "safe" to use very widely. To demonstrate this | |||
| simplicity, an extremely simple C program that converts | simplicity, an extremely simple C program that converts text/enriched | |||
| text/enriched input into plain text output is included in Appendix | input into plain text output is included in Appendix A. | |||
| A. | ||||
| Extensions to text/enriched | Extensions to text/enriched | |||
| It is expected that various mail system authors will desire | It is expected that various mail system authors will desire extensions | |||
| extensions to text/enriched. The simple syntax of text/enriched, and | to text/enriched. The simple syntax of text/enriched, and the | |||
| the specification that unrecognized formatting commands should | specification that unrecognized formatting commands should simply be | |||
| simply be ignored, are intended to promote such extensions. | ignored, are intended to promote such extensions. | |||
| An Example | An Example | |||
| Putting all this together, the following "text/enriched" body | Putting all this together, the following "text/enriched" body fragment: | |||
| fragment: | ||||
| From: Nathaniel Borenstein <nsb@bellcore.com> | From: Nathaniel Borenstein <nsb@bellcore.com> | |||
| To: Ned Freed <ned@innosoft.com> | To: Ned Freed <ned@innosoft.com> | |||
| Content-type: text/enriched | Content-type: text/enriched | |||
| <bold>Now</bold> is the time for <italic>all</italic> | <bold>Now</bold> is the time for <italic>all</italic> | |||
| good men | good men | |||
| <smaller>(and <<women>)</smaller> to | <smaller>(and <<women>)</smaller> to | |||
| <ignoreme>come</ignoreme> | <ignoreme>come</ignoreme> | |||
| skipping to change at line 756 ¶ | skipping to change at line 726 ¶ | |||
| <smaller> | <smaller> | |||
| should REALLY be called | should REALLY be called | |||
| <tinier> | <tinier> | |||
| and that I am always right. | and that I am always right. | |||
| -- the end | -- the end | |||
| where the word "beloved" would be in red on a color display. | where the word "beloved" would be in red on a color display. | |||
| Security Considerations | Security Considerations | |||
| Security issues are not discussed in this memo, as the mechanism | Security issues are not discussed in this memo, as the mechanism raises | |||
| raises no security issues. | no security issues. | |||
| Author's Address | Author's Address | |||
| For more information, the authors of this document may be contacted | For more information, the authors of this document may be contacted via | |||
| via Internet mail: | Internet mail: | |||
| Peter W. Resnick | Peter W. Resnick | |||
| QUALCOMM Incorporated | QUALCOMM Incorporated | |||
| 1009 North Busey Avenue | 6455 Lusk Boulevard | |||
| Urbana, IL 61801-1607 | San Diego, CA 92121-2779 | |||
| Phone: +1 217 337 1905 | Phone: +1 619 587 1121 | |||
| FAX: +1 217 337 1905 | FAX: +1 619 658 2230 | |||
| e-mail: presnick@qualcomm.com | e-mail: presnick@qualcomm.com | |||
| Amanda Walker | Amanda Walker | |||
| InterCon Systems Corporation | InterCon Systems Corporation | |||
| 950 Herndon Parkway | 950 Herndon Parkway | |||
| Herndon, VA 22070 | Herndon, VA 22070 | |||
| Phone: +1 703 709 5500 | Phone: +1 703 709 5500 | |||
| FAX: +1 703 709 5555 | FAX: +1 703 709 5555 | |||
| e-mail: amanda@intercon.com | e-mail: amanda@intercon.com | |||
| Acknowledgements | Acknowledgements | |||
| The authors gratefully acknowledge the input of many contributors, | ||||
| readers, and implementors of the specification in this document. | ||||
| Particular thanks are due to Nathaniel Borenstein, the original author | ||||
| of RFC 1563. | ||||
| References | References | |||
| [RFC-1341] | [RFC-1341] | |||
| Borenstein, N., Freed, N., "MIME (Multipurpose Internet Mail | ||||
| Extensions): Mechanisms for Specifying and Describing the Format of | ||||
| Internet Message Bodies", 06/11/1992. | ||||
| [RFC-1521] | [RFC-1521] | |||
| Borenstein, N., Freed, N., "MIME (Multipurpose Internet Mail | ||||
| Extensions) Part One: Mechanisms for Specifying and Describing the | ||||
| Format of Internet Message Bodies", 09/23/1993. | ||||
| [RFC-1523] | [RFC-1523] | |||
| Borenstein, N., "The text/enriched MIME Content-type", 09/23/1993. | ||||
| [RFC-1563] | [RFC-1563] | |||
| Borenstein, N., "The text/enriched MIME Content-type", 01/10/1994. | ||||
| [RFC-1642] | [RFC-1642] | |||
| Goldsmith, D., Davis, M., "UTF-7 - A Mail-Safe Transformation | ||||
| Format of Unicode", 07/13/1994. | ||||
| [RFC-1766] | [RFC-1766] | |||
| Alvestrand, H., "Tags for the Identification of Languages", | ||||
| 03/02/1995. | ||||
| [RFC-1866] | [RFC-1866] | |||
| Berners-Lee, T., Connolly, D., "Hypertext Markup Language - 2.0", | ||||
| 11/03/1995. | ||||
| Appendix A--A Simple enriched-to-plain Translator in C | Appendix A--A Simple enriched-to-plain Translator in C | |||
| One of the major goals in the design of the text/enriched subtype of | One of the major goals in the design of the text/enriched subtype of the | |||
| the text Content-Type is to make formatted text so simple that even | text Content-Type is to make formatted text so simple that even | |||
| text-only mailers will implement enriched-to-plain-text translators, | text-only mailers will implement enriched-to-plain-text translators, | |||
| thus increasing the likelihood that multifont text will become | thus increasing the likelihood that multifont text will become "safe" to | |||
| "safe" to use very widely. To demonstrate this simplicity, what | use very widely. To demonstrate this simplicity, what follows is a | |||
| follows is a simple C program that converts text/enriched input into | simple C program that converts text/enriched input into plain text | |||
| plain text output. Note that the local newline convention (the | output. Note that the local newline convention (the single character | |||
| single character represented by "\n") is assumed by this program, | represented by "\n") is assumed by this program, but that special CRLF | |||
| but that special CRLF handling might be necessary on some systems. | handling might be necessary on some systems. | |||
| #include <ctype.h> | #include <ctype.h> | |||
| #include <stdio.h> | #include <stdio.h> | |||
| #include <stdlib.h> | #include <stdlib.h> | |||
| #include <string.h> | #include <string.h> | |||
| int main(void) { | int main(void) { | |||
| int c, i, paramct=0, newlinect=0, nofill=0; | int c, i, paramct=0, newlinect=0, nofill=0; | |||
| char token[62], *p; | char token[62], *p; | |||
| while ((c=getc(stdin)) != EOF) { | while ((c=getc(stdin)) != EOF) { | |||
| if (c == '<') { | if (c == '<') { | |||
| if (newlinect == 1) putc(' ', stdout); | if (newlinect == 1) putc(' ', stdout); | |||
| newlinect = 0; | newlinect = 0; | |||
| c = getc(stdin); | c = getc(stdin); | |||
| if (c == '<') { | if (c == '<') { | |||
| if (paramct <= 0) putc(c, stdout); | if (paramct <= 0) putc(c, stdout); | |||
| } else { | } else { | |||
| ungetc(c, stdin); | ungetc(c, stdin); | |||
| for (i=0, p=token; (c=getc(stdin)) != EOF && c != '>'; i++) { | for (i=0, p=token; (c=getc(stdin)) != EOF && c != '>'; i++) { | |||
| if (i < sizeof(token)-1) *p++ = isupper(c) ? tolower(c) : c; | if (i < sizeof(token)-1) *p++ = isupper(c) ? tolower(c) : c; | |||
| } | } | |||
| *p = '\0'; | *p = '\0'; | |||
| if (c == EOF) break; | if (c == EOF) break; | |||
| if (strcmp(token, "param") == 0) | if (strcmp(token, "param") == 0) | |||
| paramct++; | paramct++; | |||
| else if (strcmp(token, "nofill") == 0) | else if (strcmp(token, "nofill") == 0) | |||
| nofill++; | nofill++; | |||
| else if (strcmp(token, "/param") == 0) | else if (strcmp(token, "/param") == 0) | |||
| paramct--; | paramct--; | |||
| else if (strcmp(token, "/nofill") == 0) | else if (strcmp(token, "/nofill") == 0) | |||
| nofill--; | nofill--; | |||
| } | } | |||
| } else { | } else { | |||
| if (paramct > 0) | if (paramct > 0) | |||
| ; /* ignore params */ | ; /* ignore params */ | |||
| else if (c == '\n' && nofill <= 0) { | else if (c == '\n' && nofill <= 0) { | |||
| if (++newlinect > 1) putc(c, stdout); | if (++newlinect > 1) putc(c, stdout); | |||
| } else { | } else { | |||
| if (newlinect == 1) putc(' ', stdout); | if (newlinect == 1) putc(' ', stdout); | |||
| newlinect = 0; | newlinect = 0; | |||
| putc(c, stdout); | putc(c, stdout); | |||
| } | } | |||
| } | ||||
| } | } | |||
| } | /* The following line is only needed with line-buffering */ | |||
| /* The following line is only needed with line-buffering */ | putc('\n', stdout); | |||
| putc('\n', stdout); | exit(0); | |||
| exit(0); | ||||
| } | } | |||
| It should be noted that one can do considerably better than this in | It should be noted that one can do considerably better than this in | |||
| displaying text/enriched data on a dumb terminal. In particular, one | displaying text/enriched data on a dumb terminal. In particular, one can | |||
| can replace font information such as "bold" with textual emphasis | replace font information such as "bold" with textual emphasis (like | |||
| (like *this* or _T_H_I_S_). One can also properly handle the | *this* or _T_H_I_S_). One can also properly handle the text/enriched | |||
| text/enriched formatting commands regarding indentation, | formatting commands regarding indentation, justification, and others. | |||
| justification, and others. However, the above program is all that is | However, the above program is all that is necessary in order to present | |||
| necessary in order to present text/enriched on a dumb terminal | text/enriched on a dumb terminal without showing the user any formatting | |||
| without showing the user any formatting artifacts. | artifacts. | |||
| Appendix B--A Simple enriched-to-HTML Translator in C | Appendix B--A Simple enriched-to-HTML Translator in C | |||
| It is fully expected that other text formatting standards like HTML | It is fully expected that other text formatting standards like HTML and | |||
| and SGML will supplant text/enriched in Internet mail. It is also | SGML will supplant text/enriched in Internet mail. It is also likely | |||
| likely that as this happens, recipients of text/enriched mail will | that as this happens, recipients of text/enriched mail will wish to view | |||
| wish to view such mail with an HTML viewer. To this end, the | such mail with an HTML viewer. To this end, the following is a simple | |||
| following is a simple example of a C program to convert | example of a C program to convert text/enriched to HTML. Since the | |||
| text/enriched to HTML. Since the current version of HTML at the time | current version of HTML at the time of this document's publication is | |||
| of this document's publication is HTML 2.0 defined in [RFC-1866], | HTML 2.0 defined in [RFC-1866], this program converts to that standard. | |||
| this program converts to that standard. There are several | There are several text/enriched commands that have no HTML 2.0 | |||
| text/enriched commands that have no HTML 2.0 equivalent. In those | equivalent. In those cases, this program simply puts those commands into | |||
| cases, this program simply puts those commands into processing | processing instructions; that is, surrounded by "<?" and ">". As in | |||
| instructions; that is, surrounded by "<?" and ">". As in Appendix A, | Appendix A, the local newline convention (the single character | |||
| the local newline convention (the single character represented by | represented by "\n") is assumed by this program, but special CRLF | |||
| "\n") is assumed by this program, but special CRLF handling might be | handling might be necessary on some systems. | |||
| necessary on some systems. | ||||
| #include <ctype.h> | #include <ctype.h> | |||
| #include <stdio.h> | #include <stdio.h> | |||
| #include <stdlib.h> | #include <stdlib.h> | |||
| #include <string.h> | #include <string.h> | |||
| int main(void) { | int main(void) { | |||
| int c, i, paramct=0, nofill=0; | int c, i, paramct=0, nofill=0; | |||
| char token[62], *p; | char token[62], *p; | |||
| while((c=getc(stdin)) != EOF) { | while((c=getc(stdin)) != EOF) { | |||
| if(c == '<') { | if(c == '<') { | |||
| c = getc(stdin); | c = getc(stdin); | |||
| if(c == '<') { | if(c == '<') { | |||
| fputs("&lt;", stdout); | fputs("&lt;", stdout); | |||
| } else { | } else { | |||
| ungetc(c, stdin); | ungetc(c, stdin); | |||
| for (i=0, p=token; (c=getc(stdin)) != EOF && c != '>'; i++) { | for (i=0, p=token; (c=getc(stdin)) != EOF && c != '>'; i++) { | |||
| if (i < sizeof(token)-1) *p++ = isupper(c) ? tolower(c) : c; | if (i < sizeof(token)-1) *p++ = isupper(c) ? tolower(c) : c; | |||
| } | } | |||
| *p = '\0'; | *p = '\0'; | |||
| if(c == EOF) break; | if(c == EOF) break; | |||
| if(strcmp(token, "/param") == 0) { | if(strcmp(token, "/param") == 0) { | |||
| paramct--; | paramct--; | |||
| putc('>', stdout); | putc('>', stdout); | |||
| } else if(paramct > 0) { | } else if(paramct > 0) { | |||
| fputs("&lt;", stdout); | fputs("&lt;", stdout); | |||
| fputs(token, stdout); | fputs(token, stdout); | |||
| fputs("&gt;", stdout); | fputs("&gt;", stdout); | |||
| } else { | ||||
| putc('<', stdout); | ||||
| if(strcmp(token, "nofill") == 0) { | ||||
| nofill++; | ||||
| fputs("pre", stdout); | ||||
| } else if(strcmp(token, "/nofill") == 0) { | ||||
| nofill--; | ||||
| fputs("/pre", stdout); | ||||
| } else if(strcmp(token, "bold") == 0) { | ||||
| fputs("b", stdout); | ||||
| } else if(strcmp(token, "/bold") == 0) { | ||||
| fputs("/b", stdout); | ||||
| } else if(strcmp(token, "italic") == 0) { | ||||
| fputs("i", stdout); | ||||
| } else if(strcmp(token, "/italic") == 0) { | ||||
| fputs("/i", stdout); | ||||
| } else if(strcmp(token, "fixed") == 0) { | ||||
| fputs("tt", stdout); | ||||
| } else if(strcmp(token, "/fixed") == 0) { | ||||
| fputs("/tt", stdout); | ||||
| } else if(strcmp(token, "excerpt") == 0) { | ||||
| fputs("blockquote", stdout); | ||||
| } else if(strcmp(token, "/excerpt") == 0) { | ||||
| fputs("/blockquote", stdout); | ||||
| } else { | ||||
| putc('?', stdout); | ||||
| fputs(token, stdout); | ||||
| if(strcmp(token, "param") == 0) { | ||||
| paramct++; | ||||
| putc(' ', stdout); | ||||
| continue; | ||||
| } | ||||
| } | ||||
| putc('>', stdout); | ||||
| } | ||||
| } | ||||
| } else if(c == '>') { | ||||
| fputs("&gt;", stdout); | ||||
| } else { | } else { | |||
| putc('<', stdout); | if(c == '\n' && nofill <= 0 && paramct <= 0) { | |||
| if(strcmp(token, "nofill") == 0) { | while((i=getc(stdin)) == '\n') fputs("<br>", stdout); | |||
| nofill++; | ungetc(i, stdin); | |||
| fputs("pre", stdout); | ||||
| } else if(strcmp(token, "/nofill") == 0) { | ||||
| nofill--; | ||||
| fputs("/pre", stdout); | ||||
| } else if(strcmp(token, "bold") == 0) { | ||||
| fputs("b", stdout); | ||||
| } else if(strcmp(token, "/bold") == 0) { | ||||
| fputs("/b", stdout); | ||||
| } else if(strcmp(token, "italic") == 0) { | ||||
| fputs("i", stdout); | ||||
| } else if(strcmp(token, "/italic") == 0) { | ||||
| fputs("/i", stdout); | ||||
| } else if(strcmp(token, "fixed") == 0) { | ||||
| fputs("tt", stdout); | ||||
| } else if(strcmp(token, "/fixed") == 0) { | ||||
| fputs("/tt", stdout); | ||||
| } else if(strcmp(token, "excerpt") == 0) { | ||||
| fputs("blockquote", stdout); | ||||
| } else if(strcmp(token, "/excerpt") == 0) { | ||||
| fputs("/blockquote", stdout); | ||||
| } else { | ||||
| putc('?', stdout); | ||||
| fputs(token, stdout); | ||||
| if(strcmp(token, "param") == 0) { | ||||
| paramct++; | ||||
| putc(' ', stdout); | ||||
| continue; | ||||
| } | } | |||
| } | putc(c, stdout); | |||
| putc('>', stdout); | ||||
| } | } | |||
| } | ||||
| } else if(c == '>') { | ||||
| fputs("&gt;", stdout); | ||||
| } else { | ||||
| if(c == '\n' && nofill <= 0 && paramct <= 0) { | ||||
| while((i=getc(stdin)) == '\n') fputs("<br>", stdout); | ||||
| ungetc(i, stdin); | ||||
| } | ||||
| putc(c, stdout); | ||||
| } | } | |||
| } | /* The following line is only needed with line-buffering */ | |||
| /* The following line is only needed with line-buffering */ | putc('\n', stdout); | |||
| putc('\n', stdout); | exit(0); | |||
| exit(0); | ||||
| } | } | |||
| End of changes. 100 change blocks. | ||||
| 598 lines changed or deleted | 592 lines changed or added | |||
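
Appendix A remarks that one can do better than simply stripping commands on a dumb terminal, for instance by replacing "bold" with textual emphasis (like *this*). The following is a minimal sketch of that idea and is not part of either draft. It follows the same token-scanning logic as the Appendix A program, but works on strings rather than stdin so that it is easy to exercise. The names enriched_to_emphasis and emit are hypothetical, and the use of '_' around italic text is an assumption made for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append one character, keeping room for the NUL. */
static void emit(char *out, size_t outsz, size_t *len, char ch) {
    if (*len + 1 < outsz) out[(*len)++] = ch;
}

/*
 * Hypothetical sketch (not from the draft): convert a text/enriched
 * string to plain text, marking <bold> with '*' and <italic> with '_'.
 * All other commands are dropped, as in Appendix A.
 * Returns the number of characters written, excluding the NUL.
 */
size_t enriched_to_emphasis(const char *in, char *out, size_t outsz) {
    size_t len = 0, i;
    int paramct = 0, newlinect = 0, nofill = 0;
    char token[62];

    assert(in != NULL && out != NULL && outsz > 0);
    while (*in != '\0') {
        char c = *in++;
        if (c == '<') {
            if (newlinect == 1) emit(out, outsz, &len, ' ');
            newlinect = 0;
            if (*in == '<') {              /* "<<" is a literal '<' */
                in++;
                if (paramct <= 0) emit(out, outsz, &len, '<');
                continue;
            }
            /* Collect the lowercased command name up to '>'. */
            for (i = 0; *in != '\0' && *in != '>'; in++) {
                if (i < sizeof(token) - 1)
                    token[i++] = (char)tolower((unsigned char)*in);
            }
            token[i] = '\0';
            if (*in == '\0') break;        /* unterminated command */
            in++;                          /* skip the closing '>' */
            if (strcmp(token, "param") == 0) {
                paramct++;
            } else if (strcmp(token, "/param") == 0) {
                paramct--;
            } else if (strcmp(token, "nofill") == 0) {
                nofill++;
            } else if (strcmp(token, "/nofill") == 0) {
                nofill--;
            } else if (paramct <= 0) {
                if (strcmp(token, "bold") == 0 ||
                    strcmp(token, "/bold") == 0) {
                    emit(out, outsz, &len, '*');
                } else if (strcmp(token, "italic") == 0 ||
                           strcmp(token, "/italic") == 0) {
                    emit(out, outsz, &len, '_');
                }
                /* Any other command is unrecognized here and ignored. */
            }
        } else if (paramct > 0) {
            /* Parameter data is never displayed. */
        } else if (c == '\n' && nofill <= 0) {
            /* A single newline becomes a space; N newlines become N-1. */
            if (++newlinect > 1) emit(out, outsz, &len, c);
        } else {
            if (newlinect == 1) emit(out, outsz, &len, ' ');
            newlinect = 0;
            emit(out, outsz, &len, c);
        }
    }
    out[len] = '\0';
    return len;
}
```

As a usage example, passing the opening of the example body from the draft, "<bold>Now</bold> is the time for <italic>all</italic>" followed by a newline and "good men", with a 128-byte output buffer should leave "*Now* is the time for _all_ good men" in the buffer. Output that does not fit is silently truncated, which is a simplification a real mail reader would probably want to avoid.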