idnits 2.17.1 

draft-seantek-text-nfo-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == It seems as if not all pages are separated by form feeds - found 13 form
     feeds but 437 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 13, 2017) is 2591 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                         S. Leonard
3	Internet-Draft                                             Penango, Inc.
4	Intended Status: Informational                            March 13, 2017
5	Expires: September 14, 2017

7	                        The text/nfo Media Type
8	                       draft-seantek-text-nfo-04

10	Abstract

12	   This document registers the text/nfo media type for use with release
13	   iNFOrmation. While compatible with text/plain, ".NFO" files and
14	   content have distinguishing characteristics from typical plain text
15	   because they are meant to be output to IBM PC-compatible system
16	   consoles that support certain "ANSI" escape sequences.

18	Status of this Memo

20	   This Internet-Draft is submitted in full conformance with the
21	   provisions of BCP 78 and BCP 79.

23	   Internet-Drafts are working documents of the Internet Engineering
24	   Task Force (IETF).  Note that other groups may also distribute
25	   working documents as Internet-Drafts.  The list of current Internet-
26	   Drafts is at http://datatracker.ietf.org/drafts/current/.

28	   Internet-Drafts are draft documents valid for a maximum of six months
29	   and may be updated, replaced, or obsoleted by other documents at any
30	   time.  It is inappropriate to use Internet-Drafts as reference
31	   material or to cite them other than as "work in progress."

33	Copyright Notice

35	   Copyright (c) 2017 IETF Trust and the persons identified as the
36	   document authors. All rights reserved.

38	   This document is subject to BCP 78 and the IETF Trust's Legal
39	   Provisions Relating to IETF Documents
40	   (http://trustee.ietf.org/license-info) in effect on the date of
41	   publication of this document. Please review these documents
42	   carefully, as they describe your rights and restrictions with respect
43	   to this document. Code Components extracted from this document must
44	   include Simplified BSD License text as described in Section 4.e of
45	   the Trust Legal Provisions and are provided without warranty as
46	   described in the Simplified BSD License.

48	1. iNFOrmation

50	   Packagers of files or other bundled content commonly include a
51	   human-readable manifest that describes their packages. While an
52	   obvious solution is to include a README in an archive such as a ZIP
53	   file, READMEs are generally written for software applications and
54	   provide late-breaking instructions on how to configure and install
55	   the software, along with known bugs and changelogs. (Plain) text
56	   READMEs are also generally limited to printable US-ASCII characters.

58	   Starting from circa 1990, packagers of various types of content
59	   settled upon the Release iNFOrmation format (NFO, commonly pronounced
60	   "EN-foe" or "info") to describe their releases. An NFO file serves
61	   similar purposes to a README, but with several nuanced differences.
62	   NFOs usually contain release information about the media, rather than
63	   about software per se. NFOs credit the releasers or packagers. Much
64	   like the Received: Internet Message header [RFC5322], intermediaries
65	   ("couriers") can also insert NFOs.

67	   Most distinctly, NFOs have come to contain elaborate ASCII or ANSI
68	   artwork that is remarkable in its own right in the pantheon of the
69	   postmodern computing culture. Many NFOs have been authored with the
70	   intent of displaying them on a terminal display with monospaced,
71	   inverted text (black background, gray or off-white foreground); some
72	   NFOs even include escape sequences to generate animations or color.
73	   The widely accepted encoding for NFOs is "OEM Code Page 437", the
74	   character set of the original IBM PC and MS-DOS.

76	   When served in the same manner as plain text (text/plain), a lot of
77	   the elaborate artwork in NFOs is lost, garbled, or misaligned on
78	   display. As NFOs are still in considerable use, the goal of this
79	   registration is to rectify these interchange problems and reclaim
80	   this piece of living computer history.

82	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
83	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
84	   document are to be interpreted as described in [RFC2119].

86	2. Release iNFOrmation Media Type Registration Application

88	   Type name: text

90	   Subtype name: nfo

92	   Required parameters:

94	    charset: Per Section 4.2.1 of [RFC6838], charset is REQUIRED. Unlike
95	     most other text types, the default value is the character set of
96	     the original IBM PC and MS-DOS, called OEM Code Page 437, and named
97	     "oem437". Implementations MUST support OEM Code Page 437.
98	     Unfortunately, the simple application of the IANA registered
99	     character set "IBM437" (aka "cp437") [RFC1345] will miss some
100	     important characters, so conformant implementations MUST support
101	     OEM Code Page 437 as specified in Section 3. NFOs authored for more
102	     modern computing environments are known to use ISO-8859-1, ISO-
103	     8859-15 (including support for the Euro sign), or UTF-8; however,
104	     for maximum interoperability, these or any other character sets
105	     MUST be declared by the sender. When absent, a receiver MAY guess,
106	     unless UTF-8 encoding is patently obvious. A RECOMMENDED detection
107	     algorithm is provided in Appendix A.

109	   Optional parameters:

111	    baud: A natural number (integer greater than 0) indicating the gross
112	     bit rate ("symbol rate") at which the NFO is supposed to be
113	     rendered to screen. This optional parameter provides a nostalgic
114	     effect from the days of dialup modems and fixed-speed serial lines.
115	     It also controls the animation rate, to the extent that the NFO
116	     employs optional escape sequences. While the term "bps" might be
117	     more accurate, this parameter is meant to be interpreted the way
118	     that an end user would experience the real-world conditions that a
119	     dialup modem would provide on the eve of Y2K. (The term "baud" is
120	     also used by a couple of popular modern viewers of this format.)
121	     For example, a conforming implementation could implement "57600" as
122	     if the data were being downloaded using a V.92 modem, replete with
123	     random stalls due to retransmission attempts on account of noise on
124	     the line.
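
   As a rough illustration of the baud parameter, the following Python
   sketch converts a baud value into a per-octet delay. It is not part
   of this registration: the function name "paced" is hypothetical, and
   the figure of 10 bits per octet (8 data bits plus start and stop
   bits) is an assumption. The random stalls mentioned above are not
   modeled.

```python
from typing import Iterator, Tuple

# Assumption (not from this registration): each octet costs 10 bit times,
# i.e. 8 data bits plus one start bit and one stop bit.
BITS_PER_OCTET = 10


def paced(data: bytes, baud: int) -> Iterator[Tuple[int, float]]:
    """Yield (octet, delay_in_seconds) pairs for progressive rendering."""
    if baud <= 0:
        raise ValueError("baud must be a natural number greater than 0")
    delay = BITS_PER_OCTET / baud
    for octet in data:
        yield octet, delay
```

   A viewer could sleep for the yielded delay after drawing each octet,
   for example with time.sleep(delay).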

126	   Encoding considerations:

128	     Text with 8-bit code points; all 8-bit combinations (including NUL)
129	     are possible.

131	   Security considerations:

133	     It's just text; this format provides no facilities for
134	     confidentiality or integrity. The ANSI escape sequence "CSI 5 m"
135	     could, however, blink you to death. As only a subset of ANSI escape
136	     sequences MUST be interpreted, interpreting a greater range than
137	     the subset prescribed in this registration may introduce other
138	     security issues, such as transmitting operating system commands.

140	     Some code points in oem437 have been used ambiguously in practice,
141	     so implementations SHOULD NOT assume that the mapping between this
142	     charset and Unicode is bijective. When displayed, codes 00, 20, and
143	     FF MAY appear to be similar, i.e., as a blank space.

145	   Interoperability considerations:

147	     NFOs are plain text but look best when read in a terminal view or
148	     with a dedicated NFO viewer that can emulate terminal features. As
149	     a result, they SHOULD be treated differently than text/plain files.
150	     The reference environment for NFO viewers to emulate is an IBM
151	     PC-compatible machine running MS-DOS 6.22 with the ANSI.SYS MS-DOS
152	     device driver loaded, where the NFO is displayed as if it were
153	     output to the terminal using the "TYPE" command.

155	   Published specification: [[Note to RFC Editor: Insert number here.]]

157	   Applications that use this media type:

159	     NFO viewers; text editors; terminals.

161	   Fragment identifier considerations:

163	     Same as text/plain [RFC5147].

165	   Additional information:

167	     Deprecated alias names for this type: text/x-nfo
168	     File extension(s): .nfo
169	     Macintosh file type code(s):
170	       TEXT. A uniform type identifier (UTI) of "public.nfo", which
171	       conforms to "public.plain-text", is RECOMMENDED.

173	   Person & email address to contact for further information:

175	     Sean Leonard <dev+ietf@seantek.com>

177	   Restrictions on usage: None.

179	   Author/Change controller: Sean Leonard <dev+ietf@seantek.com>

181	   Intended usage: COMMON

183	   Provisional registration? No

185	   "OEM Code Page 437" refers to the character set of the original IBM
186	   PC and MS-DOS. The code page actually represents two related things:
187	   the set of 256 graphemes stored in video read-only memory (ROM) that
188	   are accessed with a single 8-bit code, and an 8-bit encoding for text
189	   content that displays the graphemes or causes other behavior as
190	   defined by the code, the operating system, and the loaded device
191	   drivers. NFO is encoded with the aforementioned 8-bit encoding, which
192	   means that not all 256 graphemes are directly available for use.

194	   For example: the sequence 0D 0A (CR LF) identifies a new line; the
195	   code 1A (SUB) is the MS-DOS end-of-file marker. The code 0D cannot be
196	   used directly to express the grapheme U+266A EIGHTH NOTE; the code 0A
197	   cannot be used directly to express the grapheme U+25D9 INVERSE WHITE
198	   CIRCLE; the code 1A cannot be used to express U+2192 RIGHTWARDS
199	   ARROW.

201	   The registration for IBM437 [RFC1345] is used as a basis for this
202	   specification, which only elaborates upon the differences. Suggested
203	   mappings to Unicode characters are included; however, the mapping is
204	   not bijective. Octets are in hexadecimal. The symbols below next to
205	   the octets match [RFC1345], although the actual character has the
206	   meaning described here rather than the [RFC1345] meaning.

208	3.1. Low-Order Codes (00-7F)

210	   The codes in the 20-7E range are the same as in US-ASCII and IBM437.

212	   01-06, 0B, 0C, 0E-19, and 1C-1F are displayed as their corresponding
213	   ROM graphemes.

215	   00 NUL is displayed (and treated) as a space. Depending on the output
216	          environment, an implementation MAY map this code to U+0000
217	          NULL, or U+0020 SPACE.

219	   07 BEL MAY cause an audible bell sound (beep) to be emitted. Actually
220	          emitting a sound is not required for conformance. However,
221	          implementations that progressively render the output MUST
222	          pause for this code as if a sound were emitted.

224	   08 BS  causes the prior character to be erased: the prior grapheme is
225	          displayed and treated as a regular or non-breaking space (SP
226	          or NBSP), depending on whether the prior character would have
227	          been breaking or non-breaking.

229	   09 HT  causes horizontal tabbing, which for purposes of conformance,
230	          SHOULD produce the equivalent spaces so that the subsequent
231	          text is aligned on the next 8-character boundary.

233	   0A LF  causes a new line to be created and the text insertion point
234	          ("cursor") to be moved to the beginning of that line.

236	   0D CR  causes the text insertion point ("cursor") to be moved to the
237	          beginning of the current line. Subsequent text will overwrite
238	          the characters on the current line, until the cursor moves
239	          somewhere else. (0A creates and moves the cursor to a new
240	          line; therefore, 0A in the middle of overwriting the current
241	          line will not insert or erase any characters that might
242	          otherwise be on that line.)

244	   1A SUB is the MS-DOS end-of-file (EOF) marker; it ends the display.
245	          Codes after 1A MUST NOT be displayed. 1A can be used to
246	          delimit metadata from the main NFO content, although this
247	          practice is rarely used for NFOs. A well-known metadata format
248	          in this technology area is SAUCE (Standard Architecture for
249	          Universal Comment Extensions) [SAUCE], which implementations
250	          MAY support. A SAUCE record can specify a different code page.
251	          An implementation that supports SAUCE SHOULD support following
252	          the code page directive in the SAUCE record when the MIME
253	          entity's charset is oem437.

255	   1B ESC may be the start of an ANSI ESC sequence. If no valid ESC
256	          sequence is recognized, output the corresponding ROM grapheme
257	          (U+2190 LEFTWARDS ARROW) and continue normal processing with
258	          the next code.

260	   7F DEL is displayed as the corresponding ROM grapheme (U+2302 HOUSE).
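
   The low-order code rules above can be sketched in Python as follows.
   This is an illustrative reading, not a reference implementation. The
   names "ROM_LOW" and "render" are hypothetical. The sketch assumes
   that BS moves the cursor back one column and always writes a regular
   space, that BEL produces no visible output (the pause is not
   modeled), and that no ESC sequence is ever recognized, so 1B always
   falls back to its ROM grapheme. Python's "cp437" codec is used for
   codes 20-7E and 80-FF.

```python
from typing import List

# ROM graphemes for codes 01-1F, indexed by (code - 1).
ROM_LOW = (
    "\u263a\u263b\u2665\u2666\u2663\u2660\u2022\u25d8"  # 01-08
    "\u25cb\u25d9\u2642\u2640\u266a\u266b\u263c\u25ba"  # 09-10
    "\u25c4\u2195\u203c\u00b6\u00a7\u25ac\u21a8\u2191"  # 11-18
    "\u2193\u2192\u2190\u221f\u2194\u25b2\u25bc"        # 19-1F
)


def render(data: bytes) -> List[str]:
    """Render oem437 octets into a list of display lines."""
    lines: List[List[str]] = [[]]
    row = col = 0

    def put(ch: str) -> None:
        nonlocal col
        line = lines[row]
        if col < len(line):
            line[col] = ch          # overwrite (e.g. after CR)
        else:
            line.append(ch)
        col += 1

    for code in data:
        if code == 0x1A:            # SUB: end of file, stop displaying
            break
        elif code == 0x00:          # NUL: displayed as a space
            put(" ")
        elif code == 0x07:          # BEL: nothing visible (assumption)
            continue
        elif code == 0x08:          # BS: erase the prior character
            if col > 0:
                col -= 1
                lines[row][col] = " "
        elif code == 0x09:          # HT: pad to next 8-character boundary
            for _ in range(8 - col % 8):
                put(" ")
        elif code == 0x0A:          # LF: new line, cursor at its start
            lines.append([])
            row += 1
            col = 0
        elif code == 0x0D:          # CR: cursor to start of current line
            col = 0
        elif code == 0x7F:          # DEL: ROM grapheme HOUSE
            put("\u2302")
        elif code < 0x20:           # remaining low codes, including 1B
            put(ROM_LOW[code - 1])
        else:
            put(bytes([code]).decode("cp437"))
    return ["".join(line) for line in lines]
```

   For instance, render(b"AB\rC") overwrites the first column after the
   carriage return and yields the single line "CB".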

262	3.2. High-Order Codes (80-FF)

264	   The codes in the 80-AF range are a selection of Latin characters;
265	   they are the same as in IBM437. A conformant implementation MUST NOT
266	   treat these codes as C1 control characters.

268	   The codes in the B0-DF range are box drawing and block characters;
269	   they are the same as in IBM437.

271	   The codes in the E0-FF range are for mathematical symbols, which are
272	   the same as in IBM437, with the following exceptions. The preferred
273	   Unicode mapping in Microsoft's OEM Code Page 437 documentation is
274	   designated with [OEMCP437]:

276	   E1 b*  can be either U+03B2 GREEK SMALL LETTER BETA, or U+00DF LATIN
277	          SMALL LETTER SHARP S (German Eszett) [OEMCP437]. The two were
278	          indistinguishable at low resolution on the original IBM
279	          hardware. Newer grapheme sets, including those of the IBM EGA
280	          and VGA graphics cards, display this code as the Eszett.
281	          Unfortunately only context can determine the proper character
282	          to use.

284	   E3 p*  can be U+03C0 GREEK SMALL LETTER PI [OEMCP437], U+03A0 GREEK
285	          CAPITAL LETTER PI, or U+220F N-ARY PRODUCT, depending on the
286	          particular grapheme used.

288	   E4 S*  can be either U+03A3 GREEK CAPITAL LETTER SIGMA [OEMCP437] or
289	          U+2211 N-ARY SUMMATION.

291	   E6 m*  can be either U+00B5 MICRO SIGN [OEMCP437] or U+03BC GREEK
292	          SMALL LETTER MU.

294	   EA W*  can be either U+2126 OHM SIGN or U+03A9 GREEK CAPITAL LETTER
295	          OMEGA [OEMCP437].

297	   EB d*  is U+03B4 GREEK SMALL LETTER DELTA [OEMCP437]. However, it can
298	          be used as a surrogate for U+00F0 LATIN SMALL LETTER ETH
299	          (Icelandic, Faroese, Old English, IPA) or U+2202 PARTIAL
300	          DIFFERENTIAL.

302	   ED /0  is U+03C6 GREEK SMALL LETTER PHI [OEMCP437], but in MS-DOS was
303	          mainly used as U+2205 EMPTY SET. Other possible meanings
304	          include U+03D5 GREEK PHI SYMBOL (used as a technical symbol,
305	          with a stroked glyph) (to name angles), U+2300 DIAMETER SIGN,
306	          or U+00F8 LATIN SMALL LETTER O WITH STROKE (as a surrogate).

308	   EE e*  is U+03B5 GREEK SMALL LETTER EPSILON [OEMCP437] or U+2208
309	          ELEMENT OF.

311	   FF NS  is NBSP, also known as U+00A0 NO-BREAK SPACE. The ROM grapheme
312	          is the same as SP (SPACE), i.e., it is blank.
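
   One way to represent these ambiguous codes is a table that lists the
   [OEMCP437] preferred mapping first, followed by the alternatives
   named above. The Python sketch below is illustrative only; the names
   "AMBIGUOUS", "candidates", and "to_code" are hypothetical, and
   Python's "cp437" codec is assumed for all other codes.

```python
from typing import Optional, Tuple

# Preferred [OEMCP437] mapping first, then the alternatives listed above.
AMBIGUOUS = {
    0xE1: ("\u00df", "\u03b2"),
    0xE3: ("\u03c0", "\u03a0", "\u220f"),
    0xE4: ("\u03a3", "\u2211"),
    0xE6: ("\u00b5", "\u03bc"),
    0xEA: ("\u03a9", "\u2126"),
    0xEB: ("\u03b4", "\u00f0", "\u2202"),
    0xED: ("\u03c6", "\u2205", "\u03d5", "\u2300", "\u00f8"),
    0xEE: ("\u03b5", "\u2208"),
}


def candidates(code: int) -> Tuple[str, ...]:
    """Return the possible Unicode characters for one oem437 code."""
    return AMBIGUOUS.get(code, (bytes([code]).decode("cp437"),))


def to_code(ch: str) -> Optional[int]:
    """Map a Unicode character back to an oem437 code, if there is one."""
    for code, chars in AMBIGUOUS.items():
        if ch in chars:
            return code
    try:
        return ch.encode("cp437")[0]
    except UnicodeEncodeError:
        return None
```

   Because to_code maps both U+00DF and U+03B2 to E1, a round trip
   through oem437 cannot always recover the original character, which
   is why the mapping is not bijective.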

314	3.3. ANSI Escape Sequences

316	   To support NFO content containing colors and other goodies, an NFO
317	   viewer MUST support a subset of "ANSI" escape sequences. (The
318	   required sequences are not directly related to ANSI, but rather to
319	   [ANSI.SYS].)

321	   [ANSI.SYS] supports cursor positioning, erasing, Set Graphics Mode
322	   (SGR), mode switching, and keyboard remapping. Of these functions, a
323	   conforming implementation MUST support the Set Graphics Mode (SGR)
324	   escape sequence. An implementation MUST support setting foreground
325	   colors (30-37) and background colors (40-47), which are also in
326	   [ISO6429]. An implementation MUST support all of the [ANSI.SYS] text
327	   attributes (0, 1, 4, (5 and/or 6), 7, and 8). Text attribute 5 is
328	   "Blink: Slow" (less than 150 per minute); text attribute 6 is "Blink:
329	   Fast" (more than 150 per minute). While [ANSI.SYS] does not document
330	   attribute 6, that was the behavior of the actual ANSI.SYS. An
331	   implementation SHOULD reproduce similar functionality.

333	   The other [ANSI.SYS] escape sequences are OPTIONAL. An implementation
334	   MAY support standard or vendor-specific escape sequences. For a list
335	   of standard sequences, see, e.g., [ISO6429] and [ISO8613].
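
   The required SGR handling can be sketched in Python as follows. This
   is a minimal illustration under stated assumptions, not a complete
   ANSI.SYS emulation: the names "Style", "apply_sgr", and
   "split_styled" are hypothetical, the default style (gray on black)
   is an assumption, only the "ESC [ ... m" form is recognized, and
   unknown parameters are silently ignored.

```python
import re
from dataclasses import dataclass, replace
from typing import FrozenSet, Iterable, List, Tuple

SGR = re.compile(rb"\x1b\[([0-9;]*)m")
TEXT_ATTRIBUTES = (1, 4, 5, 6, 7, 8)


@dataclass(frozen=True)
class Style:
    fg: int = 7                       # assumed default: gray foreground
    bg: int = 0                       # assumed default: black background
    attrs: FrozenSet[int] = frozenset()


def apply_sgr(style: Style, params: Iterable[int]) -> Style:
    """Apply SGR parameters in order and return the resulting style."""
    for p in params:
        if p == 0:                    # all attributes off
            style = Style()
        elif p in TEXT_ATTRIBUTES:
            style = replace(style, attrs=style.attrs | {p})
        elif 30 <= p <= 37:           # foreground colors
            style = replace(style, fg=p - 30)
        elif 40 <= p <= 47:           # background colors
            style = replace(style, bg=p - 40)
    return style


def split_styled(data: bytes) -> List[Tuple[bytes, Style]]:
    """Split content into (octets, style) runs at each SGR sequence."""
    runs: List[Tuple[bytes, Style]] = []
    style = Style()
    pos = 0
    for match in SGR.finditer(data):
        if match.start() > pos:
            runs.append((data[pos:match.start()], style))
        # An empty parameter is treated as 0.
        params = [int(p) if p else 0 for p in match.group(1).split(b";")]
        style = apply_sgr(style, params)
        pos = match.end()
    if pos < len(data):
        runs.append((data[pos:], style))
    return runs
```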

337	3.4. Accessing Hidden Grapheme Codes

339	   There is no obvious way to encode the graphemes that are inaccessible
340	   at the values 07, 08, 09, 0A, 0D, 1A, and 1B. This specification
341	   provides a technique to access these graphemes in the context of OEM
342	   Code Page 437. This technique is RECOMMENDED, but not required.

344	   Although MS-DOS and ANSI.SYS did not conform to [ISO2022], that
345	   standard defines escape sequences to switch to other character sets.
346	   Unicode contains appropriate code points for all of the inaccessible
347	   graphemes (characters). Accordingly, the escape sequence:
348	      ESC % G
349	   switches the code to UTF-8 (with unspecified implementation level)
350	   [REG196]. While in UTF-8, the escape sequence:
351	      ESC % @
352	   reverts the code back to the original [ISO2022]. Normally the code
353	   would be [ISO2022], but given the starting context of OEM Code Page
354	   437, the code returns to OEM Code Page 437. The codes are as follows:

356	   ROM grapheme number
357	   |  IBM437 symbol
358	   |  |   Unicode code point
359	   |  |   |      Unicode name: UTF-8 encoding
360	   |  |   |      |
361	   07 BEL U+2022 BULLET: E2 80 A2

363	   08 BS  U+25D8 INVERSE BULLET: E2 97 98

365	   09 HT  U+25CB WHITE CIRCLE: E2 97 8B

367	   0A LF  U+25D9 INVERSE WHITE CIRCLE: E2 97 99

369	   0D CR  U+266A EIGHTH NOTE: E2 99 AA

371	   1A SUB U+2192 RIGHTWARDS ARROW: E2 86 92

373	   1B ESC U+2190 LEFTWARDS ARROW: E2 86 90
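
   A writer could apply this technique as in the following Python
   sketch. The function name "encode_nfo" is hypothetical. The sketch
   handles only the seven graphemes listed above and relies on Python's
   "cp437" codec for everything else, so other ROM graphemes in the
   01-1F range are outside its scope.

```python
TO_UTF8 = b"\x1b%G"   # ESC % G: switch to UTF-8
TO_OEM = b"\x1b%@"    # ESC % @: return to OEM Code Page 437

# The seven graphemes hidden behind codes 07, 08, 09, 0A, 0D, 1A, and 1B.
HIDDEN = "\u2022\u25d8\u25cb\u25d9\u266a\u2192\u2190"


def encode_nfo(text: str) -> bytes:
    """Encode text as oem437, escaping to UTF-8 for hidden graphemes."""
    out = bytearray()
    in_utf8 = False
    for ch in text:
        hidden = ch in HIDDEN
        if hidden != in_utf8:
            out += TO_UTF8 if hidden else TO_OEM
            in_utf8 = hidden
        out += ch.encode("utf-8" if hidden else "cp437")
    if in_utf8:
        out += TO_OEM     # always finish in the original code
    return bytes(out)
```

   Consecutive hidden graphemes share a single pair of escape
   sequences, which keeps the output short.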

375	3.5. UTF-8/Unicode Processing

377	   When NFO content is encoded in UTF-8 or another Unicode encoding
378	   [UTF], the C0 and C1 code points may be present. These codes MUST be
379	   treated as control codes, not graphemes. They have the same behavior
380	   as specified for the special low-order codes described in Section
381	   3.1. For example, 1A ends the display, and 09 emits spaces sufficient
382	   for 8-column tabbing. 1B is ALWAYS treated as the start of an ESC
383	   sequence; if the sequence is not recognized, 1B does NOT revert to
384	   outputting a LEFTWARDS ARROW grapheme. Instead, nothing is displayed.
385	   For LEFTWARDS ARROW, encode U+2190 instead.

387	   The C1 control code 9B (CSI: Control Sequence Introducer) (Unicode
388	   code point U+009B) MUST be recognized as such; it is equivalent to 1B
389	   5B (ESC [).

391	3.6. Grapheme Reference

393	   The following figure is a reference of all 256 graphemes in the IBM
394	   PC ROM. The figure is a MIME (base64)-encoded PNG image.

396	MIME-Version: 1.0
397	Content-Type: image/png
398	Content-Disposition: attachment; filename="Codepage-437.png"
399	Content-Transfer-Encoding: base64

401	iVBORw0KGgoAAAANSUhEUgAAASwAAACMCAMAAADxyGQdAAAAGXRFWHRTb2Z0d2FyZQBBZG9i
402	ZSBJbWFnZVJlYWR5ccllPAAAAAZQTFRFqKioAAAAmKDP8QAACYZJREFUeNrsXYt24zoIhP//
403	6XvO7daVmBlAfrRJq+xuk20SWxojBMPD5vvRftiGYIO1wdpgbbA2WJ3D/Xtkv9lg3QyWLY3y
404	u66G8emOL+4DKz4f5yDfmo8an4YR2ud/4kjki9vAss+nXCLUCArJsmNen6hNc/fh53yIGerj
405	3z+kyCin6/EcWPbxYxrKcT49lJ5keQQLJCvI3tfrQeQjWG4oV7g+lsAaJd/6kjUvoxksEL4B
406	mmFqg+igZI1vD3+nc9FlSC9nZxlqUbMBpYtgDc8fP4xJxPHjQOd4ikoHwApiFLQZ01l+RmeR
407	k8PAjGwdZBn+P4SvA+K4GVggR3ZcqeGAcfGiZI1HLiTLw0rsgcWmA8ti2pIysP5NNSLCJMth
408	o+uBxZ+jZFU66wBrVF2JqlqQrLgl9XdD9s3kuqlliGrdp43Kgs6Su+EsPuOBiJUySTOT5/lb
409	n0oV1XTXdBgwWjAdoqyhZM0XIei+hp01avYgWVSdt7fD6TirRqk1rJjadDBjYjiANb6VGDha
410	Z3WX4Vu7O1en8WPuznakN1gbrNcGy/42V2gpMNIs/pswmpw3gkWWE+M/zjj5rjkWj26aRRsE
411	bUj0Ok7umLm7g7TR7H2BtQjQ3GPXgE87ezrAjwn3iX6rXhnohhVgHXzF7KraaDeagcWNInZG
412	2sfrxY5IHFI6YwGx8Ba5z9ACK9r+UfadsW4IFljnK0KlpoQifwks7T/OE6p19SA+zGEB7xAv
413	zxKxOxOBiZcXuPjIRYYxt0hytRJPgAXaO9IDDKzAXpjwV1EcAawY0DDABsUnk7xCXZDVcA4s
414	uWFeliycKl3DKRcllU4lWWY41gIsxVfgpWqBxXSWRC1yOAVpcw6sTpSMjnBBsvh+3IgUdASf
415	6XK+oOBy0bcYWIBwuRteWIaHHh0VkkQrW3MSLGbuxchF4Oe55QkMvqUbtxDc5WX4gFPzrr7R
416	dqR/jqLZYO3HBmsdlw2WMi6bdhY6lWSDhuSXSOPEDX4IvaLP5CEUbGANqN/MobBg6tAXMbLJ
417	6SkzDRaGnanTjlQXMVGF404syq8Mgfh24zdjBHsFrGisGnPjedIX+QxJNHM0kscXM9W1ANZX
418	os4JsMbEAAFWnIV50+nMFBnxGNCfmUTM8eReghVphJjV1AGLe58KLCR+HdMBbBksHC4jbefF
419	Pidnucg2wNSG0Y/6yl6TZ5eZLQ2wSK4DPttZsEg2Ykey4mx7YPmQM2JMYxD9wMmJCFbGHaLL
420	b5clKwOrChAQhkaweaChK1rBPSUkQPHzxRde2TJYrsFKFPwCWPPXm2BZteGyMY9akBCKDsrP
421	FhV8QhAlpoN7ORNhcHTBsjZYLP1bU4lMsXfAQoXaNUqHS2Ni3FNSIISLpFEaB0ZzGGMkBTOa
422	nab2glGqSK93c3dehhh6D7Ds5848RZC207wpmmfBsg3bBuuZZYgZ7Q1Va8m3wktjBvMPKxET
423	LrVxQmslEMoQrfNCmA34IiE1XSciZOUqWG7d8b8PWAqgRvFEYjHPBvzMDcCyI27aWDkYKOPh
424	N/IzmHuUE6SiJi8Dy6qINHVDnCY9kApCkjkJHMvkNlnFiwJqKpmnYkrxgJVWnvUMr91xRtFA
425	ITdLp+F5ftEVi0lYPbAwbbEGi0RSOkWuXNeyZIkaLOarfwdYnE+hYDG+VKZdviJYlPZbB0vX
426	1PFac1NRm4fA4rNZBotkWS6DxasYCZtkUme1s9qKgvJxX8O9z/qsU6jcZW8hWPyk4lzuccfM
427	+Czab8OSOM+jjnSWsLluELXDU99ql902itcH6/JxbhtHVgN1Biuovr+Bxtt81h8m/5oRvBcE
428	K/brwcVvueepJi0ahvAys3rQoiyMWkzMdDBqMeAUsS+HzAUgJhShfUxazMSqMCsYJuGI6VOw
429	8HNVohDK3ky0jhjjhq6oqayHkKW1K0K2aM1FZy/TmWYyUwagDjHKsfsSsV5pHjxtPXHXdrii
430	qngvptMWSLIMEd8saotgwVoxzXA5rKfCBEdEpkUn6BWHxMDUyk8zzajkoKvhQzJHEyxa9ZIR
431	sdkFLHyxJNdhaOgTE96KRM02WJhcR0oZK7B0TbZJypQ22yqZSTpBg5y31vwvgBVUPiE7gs7S
432	CTLpOSmfo/2WsgFYzCX0p8Ca9wSZqMR3w5Ng6druHliephy5PwCW44URXae8tLOKEYhkvFt1
433	1hSnuB8siJv0wYKIA6Wx1Av33m7IX7ApjQyZjXmnhQ7FEepuGFhqPu7/Umdd9sZeuTfP+ZEF
434	+9fupDB+G1bRQ/ntFM2tMrD5rKfBEkkm94/sI7H76HWq+6lkTn/tiJO3ihK6hBRix8Z9ras6
435	ek1Wjn8ff74azL4CWA2sUKC6Z6nN1LeSrBZWrApjESyb6SD96ZeWrNX9lJiDq4zS2+osFqU1
436	Uo0ythZfs2qokxQmOQx2QbKAoeBUtEEZ2lAngm/lkpXUpUVqgeSTYSS8T9HQq36XZLFazjQ/
437	sidZfbC0rccpcFH5R4Z/4FFLFvcRUWoSyXJOvamuiqS5UwcspaOSjnKKun5GslBGkvLyrrN0
438	FqysnRlr2hkyaZDev6CzfiNYB0Hsbm5lydrrSpabGakTjCRRvfcX7IVqc3W7nfUoWN/Nljxt
439	Z70/WNbvlbgl6xtZhz8F1otL1lIfys4oBZFd/GbFy8tbrhf9Pg3a+sN/LHxmg7XB2mBtsP4W
440	WEf41mLDI4sNsqYPhwf9DJtSqGkU90QlX39Nyarp9YbBsiZZ8pZrj0kWf3TeYg24pPh0JCt5
441	aPLR3KuvpjNFsBY4+I6saRXTorobctTRYh2jlHrzG6wN1gZrg7XB2mBtsDZYPwLWGz2+J8Xz
442	14NVBJvGdhnr1fcm67IuhCmY87Uwp/rY2fosnVRsOtQES5UbdeHDe3tXhAp2Pa+lR7vvV8Aq
443	V3MNFuk/zj5N2lDojjkaK19q3NTgOloxX2fNxzVYhr07PBkRtvAtiybiPuSkeobGusUt8FIc
444	mynCpL1WA6wxBZL0AuJCw2pcst4vE6039SIy0rgJ1rV3Ou+E6SyChf3VJFjuuWTxNa7Ampqs
445	uZMWRKGQsFyGtWRZU+dVwrcAFtYuXpCsFKxYppgvwzpfL0VCphs+Ahaa0vTmiUlGs7otaApW
446	vhs6y3tsLcNrYFkOVriHi+fqI6kRL1Q33kK0vfWvmfLZTVqv7oZzDaAAi93/2sUtBufqgdS4
447	Mx7JaZd33GMJXzLRf7zo7psH8N6+4QZrg7XB+muPDdYGa4O1wdpg/dLHfwIMAORIgm4Mk35I
448	AAAAAElFTkSuQmCC

450	               Figure 1: Code Page 437 Grapheme Reference

452	3.7. Charset Registration Template

454	     To: ietf-charsets@iana.org
455	     Subject: Registration of new charset oem437

457	     Charset name: oem437

459	     Charset aliases: None.

461	     Suitability for use in MIME text: Suitable.

463	     Published specification(s): This specification; [OEMCP437].

465	     ISO 10646 equivalency table:

467	       This table is taken from the IBM437 registration in [RFC1345],
468	       with modifications based on actual implementations of [OEMCP437],
469	       as discussed in this document. Character mnemonic symbols
470	       generally map to the Unicode code points listed in Section 3
471	       of [RFC1345], with the following exceptions. The symbol suffix
472	       $ (for example, HT$) means that the Unicode code point
473	       mapping is essentially correct, but an implementation might
474	       need to perform additional or special processing as discussed
475	       in this document, depending on the output environment.

477	       The symbol $$ means that this code point has special
478	       considerations as discussed in this document, so no
479	       single, definitive Unicode code point mapping can be given.
480	       Finally, three characters have no corresponding mnemonic
481	       symbols in Section 3 of [RFC1345], so symbols are defined here:

483	         $>   25ba   BLACK RIGHT-POINTING POINTER
484	         $<   25c4   BLACK LEFT-POINTING POINTER
485	         $B   21a8   UP DOWN ARROW WITH BASE

487	       NU$ 0u 0U cH- cD- cC cS BL$ BS$ HT$ LF$ Ml Fm CR$ M2 SU
488	       $> $< UD !*2 PI SE SR $B -! -v $$ EC$ -L <> UT Dt
489	       SP !  "  Nb DO %  &  '  (  )  *  +  ,  -  .  /
490	       0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
491	       At A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
492	       P  Q  R  S  T  U  V  W  X  Y  Z  <( // )> '> _
493	       '! a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
494	       p  q  r  s  t  u  v  w  x  y  z  (! !! !) '? Eh
495	       C, u: e' a> a: a! aa c, e> e: e! i: i> i! A: AA
496	       E' ae AE o> o: o! u> u! y: O: U: Ct Pd Ye Pt Fl
497	       a' i' o' u' n? N? -a -o ?I NI NO 12 14 !I << >>
498	       .S :S ?S vv vl vL Vl Dl dL VL VV LD UL Ul uL dl
499	       ur uh dh vr hh vh vR Vr UR DR UH DH VR HH VH uH
500	       Uh dH Dh Ur uR dR Dr Vh vH ul dr FB LB lB RB TB
501	       a* $$ G* $$ $$ s* $$ t* F* H* $$ $$ 00 $$ $$ (U
502	       =3 +- >= =< Iu Il -: ?2 Ob .M Sb RT nS 2S fS NS$

504	     Additional information:

506	       See this document for details on how to handle particular codes
507	       that correspond both to graphemes in the IBM PC ROM, and
508	       to control characters.

510	     Person & email address to contact for further information:

512	       Sean Leonard <dev+ietf@seantek.com>

514	     Intended usage: COMMON

516	4.  Example

518	   The following example is a RELEASE.NFO file as an e-mail attachment,
519	   with base64 encoding. Note that the character set is (correctly)
520	   assumed to be OEM Code Page 437.

522	MIME-Version: 1.0
523	Content-Type: text/nfo
524	Content-Disposition: attachment; filename="RELEASE.NFO"
525	Content-Transfer-Encoding: base64

527	TODO/PutInBase64EncodedContentHere==

529	5.  IANA Considerations

531	   IANA is asked to register the media type text/nfo in the Standards
532	   tree using the application provided in Section 2 of this document.

534	   IANA is asked to register the charset oem437 in the Character Sets
535	   registry using the application provided in Section 3 of this
536	   document.

538	6. Security Considerations

540	   It's just text; this format provides no facilities for
541	   confidentiality or integrity. The ANSI escape sequence "CSI 5 m"
542	   could, however, blink you to death. As only a subset of ANSI escape
543	   sequences MUST be interpreted, interpreting a greater range than the
544	   subset prescribed in this registration may introduce other security
545	   issues, such as transmitting operating system commands.

547	   Some code points in oem437 have been used ambiguously in practice, so
548	   implementations SHOULD NOT assume that the mapping between this
549	   charset and Unicode is bijective. When displayed, codes 00, 20, and
550	   FF MAY appear to be similar, i.e., as a blank space.

552	7. References

554	7.1. Normative References

556	   [ANSI.SYS] Microsoft Corporation, "ANSI.SYS", MSDN ID cc722862, 1994,
557	              <http://technet.microsoft.com/library/cc722862>.

559	   [OEMCP437] Microsoft Corporation, "OEM 437", MSDN ID cc305156, 2014,
560	              <http://msdn.microsoft.com/goglobal/cc305156>.

562	   [RFC1345]  Simonsen, K., "Character Mnemonics and Character Sets",
563	              RFC 1345, June 1992.

565	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
566	              Requirement Levels", BCP 14, RFC 2119, March 1997.

568	   [RFC5147]  Wilde, E. and M. Duerst, "URI Fragment Identifiers for the
569	              text/plain Media Type", RFC 5147, April 2008.

571	   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
572	              Specifications and Registration Procedures", BCP 13, RFC
573	              6838, January 2013.

575	   [UTF]      The Unicode Consortium, "The Unicode Standard, Version
576	              8.0.0", Chapter 3: "Conformance", The Unicode Consortium,
577	              August 2015.

579	7.2. Informative References

581	   [ISO2022]  International Organization for Standardization, "Character
582	              Code Structure and Extension Techniques, 6th edition", ISO
583	              Standard 2022, ECMA-35, December 1994.

585	   [ISO6429]  International Organization for Standardization,
586	              "Information Technology - Control Functions for Coded
587	              Character Sets, 3rd edition", ISO Standard 6429, December
588	              1992.

590	   [ISO8613]  International Organization for Standardization,
591	              "Information Technology - Open Document Architecture (ODA)
592	              and Interchange Format: Character Content Architectures",
593	              ISO Standard 8613-6, ITU-T T.416, March 1993.

595	   [REG196]   International Organization for Standardization,
596	              "International Register of Coded Character Sets: UTF-8
597	              without implementation level", Sec. 2.8.1, Reg. 196, April
598	              1996, <http://kikaku.itscj.ipsj.or.jp/ISO-IR/196.pdf>.

600	   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322,
601	              October 2008.

603	   [SAUCE]    O. "Tasmaniac" Reubens / ACiD, "SAUCE--Standard
604	              Architecture for Universal Comment Extensions", 00.5,
605	              November 2013, <http://www.acid.org/info/sauce/sauce.htm>.

607	Appendix A.  IBM Code Page 437 vs. UTF-8 Detection Algorithm

609	   In cases of ambiguity, the following algorithm SHOULD be used to
610	   detect UTF-8 encoded data in text/nfo content:

612	   If the octets EF BB BF are present at the beginning => UTF-8.

614	   Considering all octets in the content:

616	     If no octets are greater than 7F => oem437.
617	     If any octets are F5 - FF, C0, or C1 => oem437.
618	     If any UTF-8 encodings are "ill-formed" => oem437.
619	     If any UTF-8 encodings represent illegal code points
620	       (e.g., surrogate code points) => oem437.

622	     Ragged line tests:

624	       If display characters decoded with oem437
625	         result in identical line widths => oem437.
626	       If display characters decoded with UTF-8
627	         result in identical line widths => UTF-8.

629	   Finally:
630	   => UTF-8 or oem437; prefer oem437.
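
   The steps above can be sketched in Python as follows. This is one
   possible reading, not a normative implementation. The function names
   are hypothetical. Line width is approximated as the number of
   decoded characters per line, which is an assumption: escape
   sequences and control codes are not interpreted before measuring.
   Python's strict "utf-8" decoder is relied upon to reject ill-formed
   sequences and surrogate code points.

```python
BOM = b"\xef\xbb\xbf"


def _uniform_width(text: str) -> bool:
    """True if every line has the same number of characters."""
    lines = [line.rstrip("\r") for line in text.split("\n")]
    if lines and lines[-1] == "":
        lines.pop()                   # ignore a trailing newline
    return len({len(line) for line in lines}) == 1


def detect_charset(data: bytes) -> str:
    """Return "utf-8" or "oem437" for text/nfo content."""
    if data.startswith(BOM):
        return "utf-8"
    if all(octet <= 0x7F for octet in data):
        return "oem437"
    if any(octet >= 0xF5 or octet in (0xC0, 0xC1) for octet in data):
        return "oem437"
    try:
        as_utf8 = data.decode("utf-8")    # strict: rejects ill-formed input
    except UnicodeDecodeError:
        return "oem437"
    # Ragged line tests.
    if _uniform_width(data.decode("cp437")):
        return "oem437"
    if _uniform_width(as_utf8):
        return "utf-8"
    return "oem437"                   # still ambiguous: prefer oem437
```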

632	Author's Address

634	   Sean Leonard
635	   Penango, Inc.
636	   5900 Wilshire Boulevard
637	   21st Floor
638	   Los Angeles, CA  90036
639	   USA

641	   EMail: dev+ietf@seantek.com
642	   URI:   http://www.penango.com/