Parsing Example

idnits 2.17.1

draft-ietf-html-spec-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

          This Internet-Draft is submitted in full conformance with the
          provisions of BCP 78 and BCP 79.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

          Copyright (c) 2024 IETF Trust and the persons identified as the
          document authors. All rights reserved.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

          This document is subject to BCP 78 and the IETF Trust's Legal
          Provisions Relating to IETF Documents
          (https://trustee.ietf.org/license-info) in effect on the date of
          publication of this document. Please review these documents
          carefully, as they describe your rights and restrictions with respect
          to this document. Code Components extracted from this document must
          include Simplified BSD License text as described in Section 4.e of
          the Trust Legal Provisions and are provided without warranty as
          described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 3550 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** There are 31 instances of too long lines in the document, the longest one
     being 15 characters in excess of 72.

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 6, 1995) is 10583 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'IMEDIA' on line 3041 looks like a reference

  -- Missing reference section? 'MIME' on line 3025 looks like a reference

  -- Missing reference section? 'SGML' on line 3072 looks like a reference

  -- Missing reference section? 'IANA' on line 3045 looks like a reference

  -- Missing reference section? 'RELURL' on line 3032 looks like a reference

  -- Missing reference section? 'HTTP' on line 3018 looks like a reference

  -- Missing reference section? 'URL' on line 3012 looks like a reference

  -- Missing reference section? 'URI' on line 3005 looks like a reference

  -- Missing reference section? 'GOLD90' on line 3037 looks like a reference

  -- Missing reference section? 'SQ91' on line 3049 looks like a reference

  -- Missing reference section? 'US-ASCII' on line 3053 looks like a reference

  -- Missing reference section? 'ISO-8859-1' on line 3058 looks like a
     reference


     Summary: 8 errors (**), 0 flaws (~~), 3 warnings (==), 15 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	HTML Working Group                                         T. Berners-Lee
2	INTERNET-DRAFT                                                    MIT/W3C
3	                                 D. Connolly
4	Expires: In six months                                        May 6, 1995

6	                   Hypertext Markup Language - 2.0

8	                               CONTENTS

10	     1.  Introduction
11	     2.  HTML as an Application of SGML
12	     3.  HTML as an Internet Media Type
13	     4.  Document Structure Elements
14	     5.  Character Content
15	     6.  Data Elements
16	     7.  Character Format Elements
17	     8.  Hyperlink Elements
18	     9.  Block Structuring Elements
19	     10.  Form-based Input Elements
20	     11.  HTML Public Text
21	     12.  Glossary
22	     13.  Bibliography
23	     14.  Appendices
24	     15.  Acknowledgments

26	Status of this Memo

28	This document is an Internet-Draft. Internet-Drafts are working
29	documents of the Internet Engineering Task Force (IETF), its areas,
30	and its working groups. Note that other groups may also distribute
31	working documents as Internet-Drafts.

33	Internet-Drafts are draft documents valid for a maximum of six months
34	and may be updated, replaced, or obsoleted by other documents at any
35	time. It is inappropriate to use Internet-Drafts as reference material
36	or to cite them other than as ``work in progress.''

38	To learn the current status of any Internet-Draft, please check the
39	1id-abstracts.txt listing contained in the Internet-Drafts Shadow
40	Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
41	munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
42	ftp.isi.edu (US West Coast).

44	Distribution of this document is unlimited. Please send comments to
45	the HTML working group (HTML-WG) of the Internet Engineering Task
46	Force (IETF) at . Discussions of the group are
47	archived at .

49	                          ABSTRACT

51	     The Hypertext Markup Language (HTML) is a simple markup
52	     language used to create hypertext documents that are
53	     platform independent. HTML documents are SGML documents with
54	     generic semantics that are appropriate for representing
55	     information from a wide range of domains. HTML markup can
56	     represent hypertext news, mail, documentation, and
57	     hypermedia; menus of options; database query results; simple
58	     structured documents with in-lined graphics; and hypertext
59	     views of existing bodies of information.

61	     HTML has been in use by the World Wide Web (WWW) global
62	     information initiative since 1990. This specification
63	     roughly corresponds to the capabilities of HTML in common
64	     use prior to June 1994. HTML is an application of ISO
65	     Standard 8879:1986 Information Processing Text and Office
66	     Systems; Standard Generalized Markup Language (SGML).

68	     The `"text/html; version=2.0"' Internet Media Type (RFC
69	     1590) and MIME Content Type (RFC 1521) is defined by this
70	     specification.

72	1. Introduction

74	     The HyperText Markup Language (HTML) is a simple data format
75	     used to create hypertext documents that are portable from
76	     one platform to another. HTML documents are SGML documents
77	     with generic semantics that are appropriate for representing
78	     information from a wide range of domains.

80	1.1. Scope

82	     HTML has been in use by the World-Wide Web (WWW) global
83	     information initiative since 1990. This specification
84	     corresponds to the capabilities of HTML in common use prior
85	     to June 1994 and referred to as ``HTML 2.0''.

87	     HTML is an application of ISO Standard 8879:1986
88	     _Information Processing Text and Office Systems; Standard
89	     Generalized Markup Language_ (SGML). The HTML Document Type
90	     Definition (DTD) is a formal definition of the HTML syntax
91	     in terms of SGML.

93	     This specification also defines HTML as an Internet Media
94	     Type[IMEDIA] and MIME Content Type[MIME] called `text/html',
95	     or `text/html; version=2.0'. As such, it defines the
96	     semantics of the HTML syntax and how that syntax should be
97	     interpreted by user agents.

99	1.2. Conformance

101	     This specification governs the syntax of HTML documents and
102	     the behaviour of HTML user agents.

104	1.2.1. Documents

106	     A document is a conforming HTML document only if:

108	          * It is a conforming SGML document, and it conforms to
109	          the HTML DTD (see 11.1, "HTML DTD")
110	          * It conforms to the application conventions in this
111	          specification. For example, the value of the `HREF'
112	          attribute of the <A> element must conform to the URI
113	          syntax.
114	          * Its document character set includes ISO-8859-1 and
115	          agrees with ISO10646; that is, each code position
116	          listed in 14.1, "The ISO-8859-1 Coded Character Set" is
117	          included, and each code position in the document
118	          character set is mapped to the same character as
119	          ISO10646 designates for that code position.
120	          NOTE - The document character set is somewhat
121	          independent of the character encoding scheme used to
122	          represent a document. For example, the ISO-2022-JP
123	          character encoding scheme can be used for HTML
124	          documents, since its repertoire is a subset of the
125	          ISO10646 repertoire. The critical distinction is that
126	          numeric character references agree with ISO10646
127	          regardless of how the document is encoded.

129	          NOTE - There are a number of syntactic idioms that are
130	          not supported or are supported inconsistently in some
131	          historical user agent implementations. These idioms are
132	          called out in notes like this throughout this
133	          specification.

135	          HTML documents should not contain these idioms, at
136	          least until such time as support for them is widely
137	          deployed.

139	     The HTML DTD defines a standard HTML document type and
140	     several variations, based on feature test entities:

142	     HTML.Recommended
143	                    Certain features of the language are necessary for
144	                    compatibility with widespread usage, but they may
145	                    compromise the structural integrity of a document.
146	                    This feature test entity enables a more
147	                    prescriptive document type definition that
148	                    eliminates those features.

150	                    For example, in order to preserve the structure of
151	                    a document, an editing user agent may translate
152	                    HTML documents to the recommended subset, or it
153	                    may require that the documents be in the
154	                    recommended subset for import.

156	     HTML.Deprecated
157	                    Certain features of the language are necessary for
158	                    compatibility with earlier versions of the
159	                    specification, but they tend to be used and
160	                    implemented inconsistently, and their use is
161	                    deprecated. This feature test entity enables a
162	                    document type definition that eliminates these
163	                    features.

165	                    Documents generated by translation software or
166	                    editing software should not contain these idioms.

168	1.2.2. User Agents

170	     An HTML user agent conforms to this specification if:

172	          * It parses the characters of an HTML document into
173	          data characters and markup as per [SGML].
174	          * It supports the ISO-8859-1 character encoding scheme,
175	          and processes each character in the ISO Latin Alphabet
176	          Nr. 1 as specified in 5.1, "The ISO Latin 1 Character
177	          Repertoire".
178	          NOTE - To support non-western writing systems, HTML
179	          user agents should support the Unicode-1-1-UTF-8 and
180	          Unicode-1-1-UCS-2 encodings and as much of the
181	          character repertoire of ISO10646 as is possible as
182	          well.
183	          * It behaves identically for documents whose parsed
184	          token sequences are identical.
185	          For example, comments and the whitespace in tags
186	          disappear during tokenization, and hence they do not
187	          influence the behaviour of conforming user agents.
188	          * It allows the user to traverse (or at least attempt
189	          to traverse, resources permitting) all hyperlinks in an
190	          HTML document.
191	          * It allows the user to express all form field values
192	          specified in an HTML document and to (attempt to)
193	          submit the values as requests to information services.

195	          NOTE - In the interest of robustness and extensibility,
196	          there are a number of widely deployed conventions for
197	          handling non-conforming documents. See 3.2.1,
198	          "Undeclared Markup Error Handling" for details.

200	2. HTML as an Application of SGML

202	     HTML is an application of ISO Standard 8879:1986 - Standard
203	     Generalized Markup Language (SGML). SGML is a system for
204	     defining structured document types and markup languages to
205	     represent instances of those document types[SGML]. The
206	     public text -- DTD and SGML declaration -- of the HTML
207	     document type definition are provided in 11, "HTML Public
208	     Text".

210	     The term _HTML_ refers to both the document type defined
211	     here and the markup language for representing instances of
212	     this document type.

214	2.1. SGML Documents

216	     An HTML document is an SGML document; that is, a sequence of
217	     characters organized physically into a set of entities, and
218	     logically as a hierarchy of elements.

220	     The first production of the SGML grammar separates an SGML
221	     document into three parts: an SGML declaration, a prologue,
222	     and an instance. For the purposes of this specification, the
223	     prologue is a DTD. This DTD describes another grammar: the
224	     start symbol is given in the doctype declaration; the
225	     terminals are data characters and tags, and the productions
226	     are determined by the element declarations. The instance
227	     must conform to the DTD, that is, it must be in the language
228	     defined by this grammar.

230	     The SGML declaration determines the lexicon of the grammar.
231	     It specifies the document character set, which determines a
232	     character repertoire that contains all characters that occur
233	     in all text entities in the document, and the code positions
234	     associated with those characters.

236	     The SGML declaration also specifies the syntax-reference
237	     character set of the document, and a few other parameters
238	     that bind the abstract syntax of SGML to a concrete syntax.
239	     This concrete syntax determines how the sequence of
240	     characters of the document is mapped to a sequence of
241	     terminals in the grammar of the prologue.

243	     For example, consider the following document:

245	     <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
246	     <TITLE>Parsing Example</TITLE>
247	     <P>Some text. <EM>&#42;wow&#42;</EM>

249	     An HTML user agent should use the SGML declaration that is
250	     given in 11.2, "SGML Declaration for HTML". According to the
251	     document character set there, `&#42;' refers to an asterisk
252	     character.

254	     The instance above is regarded as the following sequence of
255	     terminals:

257	          1. TITLE start-tag
258	          2. data characters: ``Parsing Example''
259	          3. TITLE end-tag
260	          4. P start-tag
261	          5. data characters ``Some text. ''
262	          6. EM start-tag
263	          7. ``*wow*''
264	          8. EM end-tag
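
     The mapping from characters to terminals can be approximated with
     a short sketch. It assumes Python's standard `html.parser' module
     as a stand-in for a real SGML parser, and it assumes the example
     instance marks up the title, paragraph, and emphasis with TITLE, P,
     and EM tags and writes each asterisk as `&#42;'. The names
     `TerminalLister' and `list_terminals' are hypothetical.

```python
from html.parser import HTMLParser


class TerminalLister(HTMLParser):
    """Collect start-tags, end-tags, and data as a flat list of terminals."""

    def __init__(self):
        # convert_charrefs=True resolves references such as &#42; to data.
        super().__init__(convert_charrefs=True)
        self.terminals = []

    def handle_starttag(self, tag, attrs):
        self.terminals.append(f"{tag.upper()} start-tag")

    def handle_endtag(self, tag):
        self.terminals.append(f"{tag.upper()} end-tag")

    def handle_data(self, data):
        # Skip the record boundaries between tags; keep real text.
        if data.strip():
            self.terminals.append(f"data characters: {data!r}")


def list_terminals(document):
    lister = TerminalLister()
    lister.feed(document)
    lister.close()
    return lister.terminals


if __name__ == "__main__":
    instance = (
        "<TITLE>Parsing Example</TITLE>\n"
        "<P>Some text. <EM>&#42;wow&#42;</EM>"
    )
    for number, terminal in enumerate(list_terminals(instance), start=1):
        print(f"{number}. {terminal}")
```

     This is not a validating parser: it does not consult the DTD, so
     it cannot infer the HEAD and BODY elements shown in the parse tree.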

266	     The start symbol of the DTD grammar is HTML, and the
267	     productions are given in the public text identified by
268	     `-//IETF//DTD HTML 2.0//EN' (11.1, "HTML DTD"). Hence the
269	     terminals above parse as:

271	        HTML
272	         |
273	         \-HEAD
274	         |  |
275	         |  \-TITLE
276	         |      |
277	         |      \-<TITLE>
278	         |      |
279	         |      \-"Parsing Example"
280	         |      |
281	         |      \-</TITLE>
282	         |
283	         \-BODY
284	           |
285	           \-P
286	             |
287	             \-<P>
288	             |
289	             \-"Some text. "
290	             |
291	             \-EM
292	             |  |
293	             |  \-<EM>
294	             |  |
295	             |  \-"*wow*"
296	             |  |
297	             |  \-</EM>
298	             |
299	             \-</P>

301	2.2. HTML Lexical Syntax

303	     SGML specifies an abstract syntax and a reference concrete
304	     syntax. Aside from certain quantities and capacities (e.g.
305	     the limit on the length of a name), all HTML documents use
306	     the reference concrete syntax. In particular, all markup
307	     characters are in the ISO-646-IRV character repertoire. Data
308	     characters are drawn from the document character set (see 5,
309	     "Character Content").

311	     A complete discussion of SGML parsing, i.e. the mapping of a
312	     sequence of characters to a sequence of tags and data, is
313	     left to the SGML standard[SGML]. This section is only a
314	     summary.

316	2.2.1. Data Characters

318	     Any sequence of characters that do not constitute markup
319	     (see 9.6 ``Delimiter Recognition'' of [SGML]) are mapped
320	     directly to strings of data characters. Some markup also
321	     maps to data character strings. Numeric character references
322	     also map to single-character strings, via the document
323	     character set. Each reference to one of the general entities
324	     defined in the HTML DTD also maps to a single-character
325	     string.

327	     For example,

329	     abc&lt;def    => "abc","<","def"
330	     abc&#60;def   => "abc","<","def"

332	     Note that the terminating semicolon is only necessary when
333	     the character following the reference would otherwise be
334	     recognized as markup:

336	     abc &lt def     => "abc ","<"," def"
337	     abc &#60 def    => "abc ","<"," def"

339	     And note that an ampersand is only recognized as markup when
340	     it is followed by a letter or digit:

342	     abc & lt def    => "abc & lt def"
343	     abc & 60 def    => "abc & 60 def"
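
     The reference rules above can be sketched as a small resolver.
     This is a minimal illustration, not the SGML algorithm: it assumes
     only the four entities discussed in this section, treats the
     semicolon as optional, and leaves unknown names untouched. The
     names `ENTITIES', `REFERENCE', and `resolve_references' are
     hypothetical.

```python
import re

# Only the entities discussed in this section; the full set is in the DTD.
ENTITIES = {"lt": "<", "gt": ">", "amp": "&", "quot": '"'}

# An ampersand starts a reference only when a name or "#digits" follows it
# directly. The terminating semicolon is optional.
REFERENCE = re.compile(r"&(?:#(\d+)|([A-Za-z][A-Za-z0-9.-]*));?")


def resolve_references(text):
    """Replace entity and numeric character references with data characters."""

    def replace(match):
        number, name = match.group(1), match.group(2)
        if number is not None:
            # The document character set agrees with ISO10646, so the code
            # position can be used directly.
            return chr(int(number))
        # Names that are not declared here are left as data characters.
        return ENTITIES.get(name, match.group(0))

    return REFERENCE.sub(replace, text)


if __name__ == "__main__":
    for sample in ("abc&lt;def", "abc&#60;def", "abc &lt def", "abc & lt def"):
        print(repr(sample), "=>", repr(resolve_references(sample)))
```

     An ampersand followed by a space never matches the pattern, so
     `abc & lt def' passes through unchanged.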

345	     A useful technique for translating plain text to HTML is to
346	     replace each '<', '&', and '>' by an entity reference or
347	     numeric character reference as follows:

349	                      ENTITY      NUMERIC
350	            CHARACTER REFERENCE   CHAR REF     CHARACTER DESCRIPTION
351	              &       &amp;       &#38;        Ampersand
352	              <       &lt;        &#60;        Less than
353	              >       &gt;        &#62;        Greater than
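
     The replacement technique above can be sketched in a few lines.
     The names `TEXT_ESCAPES' and `escape_text' are hypothetical; the
     sketch uses the entity references from the table rather than the
     numeric forms.

```python
# Entity references for the three characters that could be read as markup.
TEXT_ESCAPES = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def escape_text(text):
    """Translate plain text so that it contains no markup characters."""
    # Each character is looked up once, so an "&" introduced by one
    # replacement is never escaped a second time.
    return "".join(TEXT_ESCAPES.get(character, character) for character in text)


if __name__ == "__main__":
    print(escape_text("if a < b && b > c"))
```

     Python's standard library offers `html.escape' for the same job;
     the explicit table is used here only to mirror the text.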

355	          NOTE - There are SGML mechanisms, CDATA and RCDATA, to
356	          allow most `<', `>', and `&' characters to be entered
357	          without the use of entity references. Because these
358	          features tend to be used and implemented
359	          inconsistently, and because they conflict with
360	          techniques for reducing HTML to 7 bit ASCII for
361	          transport, they are not used in this version of the
362	          HTML DTD.

364	2.2.2. Tags

366	     Tags delimit elements such as headings, paragraphs, lists,
367	     character highlighting and links. Most HTML elements are
368	     identified in a document as a start-tag, which gives the
369	     element name and attributes, followed by the content,
370	     followed by the end tag. Start-tags are delimited by `<' and
371	     `>'; end tags are delimited by `</' and `>'. An example is:

373	     <H1>This is a Heading</H1>

375	     Some elements only have a start-tag without an end-tag. For
376	     example, to create a line break, you use the `<BR>' tag.
377	     Additionally, the end tags of some other elements, such as
378	     Paragraph (`</P>'), List Item (`</LI>'), Definition Term
379	     (`</DT>'), and Definition Description (`</DD>') elements, may
380	     be omitted.

382	     The content of an element is a sequence of data character
383	     strings and nested elements. Some elements, such as anchors,
384	     cannot be nested. Anchors and character highlighting may be
385	     put inside other constructs. See the HTML DTD, 11.1, "HTML
386	     DTD" for full details.

388	          NOTE - The SGML declaration for HTML specifies SHORTTAG
389	          YES, which means that there are other valid syntaxes
390	          for tags, such as NET tags, `'; and empty end-tags, `'. Until support
392	          for these idioms is widely deployed, their use is
393	          strongly discouraged.

395	2.2.3. Names

397	     A name consists of a letter followed by up to 71 letters,
398	     digits, periods, or hyphens. Element names are not case
399	     sensitive, but entity names are. For example,
400	     `', `', and `' are
401	     equivalent, whereas `&amp;' is different from `&AMP;'.

403	     In a start-tag, the element name must immediately follow the
404	     tag open delimiter `<'.
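
     The name rules above can be checked mechanically. The following is
     a minimal sketch; the names `NAME', `is_name', `same_element_name',
     and `same_entity_name' are hypothetical, and letters are assumed to
     be the unaccented Latin letters A-Z and a-z.

```python
import re

# A letter followed by up to 71 letters, digits, periods, or hyphens.
NAME = re.compile(r"[A-Za-z][A-Za-z0-9.-]{0,71}")


def is_name(candidate):
    """Return True if the string satisfies the name syntax."""
    return NAME.fullmatch(candidate) is not None


def same_element_name(left, right):
    """Element names are not case sensitive."""
    return left.upper() == right.upper()


def same_entity_name(left, right):
    """Entity names are case sensitive."""
    return left == right


if __name__ == "__main__":
    print(is_name("H1"), is_name("1H"))
    print(same_element_name("Title", "TITLE"), same_entity_name("amp", "AMP"))
```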

406	2.2.4. Attributes

408	     In a start-tag, white space and attributes are allowed
409	     between the element name and the closing delimiter. An
410	     attribute typically consists of an attribute name, an equal
411	     sign, and a value, though some attributes may be just a
412	     value. White space is allowed around the equal sign.

414	     The value of the attribute may be either:

416	          * A string literal, delimited by single quotes or
417	          double quotes and not containing any occurrences of the
418	          delimiting character.
419	          * A name token (a sequence of letters, digits, periods,
420	          or hyphens)

422	     In this example, img is the element name, `src' is the
423	     attribute name, and `http://host/dir/file.gif' is the
424	     attribute value:

426	     <img src='http://host/dir/file.gif'>

428	          NOTE - Some historical implementations consider any
429	          occurrence of the `>' character to signal the end of a
430	          tag. For compatibility with such implementations, when
431	          `>' appears in an attribute value, it should be
432	          represented with a numeric character reference, such as
433	          in: `'.

435	     A useful technique for computing an attribute value literal
436	     for a given string is to replace each quote and space
437	     character by an entity reference or numeric character
438	     reference as follows:

440	                      ENTITY      NUMERIC
441	            CHARACTER REFERENCE   CHAR REF     CHARACTER DESCRIPTION
442	              TAB                 &#9;         Tab
443	              LF                  &#10;        Line Feed
444	              CR                  &#13;        Carriage Return
445	                                  &#32;        Space
446	              "       &quot;      &#34;        Quotation mark
447	              &       &amp;       &#38;        Ampersand
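
     A sketch of this technique follows. It assumes the literal is
     delimited by double quotes, uses the entity reference where the
     table gives one and the numeric character reference otherwise, and
     uses the hypothetical names `ATTRIBUTE_ESCAPES' and
     `attribute_value_literal'.

```python
# Replacements for quote, ampersand, and space characters in a literal.
ATTRIBUTE_ESCAPES = {
    "\t": "&#9;",
    "\n": "&#10;",
    "\r": "&#13;",
    " ": "&#32;",
    '"': "&quot;",
    "&": "&amp;",
}


def attribute_value_literal(value):
    """Return a double-quoted attribute value literal for the given string."""
    escaped = "".join(
        ATTRIBUTE_ESCAPES.get(character, character) for character in value
    )
    return f'"{escaped}"'


if __name__ == "__main__":
    print(attribute_value_literal('First "real" example'))
```

     For user agents that end a tag at the first `>', the same table
     could be extended with an entry mapping `>' to `&#62;'.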

449	     For example:

451	     

453	          NOTE - Some historical implementations allow any
454	          character except space or `>' in a name token.
455	          Attribute values must be quoted only if they don't
456	          satisfy the syntax for a name token.

458	     Note that the SGML declaration in section 13.3 limits the
459	     length of an attribute value to 1024 characters.

461	     Attributes such as ISMAP and COMPACT may be written using a
462	     minimized syntax. The markup:

464	     

466	     can be written using a minimized syntax:

468	     

470	          NOTE - Some historical implementations only understand
471	          the minimized syntax.

473	2.2.5. Comments

475	     To include comments in an HTML document that will be
476	     eliminated in the mapping to terminals, surround them with
477	     `<!--' and `-->'. After the comment delimiter, all text up
478	     to the next occurrence of `-->' is ignored. Hence comments
479	     cannot be nested. White space is allowed between the closing
480	     `--' and `>', but not between the opening `<!' and `--'.
485	     HTML Guide: Recommended Usage
486	     
487	     

489	          NOTE - Some historical HTML implementations incorrectly
490	          consider any `>' character to be the termination of a
491	          comment.

493	2.2.6. Example HTML Document

495	     
496	     
497	     
498	     
499	     Structural Example
500	     
501	     First Header
502	     This is a paragraph in the example HTML file. Keep in mind
503	     that the title does not appear in the document text, but that
504	     the header (defined by H1) does.
505	     
506	     First item in an ordered list.
507	     
Second item in an ordered list.
508	       
509	        Note that lists can be nested;
510	       
 Whitespace may be used to assist in reading the
511	            HTML source.
512	       
513	     
Third item in an ordered list.
514	     
515	     This is an additional paragraph. Technically, end tags are
516	     not required for paragraphs, although they are allowed. You can
517	     include character highlighting in a paragraph. This sentence
518	     of the paragraph is emphasized. Note that the </P>
519	     end tag has been omitted.
520	     

521	     
522	     Be sure to read these bold instructions.
523	     

525	3. HTML as an Internet Media Type

527	     An HTML user agent allows users to interact with resources
528	     which have HTML representations. At a minimum, it must allow
529	     users to examine and navigate the content of HTML documents.
530	     HTML user agents should be able to preserve all formatting
531	     distinctions represented in an HTML document, and be able to
532	     simultaneously present resources referred to by IMG
533	     elements. (They may ignore some formatting distinctions or
534	     IMG resources at the request of the user.) Conforming HTML
535	     user agents should support form entry and submission.

537	3.1. text/html media type

539	     This specification defines the Internet Media Type[IMEDIA]
540	     (formerly referred to as the Content Type[MIME]) called
541	     `text/html'. The following is to be registered with [IANA].

543	     Media Type name
544	                    text

546	     Media subtype
547	     name
548	                    html

550	     Required
551	     parameters
552	                    none

554	     Optional
555	     parameters
556	                    version, charset

558	     Encoding
559	     considerations
560	                    any encoding is allowed

562	     Security
563	     considerations
564	                    see 3.3, "Security Considerations"

566	     The optional parameters are defined as follows:

568	     Version
569	                    To help avoid future compatibility problems, the
570	                    version parameter may be used to give the version
571	                    number of the specification to which the document
572	                    conforms. The version number appears at the front
573	                    of this document and within the public identifier
574	                    of the HTML DTD. This specification defines
575	                    version 2.0. There is no default.

577	     Charset
578	                    The charset parameter (as defined in section 7.1.1
579	                    of RFC 1521[MIME]) may be given to specify the
580	                    character encoding scheme used to represent the
581	                    HTML document as a sequence of octets. The default
582	                    value is outside the scope of this specification;
583	                    but for example, the default is US-ASCII in the
584	                    context of MIME mail, and ISO-8859-1 in the
585	                    context of HTTP.
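
     The parameters can be read from a Content-Type field with a short
     sketch. It assumes Python's standard `email.message' module; the
     name `describe_media_type' and the `default_charset' argument are
     hypothetical. The caller supplies the default because this
     specification leaves it to the surrounding context.

```python
from email.message import Message


def describe_media_type(header_value, default_charset):
    """Split a Content-Type value into its type, version, and charset."""
    message = Message()
    message["Content-Type"] = header_value
    return {
        "type": message.get_content_type(),
        # There is no default version, so a missing parameter stays None.
        "version": message.get_param("version"),
        # get_content_charset() lower-cases the value it finds.
        "charset": message.get_content_charset() or default_charset,
    }


if __name__ == "__main__":
    # ISO-8859-1 is the default in the context of HTTP.
    print(describe_media_type("text/html; version=2.0", "iso-8859-1"))
    # US-ASCII is the default in the context of MIME mail.
    print(describe_media_type("text/html", "us-ascii"))
```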

587	3.2. HTML Document Representation

589	     A message entity with a content type of `text/html'
590	     represents an HTML document, consisting of a single text
591	     entity. The `charset' parameter (whether implicit or
592	     explicit) identifies a character encoding scheme. The text
593	     entity consists of the characters determined by this
594	     character encoding scheme and the octets of the body of the
595	     message entity.

597	3.2.1. Undeclared Markup Error Handling

599	     To facilitate experimentation and interoperability between
600	     implementations of various versions of HTML, the installed
601	     base of HTML user agents supports a superset of the HTML 2.0
602	     language by reducing it to HTML 2.0: markup in the form of a
603	     start-tag or end-tag whose generic identifier is not
604	     declared is mapped to nothing during tokenization.
605	     Undeclared attributes are treated similarly. The entire
606	     attribute specification of an unknown attribute (i.e., the
607	     unknown attribute and its value, if any) should be ignored.
608	     On the other hand, references to undeclared entities should
609	     be treated as data characters.

611	     For example:

613	     
foo
...
614	       => ,"foo",
,,"..."
615	     xxx 
 yyy
616	       => "xxx ",
," yyy
617	     Let &alpha; and &beta; be finite sets.
618	       => "Let &alpha; and &beta; be finite sets."

620	     Support for notifying the user of such errors is encouraged.

622	     Information providers are warned that this convention is not
623	     binding: unspecified behavior may result, as such markup is
624	     not conforming to this specification.

626	3.2.2. Conventional Representation of Newlines

628	     SGML specifies that a text entity is a sequence of records,
629	     each beginning with a record start character and ending with
630	     a record end character (code positions 10 and 13
631	     respectively). (section 7.6.1, ``Record Boundaries'' in
632	     [SGML])

634	     [MIME] specifies that a body of type `text/*' is a sequence
635	     of lines, each terminated by CRLF, that is, octets 13, 10.

637	     In practice, HTML documents are frequently represented and
638	     transmitted using an end of line convention that depends on
639	     the conventions of the source of the document; frequently,
640	     that representation consists of CR only, LF only, or a CR LF
641	     combination. Hence the decoding of the octets will often
642	     result in a text entity with some missing record start and
643	     record end characters.

645	     Since there is no ambiguity, HTML user agents are encouraged
646	     to infer the missing record start and end characters.

648	     An HTML user agent should treat end of line in any of its
649	     variations as a word space in all contexts except
650	     preformatted text. Within preformatted text, an HTML user
651	     agent should expect to treat any of the three common
652	     representations of end-of-line as starting a new line.
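
     A minimal sketch of this treatment follows. The function names
     are illustrative assumptions, not defined by this specification.

```python
import re

# CR LF must be tried first so that it counts as one end of line.
_END_OF_LINE = re.compile(r"\r\n|\r|\n")


def split_records(text):
    """Within preformatted text: each end of line starts a new line."""
    return _END_OF_LINE.split(text)


def as_word_space(text):
    """Elsewhere: each end of line is treated as a word space."""
    return _END_OF_LINE.sub(" ", text)
```

     Both functions accept CR only, LF only, and the CR LF combination,
     so a document that mixes conventions is handled uniformly.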

654	3.3. Security Considerations

656	     Anchors, embedded images, and all other elements which
657	     contain URIs as parameters may cause the URI to be
658	     dereferenced in response to user input. In this case, the
659	     security considerations of the URI specification apply.

661	     The widely deployed methods for submitting form requests --
662	     HTTP and SMTP -- provide little assurance of
663	     confidentiality. Information providers who request sensitive
664	     information via forms -- especially by way of the `PASSWORD'
665	     type input field -- should be aware and make their users
666	     aware of the lack of confidentiality.

670	4. Document Structure Elements

672	     To identify information as an HTML document conforming to
673	     this specification, each document should start with the
674	     prologue:

676	     

678	          NOTE - If the body of a text/html body part does not
679	          begin with a document type declaration, an HTML user
680	          agent should infer the above document type declaration.

682	     HTML user agents are required to support only the above
683	     document type declaration and the following document type
684	     declarations.

686	     
687	     

689	     In particular, they may or may not support other formal
690	     public identifiers, or other document types altogether. They may support
691	     an internal declaration subset with supplemental entity,
692	     element, and other markup declarations, or they may not.

694	4.1. HTML Document Element

696	     <HTML> ... </HTML> Level 0

698	     The HTML document element is organized as a head and a body,
699	     much like a memo or a mail message. Within the head, you can
700	     specify the title and other information about the document.
701	     Within the body, you can structure text into paragraphs and
702	     lists, as well as highlight phrases and create links, using
703	     HTML elements.

705	          NOTE - The start and end tags for HTML, Head, and Body
706	          elements are omissible; however, this is not
707	          recommended since the head/body structure allows an
708	          implementation to determine certain properties of a
709	          document, such as the title, without parsing the entire
710	          document.

714	4.2. Head

716	     <HEAD> ... </HEAD> Level 0

718	     The head of an HTML document is an unordered collection of
719	     information about the document. The Title element is
720	     required.

722	     <HEAD>
723	     <TITLE>Introduction to HTML</TITLE>
724	     </HEAD>

726	4.3. Body

728	     <BODY> ... </BODY> Level 0

730	     The Body element identifies the body component of an HTML
731	     document. Specifically, the body of a document may contain
732	     links, text, and formatting information within <BODY> and
733	     </BODY> tags.

735	4.4. Title

737	     <TITLE> ... </TITLE> Level 0

739	     Every HTML document must contain a Title element. The title
740	     should identify the contents of the document in a global
741	     context, and may be used in history lists and as a label for
742	     the window displaying the document. Unlike headings, titles
743	     are not rendered in the text of a document itself.

745	     The Title element must occur within the head of the
746	     document, and must not contain anchors, paragraph tags, or
747	     highlighting. Only one title is allowed in a document.

749	          NOTE - The length of a title is not limited; however,
750	          long titles may be truncated in some applications. To
751	          minimize this possibility, titles should be fewer than
752	          64 characters. Also keep in mind that a short title,
753	          such as Introduction, may be meaningless out of
754	          context. An example of a meaningful title might be
755	          ``Introduction to HTML Elements.''

757	4.5. Base

759	     <BASE> Level 0

761	     The Base element allows the URI of the document itself to be
762	     recorded in situations in which the document may be read out
763	     of context. URIs within the document may be in a ``partial''
764	     form relative to this base address[RELURL].

766	     The Base element has one attribute, HREF, which identifies
767	     the absolute base URI.
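
     As an illustration, a partial URI can be combined with the base
     address as follows. The sketch relies on Python's
     urllib.parse.urljoin, which follows a later URI specification
     than [RELURL]; the addresses in the usage note are hypothetical.

```python
from urllib.parse import urljoin


def resolve(base_href, partial_uri):
    """Combine a partial URI with the absolute base URI given by HREF."""
    return urljoin(base_href, partial_uri)
```

     For example, resolving "chapter2.html" against the base
     "http://example.org/docs/guide.html" yields
     "http://example.org/docs/chapter2.html".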

769	4.6. Isindex

771	     <ISINDEX> Level 0

773	     The Isindex element tells the interpreter that the document
774	     is an index. This means that the reader may request a
775	     keyword search on the resource by adding a question mark to
776	     the end of the document address, followed by a list of
777	     keywords separated by plus signs.
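
     The construction of such a search address might be sketched as
     below. Percent-encoding of the individual keywords is an
     assumption of this sketch (it keeps a literal plus sign inside a
     keyword from being read as a separator); the function name and
     addresses are illustrative.

```python
from urllib.parse import quote


def isindex_query(document_uri, keywords):
    """Append "?" and the plus-separated keywords to the document address."""
    encoded = "+".join(quote(word, safe="") for word in keywords)
    return f"{document_uri}?{encoded}"
```

     For the keywords "html" and "forms" and the hypothetical address
     "http://example.org/index", the result is
     "http://example.org/index?html+forms".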

779	     The Isindex element is usually generated by the network
780	     server from which the document was obtained via a URI. The
781	     server must have a search engine that supports this feature
782	     for the resource. If the document URI is unknown to the
783	     interpreter, the Isindex element must be ignored.

785	4.7. Link

787	     <LINK> Level 0

789	     The Link element indicates a relationship between the
790	     document and some other object. A document may have any
791	     number of Link elements.

793	     The Link element is empty (does not have a closing tag), but
794	     takes the same attributes as the Anchor element.

796	     Typical uses are to indicate authorship, related indexes and
797	     glossaries, older or more recent versions, etc. Links can
798	     indicate a static tree structure in which the document was
799	     authored by pointing to a ``parent'' and ``next'' and
800	     ``previous'' document, for example.

802	     Servers may also allow links to be added by those who do not
803	     have the right to alter the body of a document.

805	4.8. Meta

807	     <META> Level 0

809	     The META element is used within the HEAD element to embed
810	     document metainformation not defined by other HTML elements.
811	     META elements can be extracted by servers and/or clients for
812	     use in identifying, indexing, and cataloging specialized
813	     document metainformation.

815	     Although it is generally preferable to use named elements
816	     which have well-defined semantics for each type of
817	     metainformation (e.g. TITLE), the META element is provided
818	     for situations where strict SGML parsing is necessary and
819	     the local DTD is not extensible. HTML interpreters may use
820	     the META element's content if they recognize and understand
821	     the semantics identified by the NAME or HTTP-EQUIV
822	     attributes, and may treat the content as metainformation
823	     (and not render it) even when they do not recognize the
824	     name.

826	     In addition, HTTP servers may wish to read the content of
827	     the document HEAD to generate header fields corresponding to
828	     any elements defining a value for the attribute HTTP-EQUIV.
829	     Note, however, that the method by which the server extracts
830	     document metainformation is not part of this specification,
831	     nor can it be assumed by authors that any given server will
832	     be capable of extracting it. The META element only provides
833	     an extensible mechanism for identifying and embedding
834	     document metainformation - how it may be used is up to the
835	     individual server implementation and the HTML interpreter.

837	     Attributes of the META element:

839	     HTTP-EQUIV
840	                    This attribute binds the element to an HTTP header
841	                    field. It means that if you know the semantics of
842	                    the HTTP header field named by this attribute,
843	                    then you can process the contents based on a
844	                    well-defined syntactic mapping, whether or not
845	                    your DTD tells you anything about it. HTTP header
846	                    field names are not case sensitive. If not
847	                    present, the attribute NAME should be used to
848	                    identify this metainformation and the content
849	                    should not be used within an HTTP response header.

851	     NAME
852	                    Metainformation name. If the NAME attribute is not
853	                    present, the name can be assumed to be equal to
854	                    the value of HTTP-EQUIV.

856	     CONTENT
857	                    The metainformation content to be associated with
858	                    the given name. If multiple META elements are
859	                    provided with the same name, their combined
860	                    contents-concatenated as a comma-separated list-is
861	                    the value associated with that name.

863	     Examples

865	     If the document contains:

867	     
869	     
870	     

873	     then the server (if so configured) may include the following
874	     headers:

876	     Expires: Tue, 04 Dec 1993 21:29:02 GMT
877	     Keywords: Fred, Barney
878	     Reply-to: fielding@ics.uci.edu (Roy Fielding)

880	     as part of the HTTP response to a GET or HEAD request for
881	     that document.

883	     When the HTTP-EQUIV attribute is not present, the server
884	     should not generate an HTTP response header for the
885	     metainformation; e.g.,

887	     

889	     would never generate an HTTP response header, but would
890	     still allow HTML interpreters to identify and make use of
891	     that metainformation.

893	     The Meta element should never be used to define information
894	     that should be associated with an existing HTML element. An
895	     example of an inappropriate use of the Meta element is:

897	     

900	     Do not name an HTTP-EQUIV equal to a response header that
901	     should normally only be generated by the HTTP server.
902	     Example names that are inappropriate include ``Server'',
903	     ``Date'', and ``Last-modified'' - the exact list of
904	     inappropriate names is dependent on the particular server
905	     implementation. We recommend that servers ignore any META
906	     elements which specify HTTP-equivalents which are equal
907	     (case-insensitively) to their own reserved response headers.
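
     One way a server might derive header fields from META elements
     is sketched below. It is an illustration only: the extraction
     method is outside this specification, the reserved-name list is
     an assumed example, and Python's html.parser is used in place of
     an SGML parser.

```python
from html.parser import HTMLParser

# Assumption: an illustrative list of header names reserved to the server.
RESERVED_HEADERS = {"server", "date", "last-modified"}


class MetaCollector(HTMLParser):
    """Collect CONTENT values of META elements that carry HTTP-EQUIV."""

    def __init__(self):
        super().__init__()
        # lower-cased header name -> (name as first written, contents)
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        content = attributes.get("content")
        if not name or content is None:
            return  # NAME-only META elements never produce a header
        key = name.lower()  # header field names are not case sensitive
        if key in RESERVED_HEADERS:
            return
        self.fields.setdefault(key, (name, []))[1].append(content)


def equivalent_headers(document):
    """Return header lines, joining repeated names as a comma-separated list."""
    collector = MetaCollector()
    collector.feed(document)
    collector.close()
    return [
        f"{name}: {', '.join(values)}"
        for name, values in collector.fields.values()
    ]
```

     Two META elements naming "Keywords" with the contents "Fred" and
     "Barney" produce the single line "Keywords: Fred, Barney", while
     an element naming "Server" is ignored.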

909	4.9. Nextid

911	     <NEXTID> Level 0

913	     The Nextid element is a parameter read and generated by text
914	     editing software to create unique identifiers. This tag
915	     takes a single attribute which is the next document-wide
916	     alphanumeric identifier to be allocated, of the form z123:

918	     

920	     When modifying a document, existing anchor identifiers
921	     should not be reused, as these identifiers may be referenced
922	     by other documents. Human writers of HTML usually use
923	     mnemonic alphabetical identifiers.

925	     HTML interpreters may ignore the Nextid element. Support for
926	     the Nextid element does not impact HTML interpreters in any
927	     way.
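
     How editing software might allocate identifiers from this
     parameter can be sketched as follows. The letters-then-digits
     pattern is an assumption drawn from the form z123; the function
     name is illustrative.

```python
import re


def allocate_identifier(nextid):
    """Return (identifier to use now, new value to record in Nextid)."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", nextid)
    if match is None:
        raise ValueError(f"unexpected identifier form: {nextid!r}")
    prefix, number = match.groups()
    return nextid, f"{prefix}{int(number) + 1}"
```

     Because the counter only moves forward, an identifier handed out
     earlier is never allocated a second time.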

929	5. Character Content

931	     An HTML user agent should present the body of an HTML
932	     document as a collection of typeset paragraphs and
933	     preformatted text. Except for the <PRE> element, each block
934	     structuring element is regarded as a paragraph by taking the
935	     data characters in its content and the content of its
936	     descendant elements, concatenating them, and splitting the
937	     result into words, separated by space, tab, or record end
938	     characters (and perhaps hyphen characters). The sequence of
939	     words is typeset as a paragraph by breaking it into lines.
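
     A minimal sketch of this procedure, assuming a greedy
     line-breaking strategy and a fixed line width in characters
     (both choices of this illustration; hyphen handling is omitted):

```python
def typeset(text, width=72):
    """Split text into words and break the word sequence into lines."""
    # str.split() with no argument separates on runs of whitespace,
    # which covers space, tab, and record end characters.
    words = text.split()
    lines = []
    current = ""
    for word in words:
        if current and len(current) + 1 + len(word) > width:
            lines.append(current)
            current = word
        elif current:
            current = f"{current} {word}"
        else:
            current = word
    if current:
        lines.append(current)
    return lines
```

     The original line boundaries of the source text play no part in
     the result; only the sequence of words does.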

941	5.1. The ISO Latin 1 Character Repertoire

943	     The minimum character repertoire supported by all conforming
944	     HTML user agents is Latin Alphabet Nr. 1, or simply Latin-1.
945	     Latin-1 includes characters from most Western European
946	     languages, as well as a number of control characters.
947	     Latin-1 also includes a non-breaking space, a soft hyphen
948	     indicator, 93 graphical characters, 8 unassigned characters,
949	     and 25 control characters.

951	          NOTE - Use of the non-breaking space and soft hyphen
952	          indicator characters is discouraged because support for
953	          them is not widely deployed.

955	     In SGML applications, the use of control characters is
956	     limited in order to maximize the chance of successful
957	     interchange over heterogeneous networks and operating
958	     systems. In HTML, only three control characters are allowed:
959	     Horizontal Tab (HT, encoded as 9 decimal in US-ASCII and
960	     ISO-8859-1), Carriage Return, and Line Feed.

962	     The HTML DTD references the Added Latin 1 entity set, to
963	     allow mnemonic representation of Latin 1 characters using
964	     only the widely supported ASCII character repertoire. For
965	     example:

967	     Kurt G&ouml;del was a famous logician and mathematician.

969	     See 11.4.2, "ISO Latin 1 Character Entity Set" for a table
970	     of the ``Added Latin 1'' entities, and 14.1, "The ISO-8859-1
971	     Coded Character Set" for a table of the code positions of
972	     ISO-8859-1.
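
     Expansion of such mnemonic references, for instance &ouml;, can
     be sketched as below. The sketch uses the entity table shipped
     with Python (html.entities.name2codepoint), which describes a
     later and larger entity set; restricting it to code positions up
     to 255 is an assumption made here to approximate Latin 1.

```python
import re
from html.entities import name2codepoint

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_latin1_entities(text):
    """Replace known Latin 1 entity references; leave all others as data."""
    def replace(match):
        codepoint = name2codepoint.get(match.group(1))
        if codepoint is None or codepoint > 255:
            return match.group(0)
        return chr(codepoint)

    return _ENTITY.sub(replace, text)
```

     References that are unknown, or that name a character outside
     Latin 1, are left in the text unchanged.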

974	6. Data Elements

976	6.1. Line Break

978	     <BR> Level 0

980	     The Line Break element specifies that a new line must be
981	     started at the given point. A new line indents the same as
982	     that of line-wrapped text.

984	     Example of use:

986	     <P> Pease porridge hot<BR>
987	     Pease porridge cold<BR>
988	     Pease porridge in the pot<BR>
989	     Nine days old.

991	6.2. Horizontal Rule

993	     <HR> Level 0

995	     A Horizontal Rule element is a divider between sections of
996	     text such as a full width horizontal rule or equivalent
997	     graphic.

999	     Example of use:

1001	     <HR>
1002	     <ADDRESS>February 8, 1995, CERN</ADDRESS>
1003	     </BODY>

1005	6.3. Image

1007	     <IMG> Level 0

1009	     The Image element is used to incorporate in-line graphics
1010	     (typically icons or small graphics) into an HTML document.
1011	     This element cannot be used for embedding other HTML text.

1013	     HTML interpreters that cannot render in-line images ignore
1014	     the Image element unless it contains the ALT attribute. Note
1015	     that some HTML interpreters can render linked graphics but
1016	     not in-line graphics. If a graphic is essential, you may
1017	     want to create a link to it rather than to put it in-line.
1018	     If the graphic is not essential, then the Image element is
1019	     appropriate.

1021	     The Image element, which is empty (no closing tag), has
1022	     these attributes:

1024	     ALIGN
1025	                    The ALIGN attribute accepts the values TOP,
1026	                    MIDDLE, or BOTTOM, which specify whether the following
1027	                    line of text is aligned with the top, middle, or
1028	                    bottom of the graphic.

1030	     ALT
1031	                    Optional text as an alternative to the graphic for
1032	                    rendering in non-graphical environments. Alternate
1033	                    text should be provided whenever the graphic is
1034	                    not rendered. Alternate text is mandatory for
1035	                    Level 0 documents. Example of use:

1037	      Be sure
1038	     to read these instructions.

1040	     ISMAP
1041	                    The ISMAP (is map) attribute identifies an image
1042	                    as an image map. Image maps are graphics in which
1043	                    certain regions are mapped to URIs. By clicking on
1044	                    different regions, different resources can be
1045	                    accessed from the same graphic. Example of use:

1047	     
1048	     
1049	     

1051	     SRC
1052	                    The value of the SRC attribute is the URI of the
1053	                    document to be embedded; only images can be
1054	                    embedded, not HTML text. Its syntax is the same as
1055	                    that of the HREF attribute of the `<A>' tag. SRC
1056	                    is mandatory. Image elements are allowed within
1057	                    anchors.

1059	     Example of use:

1061	     Be sure to read these
1062	     instructions.

1064	7. Character Format Elements

1066	     Character format elements are used to specify either the
1067	     logical meaning or the physical appearance of marked text
1068	     without causing a paragraph break. Like most other elements,
1069	     character format elements include both opening and closing
1070	     tags. Only the characters between the tags are affected:

1072	     This is <EM>emphasized</EM> text.

1074	     Character format tags may be ignored by minimal HTML
1075	     applications.

1077	     Character format tags are interpreted from left to right as
1078	     they appear in the flow of text. Level 1 interpreters must
1079	     render highlighted text distinctly from plain text.
1080	     Additionally, EM content must be rendered as distinct from
1081	     STRONG content, and B content must be rendered as distinct from
1082	     I content.

1084	     Character format elements may be nested within the content
1085	     of other character format elements; however, HTML
1086	     interpreters are not required to render nested character
1087	     format elements distinctly from non-nested elements:

1089	     plain <B>bold <I>italic</I></B> may be rendered
1090	     the same as plain <B>bold </B><I>italic</I>

1092	7.1. Semantic Format Elements

1094	     Note that typical renderings for semantic format elements
1095	     vary between applications. If a specific rendering is
1096	     necessary - for example, when referring to a specific text
1097	     attribute as in ``The italic parts are mandatory'' - a
1098	     physical formatting element can be used to ensure that the
1099	     intended rendering is used where possible.

1101	     Note that different semantic elements may be rendered in the
1102	     same way.

1104	7.1.1. Citation

1106	     <CITE> ... </CITE> Level 1

1108	     The Citation element specifies a citation, typically
1109	     rendered as italics.

1111	7.1.2. Code

1113	     <CODE> ... </CODE> Level 1

1115	     The Code element indicates an example of code, typically
1116	     rendered in a monospaced font. This should not be confused
1117	     with the Preformatted Text element.

1119	7.1.3. Emphasis

1121	     <EM> ... </EM> Level 1

1123	     The Emphasis element indicates typographic emphasis,
1124	     typically rendered as italics.

1126	7.1.4. Keyboard

1128	     <KBD> ... </KBD> Level 1

1130	     The Keyboard element indicates text typed by a user,
1131	     typically rendered in a monospaced font. This is commonly
1132	     used in instruction manuals.

1134	7.1.5. Sample

1136	     <SAMP> ... </SAMP> Level 1

1138	     The Sample element indicates a sequence of literal
1139	     characters, typically rendered in a monospaced font.

1141	7.1.6. Strong

1143	     <STRONG> ... </STRONG> Level 1

1145	     The Strong element indicates strong typographic emphasis,
1146	     typically rendered in bold.

1148	7.1.7. Variable

1150	     <VAR> ... </VAR> Level 1

1152	     The Variable element indicates a variable name, typically
1153	     rendered as italic.

1155	7.2. Physical Format Elements

1157	     Physical format elements are used to specify the format of
1158	     marked text.

1160	7.2.1. Bold

1162	     <B> ... </B> Level 1

1164	     The Bold element specifies that the text should be rendered
1165	     in boldface, where available. Otherwise, an alternative
1166	     mapping is allowed.

1168	7.2.2. Italic

1170	     <I> ... </I> Level 1

1172	     The Italic element specifies that the text should be
1173	     rendered in an italic font, where available. Otherwise, an
1174	     alternative mapping is allowed.

1176	7.2.3. Teletype

1178	     <TT> ... </TT> Level 1

1180	     The Teletype element specifies that the text should be
1181	     rendered in a fixed-width typewriter font.

1183	8. Hyperlink Elements

1185	8.1. Anchor

1187	     <A> ... </A> Level 0

1189	     An anchor is a marked section of text that is the start
1190	     and/or destination of a hypertext link. Anchor elements are
1191	     defined by the `<A>' tag. The `<A>' tag accepts several
1192	     attributes; at least one of the NAME and HREF attributes is
1193	     required.

1195	     Attributes of the `<A>' tag:

1197	8.1.1. HREF

1199	     If the HREF attribute is present, the text between the
1200	     opening and closing anchor tags becomes hypertext. If this
1201	     hypertext is selected by readers, they are moved to another
1202	     document, or to a different location in the current
1203	     document, whose network address is defined by the value of
1204	     the HREF attribute.

1206	     Example:

1208	     See <A HREF="http://www.hal.com/">HaL</A>'s
1209	     information for more details.

1211	     In this example, selecting ``HaL'' takes the reader to a
1212	     document at http://www.hal.com. The format of the network
1213	     address is specified in the URI specification for print
1214	     readers.

1216	     With the HREF attribute, the form HREF=``#identifier'' can
1217	     refer to another anchor in the same document.

1219	     Example:

1221	     The <A HREF="#glossary">glossary</A> defines
1222	     terms used in this document.

1224	     In this example, selecting ``glossary'' takes the reader to
1225	     another anchor (i.e., <A NAME="glossary">Glossary</A>) in
1226	     the same document. The NAME attribute is described below. If
1227	     the anchor is in another document, the HREF attribute may be
1228	     relative to the document's address or the specified base
1229	     address (see 4.5, "Base").

1231	8.1.2. NAME

1233	     If present, the NAME attribute allows the anchor to be the
1234	     target of a link. The value of the NAME attribute is an
1235	     identifier for the anchor. Identifiers are arbitrary strings
1236	     but must be unique within the HTML document.

1238	     Example of use:

1240	     <A NAME="coffee">Coffee</A> is an example of ...
1241	     ... An example of this is <A HREF="#coffee">coffee</A>.

1243	     Another document can then make a reference explicitly to
1244	     this anchor by putting the identifier after the address,
1245	     separated by a hash sign:

1247	     
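
     Splitting such a reference into the document address and the
     anchor identifier might be sketched with Python's
     urllib.parse.urldefrag. The addresses below are hypothetical.

```python
from urllib.parse import urldefrag


def split_reference(href):
    """Return (document address, anchor identifier) for an HREF value."""
    address, identifier = urldefrag(href)
    return address, identifier
```

     An empty address, as in HREF="#identifier", denotes an anchor in
     the current document.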

1249	8.1.3. TITLE

1251	     The TITLE attribute is informational only. If present, the
1252	     TITLE attribute should provide the title of the document
1253	     whose address is given by the HREF attribute. The TITLE
1254	     attribute is useful for at least two reasons. The HTML
1255	     interpreter may display the title of the document prior to
1256	     retrieving it, for example, as a margin note or in a small
1257	     box while the mouse is over the anchor, or while the
1258	     document is being loaded. Another reason is that documents
1259	     that are not marked up text, such as graphics, plain text
1260	     and Gopher menus, do not have titles. The TITLE attribute
1261	     can be used to provide a title to such documents. When using
1262	     the TITLE attribute, the title should be valid and unique
1263	     for the destination document.

1265	8.1.4. REL

1267	     The REL attribute gives the relationship(s) described by the
1268	     hypertext link from the anchor to the target. The value is a
1269	     whitespace-separated list of relationship names.
1270	     Relationship names and their semantics will be registered by
1271	     the W3 Consortium. The default relationship is void. The REL
1272	     attribute is only used when the HREF attribute is present.

1274	8.1.5. REV

1276	     The REV attribute is the same as the REL attribute, but the
1277	     semantics of the link type are in the reverse direction. A
1278	     link from A to B with REL=``X'' expresses the same
1279	     relationship as a link from B to A with REV=``X''. An anchor
1280	     may have both REL and REV attributes.
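
     The equivalence of REL and REV can be illustrated by normalizing
     both into (source, relationship, target) triples. The triple
     representation and the function name are assumptions of this
     sketch, not part of this specification.

```python
def link_relations(source, target, rel="", rev=""):
    """Expand the REL and REV lists of one link into directed triples."""
    forward = [(source, name, target) for name in rel.split()]
    reverse = [(target, name, source) for name in rev.split()]
    return forward + reverse
```

     A link from A to B with REL="X" and a link from B to A with
     REV="X" both normalize to the single triple (A, X, B).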

1282	8.1.6. URN

1284	     If present, the URN attribute specifies a uniform resource
1285	     name (URN) for a target document. The format of URNs is
1286	     under discussion (1995) by various working groups of the
1287	     Internet Engineering Task Force.

1289	8.1.7. METHODS

1291	     The METHODS attributes of anchors and links provide
1292	     information about the functions that the user may perform on
1293	     an object. These are more accurately given by the HTTP
1294	     protocol when it is used, but it may, for similar reasons as
1295	     for the TITLE attribute, be useful to include the
1296	     information in advance in the link. For example, the HTML
1297	     interpreter may choose a different rendering as a function of
1298	     the methods allowed; for example, something that is
1299	     searchable may get a different icon.

1301	     The value of the METHODS attribute is a whitespace-separated
1302	     list of HTTP methods supported by the object for public use.

1304	9. Block Structuring Elements

1306	     The following elements may be included in the body of an
1307	     HTML document:

1309	9.1. Paragraph

1311	     <P> ... </P> Level 0

1313	     The Paragraph element indicates a paragraph. The exact
1314	     indentation, leading space, etc. of a paragraph is not
1315	     defined and may be a function of other tags, style sheets,
1316	     etc.

1318	     Typically, paragraphs are surrounded by a vertical space of
1319	     one line or half a line. This is typically not the case
1320	     within the Address element and is never the case within the
1321	     Preformatted Text element. With some HTML interpreters, the
1322	     first line in a paragraph is indented.

1324	     Example of use:

1326	     <H1>This Heading Precedes the Paragraph</H1>
1327	     <P>This is the text of the first paragraph.
1328	     <P>This is the text of the second paragraph. Although you do not
1329	     need to start paragraphs on new lines, maintaining this
1330	     convention facilitates document maintenance.
1331	     <P>This is the text of a third paragraph.

1333	9.2. Preformatted Text

1335	     <PRE> ... </PRE> Level 0

1337	     The Preformatted Text element presents blocks of text in
1338	     fixed-width font, and so is suitable for text that has been
1339	     formatted on screen.

1341	     The <PRE> tag may be used with the optional WIDTH attribute.
1342	     The WIDTH attribute specifies the maximum number of
1343	     characters for a line and allows the HTML interpreter to
1344	     select a suitable font and indentation. If the WIDTH
1345	     attribute is not present, a width of 80 characters is
1346	     assumed. Where the WIDTH attribute is supported, widths of
1347	     40, 80 and 132 characters should be presented optimally,
1348	     with other widths being rounded up.
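
     The choice of a presentation width might be sketched as follows.
     The treatment of widths greater than 132 is not stated above;
     returning them unchanged is an assumption of this sketch.

```python
OPTIMAL_WIDTHS = (40, 80, 132)


def presentation_width(width=None):
    """Pick the line width used to present a block of preformatted text."""
    if width is None:
        return 80  # WIDTH attribute absent
    for candidate in OPTIMAL_WIDTHS:
        if width <= candidate:
            return candidate  # round up to the next optimal width
    return width  # assumption: wider requests are kept as given
```

     A requested width of 60 is therefore presented as 80, and a
     requested width of 100 as 132.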

1350	     Within preformatted text:

1352	          * Line breaks within the text are rendered as a move to
1353	          the beginning of the next line.
1354	          * Anchor elements and character highlighting elements
1355	          may be used.
1356	          * Elements that define paragraph formatting (headings,
1357	          address, etc.) must not be used.
1358	          * The horizontal tab character (encoded in US-ASCII and
1359	          ISO-8859-1 as decimal 9) must be interpreted as the
1360	          smallest positive nonzero number of spaces which will
1361	          leave the number of characters so far on the line as a
1362	          multiple of 8. Its use is not recommended however.
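
     The horizontal tab rule in the list above can be sketched as
     follows; the function name is illustrative.

```python
def expand_tabs(line, tab_stop=8):
    """Replace each tab by 1 to tab_stop spaces, up to the next tab stop."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            # Smallest positive number of spaces that makes the
            # column count a multiple of tab_stop.
            spaces = tab_stop - (column % tab_stop)
            out.append(" " * spaces)
            column += spaces
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

     A tab that falls exactly on a multiple of 8 still advances by a
     full 8 columns, since the number of spaces must be positive.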

1364	          NOTE - Some historical documents contain <P> tags in
1365	          <PRE> elements. User agents are encouraged to treat
1366	          this as a line break. A <P> tag followed by a newline
1367	          character should produce only one line break, not a
1368	          line break plus a blank line.

1370	          NOTE - References to the ``beginning of a new line'' do
1371	          not imply that the renderer is forbidden from using a
1372	          constant left indent for rendering preformatted text.
1373	          The left indent may be constrained by the width
1374	          required.

1376	     Example of use:

1378	     <PRE WIDTH="80">
1379	     This is an example line.
1380	     </PRE>

1382	          NOTE - Within a Preformatted Text element, the
1383	          constraint that the rendering must be on a fixed
1384	          horizontal character pitch may limit or prevent the
1385	          ability of the HTML interpreter to faithfully render
1386	          character formatting elements.

1388	9.3. Address

1390	     <ADDRESS> ... </ADDRESS> Level 0

1392	     The Address element specifies such information as address,
1393	     signature and authorship, often at the top or bottom of a
1394	     document.

1396	     Typically, an Address is rendered in an italic typeface and
1397	     may be indented. The Address element implies a paragraph
1398	     break before and after.

1400	     Example of use:

1402	     <ADDRESS>
1403	     Newsletter editor<BR>
1404	     J.R. Brown<BR>
1405	     JimquickPost News, Jumquick, CT 01234<BR>
1406	     Tel (123) 456 7890
1407	     </ADDRESS>

1409	9.4. Blockquote

1411	     <BLOCKQUOTE> ... </BLOCKQUOTE> Level 0

1413	     The Blockquote element is used to contain text quoted from
1414	     another source.

1416	     A typical rendering might be a slight extra left and right
1417	     indent, and/or italic font. The Blockquote element causes a
1418	     paragraph break, and typically provides space above and
1419	     below the quote.

1421	     Single-font rendition may reflect the quotation style of
1422	     Internet mail by putting a vertical line of graphic
1423	     characters, such as the greater than symbol (>), in the left
1424	     margin.

1426	     Example of use:

1428	     I think the poem ends
1429	     <BLOCKQUOTE>
1430	     Soft you now, the fair Ophelia. Nymph, in thy orisons, be all
1431	     my sins remembered.
1432	     </BLOCKQUOTE>
1433	     but I am not sure.

1435	9.5. Headings

1437	     <H1> ... </H1> through <H6> ... </H6> Level 0

1439	     HTML defines six levels of heading. A Heading element
1440	     implies all the font changes, paragraph breaks before and
1441	     after, and white space necessary to render the heading.

1443	     The highest level of headings is H1, followed by H2 ... H6.

1445	     Example of use:

1447	     <H1>This is a heading</H1>
1448	     Here is some text
1449	     <H2>Second level heading</H2>
1450	     Here is some more text.

1452	     The rendering of headings is determined by the HTML
1453	     interpreter, but typical renderings are:

1455	     <H1> ... </H1>
1456	                    Bold, very-large font, centered. One or two blank
1457	                    lines above and below.

1459	     <H2> ... </H2>
1460	                    Bold, large font, flush-left. One or two blank
1461	                    lines above and below.

1463	     <H3> ... </H3>
1464	                    Italic, large font, slightly indented from the
1465	                    left margin. One or two blank lines above and
1466	                    below.

1468	     <H4> ... </H4>
1469	                    Bold, normal font, indented more than H3. One
1470	                    blank line above and below.

1472	     <H5> ... </H5>
1473	                    Italic, normal font, indented as H4. One blank
1474	                    line above.

1476	     <H6> ... </H6>
1477	                    Bold, indented same as normal text, more than H5.
1478	                    One blank line above.

1480	     Although heading levels can be skipped (for example, from H1
1481	     to H3), this practice is discouraged as skipping heading
1482	     levels may produce unpredictable results when generating
1483	     other representations from HTML.
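
     A simple check for skipped heading levels might look like the
     sketch below. It scans for heading start-tags with a regular
     expression rather than parsing the document, and it reports a
     document whose first heading is not H1 as a skip from level 0;
     both are simplifications assumed for illustration.

```python
import re

_HEADING_START = re.compile(r"<H([1-6])\b", re.IGNORECASE)


def skipped_heading_levels(document):
    """Return (previous level, new level) pairs where a level is skipped."""
    skips = []
    previous = 0
    for match in _HEADING_START.finditer(document):
        level = int(match.group(1))
        if level > previous + 1:
            skips.append((previous, level))
        previous = level
    return skips
```

     Moving back up to a higher-level heading, for example from H3 to
     H1, is not reported, since only a jump to a deeper level can skip
     one.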

1485	9.6. List Elements

1487	     HTML supports several types of lists, all of which may be
1488	     nested.

1490	9.6.1. Definition List

1492	     <DL> ... </DL> Level 0

1494	     A definition list is a list of terms and corresponding
1495	     definitions. Definition lists are typically formatted with
1496	     the term flush-left and the definition, formatted paragraph
1497	     style, indented after the term.

1499	     Example of use:

1501	     <DL>
1502	     <DT>Term<DD>This is the definition of the first term.
1503	     <DT>Term<DD>This is the definition of the second term.
1504	     </DL>

1506	     If the DT term does not fit in the DT column (one third of
1507	     the display area), it may be extended across the page with
1508	     the DD section moved to the next line, or it may be wrapped
1509	     onto successive lines of the left-hand column.

1511	     Single occurrences of a <DT> tag without a subsequent <DD>
1512	     tag are allowed, and have the same significance as if the
1513	     <DD> tag had been present with no text.
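
     As an illustration only, the following sketch reads a
     definition list into (term, definition) pairs and gives a
     term with no following definition an empty definition, which
     is one way to model the rule above. It relies on Python's
     standard html.parser module; the class and function names
     are hypothetical, and nested lists are not handled.

```python
from html.parser import HTMLParser


class DefinitionListReader(HTMLParser):
    """Pair each DT term with the text of the DD that follows it, if any."""

    def __init__(self):
        super().__init__()
        self.entries = []    # each entry is [term_text, definition_text]
        self._target = None  # 0 while reading a term, 1 while reading a definition

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag names.
        if tag == "dt":
            self.entries.append(["", ""])  # definition stays "" if no DD follows
            self._target = 0
        elif tag == "dd" and self.entries:
            self._target = 1

    def handle_endtag(self, tag):
        if tag == "dl":
            self._target = None

    def handle_data(self, data):
        if self._target is not None:
            self.entries[-1][self._target] += data


def read_definitions(html_text):
    """Return a list of (term, definition) tuples from a flat definition list."""
    reader = DefinitionListReader()
    reader.feed(html_text)
    reader.close()
    return [(term.strip(), text.strip()) for term, text in reader.entries]


sample = "<DL><DT>Alpha<DD>First letter.<DT>Beta<DT>Gamma<DD>Third letter.</DL>"
print(read_definitions(sample))
# [('Alpha', 'First letter.'), ('Beta', ''), ('Gamma', 'Third letter.')]
```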

1515	     The opening list tag must be <DL> and must be immediately
1516	     followed by the first term (<DT>).

1518	     The definition list type can take the COMPACT attribute,
1519	     which suggests that a compact rendering be used, because the
1520	     list items are small and/or the entire list is large.

1522	     Unless you provide the COMPACT attribute, the HTML
1523	     interpreter may leave white space between successive DT, DD
1524	     pairs. The COMPACT attribute may also reduce the width of
1525	     the left-hand (DT) column.

1527	     If using the COMPACT attribute, the opening list tag must be
1528	     <DL COMPACT>, which must be immediately followed by the
1529	     first <DT> tag:

1531	     <DL COMPACT>
1532	     <DT>Term<DD>This is the first definition in compact format.
1533	     <DT>Term<DD>This is the second definition in compact format.
1534	     </DL>

1536	9.6.2. Directory List

1538	     <DIR> ... </DIR> Level 0

1540	     A Directory List element is used to present a list of items
1541	     containing up to 20 characters each. Items in a directory
1542	     list may be arranged in columns, typically 24 characters
1543	     wide. If the HTML interpreter can optimize the column width
1544	     as a function of the widths of individual elements, so much
1545	     the better.
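
     The column arrangement can be sketched as follows. This is
     only one possible layout, filling rows left to right with
     fixed-width columns; the function name, the 72-character
     display width, and the row-major order are assumptions made
     for the example and are not required by this specification.

```python
def layout_directory(items, display_width=72, column_width=24):
    """Arrange short items in fixed-width columns, filling rows left to right."""
    # At least one column, even on a display narrower than one column.
    columns = max(1, display_width // column_width)
    rows = []
    for start in range(0, len(items), columns):
        row = items[start:start + columns]
        # Pad every item to the column width, then drop trailing spaces.
        rows.append("".join(item.ljust(column_width) for item in row).rstrip())
    return rows


for line in layout_directory(["A-H", "I-M", "M-R", "S-Z"], display_width=48):
    print(line)
```

     With a 48-character display the four items are printed as
     two rows of two columns. Items longer than the column width
     are not truncated here, so they would push later columns out
     of alignment.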

1547	     A directory list must begin with the <DIR> tag which is
1548	     immediately followed by a <LI> (list item) tag:

1550	     <DIR>
1551	     <LI>A-H<LI>I-M
1552	     <LI>M-R<LI>S-Z
1553	     </DIR>

1555	9.6.3. Menu List

1557	     <MENU> ... </MENU> Level 0

1559	     A menu list is a list of items with typically one line per
1560	     item. The menu list style is more compact than the style of
1561	     an unordered list.

1563	     A menu list must begin with a <MENU> tag which is
1564	     immediately followed by a <LI> (list item) tag:

1566	     <MENU>
1567	     <LI>First item in the list.
1568	     <LI>Second item in the list.
1569	     <LI>Third item in the list.
1570	     </MENU>

1572	9.6.4. Ordered List

1574	     <OL> ... </OL> Level 0

1576	     The Ordered List element is used to present a numbered list
1577	     of items, sorted by sequence or order of importance.

1579	     An ordered list must begin with the <OL> tag which is
1580	     immediately followed by a <LI> (list item) tag:

1582	     <OL>
1583	     <LI>Click the Web button to open the Open URI window.
1584	     <LI>Enter the URI number in the text field of the Open URI
1585	     window. The Web document you specified is displayed.
1586	     <LI>Click highlighted text to move from one link to another.
1587	     </OL>

1589	     The Ordered List element can take the COMPACT attribute,
1590	     which suggests that a compact rendering be used.

1592	9.6.5. Unordered List

1594	     <UL> ... </UL> Level 0

1596	     The Unordered List element is used to present a list of
1597	     items that are typically separated by white space and/or
1598	     marked by bullets.

1600	     An unordered list must begin with the <UL> tag which is
1601	     immediately followed by a <LI> (list item) tag:

1603	     <UL>
1604	     <LI>First list item
1605	     <LI>Second list item
1606	     <LI>Third list item
1607	     </UL>

1609	10. Form-based Input Elements

1611	     Forms are created by placing input fields within paragraphs,
1612	     preformatted/literal text, and lists. This gives
1613	     considerable flexibility in designing the layout of forms.

1615	     The following elements are used to create forms:

1617	     FORM
1618	                    A form within a document.

1620	     INPUT
1621	                    One input field.

1623	     OPTION
1624	                    One option within a Select element.

1626	     SELECT
1627	                    A selection from a finite set of options.

1629	     TEXTAREA
1630	                    A multi-line input field.

1632	     Each variable field is defined by an Input, Textarea, or
1633	     Option element and must have a NAME attribute to identify
1634	     its value in the data returned when the form is submitted.

1636	     Example of use (a questionnaire form):

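1638	     <H1>Sample Questionnaire</H1>
1639	     <P>Please fill out this questionnaire:
1640	     <FORM ...>
1641	     <P>Your name: <INPUT ...>
1642	     <P>Male <INPUT ...>
1643	     <P>Female <INPUT ...>
1644	     <P>Number in family: <INPUT ...>
1645	     <P>Cities in which you maintain a residence:
1646	     <UL>
1647	     <LI>Kent <INPUT ...>
1648	     <LI>Miami <INPUT ...>
1649	     <LI>Other <TEXTAREA ...></TEXTAREA>
1650	     </UL>
1651	     Nickname: <INPUT ...>
1652	     <P>Thank you for responding to this questionnaire.
1653	     <P><INPUT TYPE=SUBMIT> <INPUT TYPE=RESET>
1654	     </FORM>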

1656	     In the example above, the <P> and <UL> tags have been used
1657	     to lay out the text and input fields. The HTML interpreter
1658	     is responsible for handling which field will currently get
1659	     keyboard input.

1661	     Many platforms have existing conventions for forms, for
1662	     example, using Tab and Shift keys to move the keyboard focus
1663	     forwards and backwards between fields, and using the Enter
1664	     key to submit the form. In the example, the SUBMIT and RESET
1665	     buttons are specified explicitly with special purpose
1666	     fields. The SUBMIT button is used to e-mail the form or send
1667	     its contents to the server as specified by the ACTION
1668	     attribute, while RESET resets the fields to their initial
1669	     values. When the form consists of a single text field, it
1670	     may be appropriate to leave such buttons out and rely on the
1671	     Enter key.

1673	     The Input element is used for a large variety of types of
1674	     input fields.

1676	     To let users enter more than one line of text, use the
1677	     Textarea element.

1679	     The radio button and checkbox types of input field can be
1680	     used to specify multiple choice forms in which every
1681	     alternative is visible as part of the form. An alternative
1682	     is to use the Select element, which is typically rendered in
1683	     a more compact fashion as a pull-down combo list.

1685	10.1. Form

1687	     <FORM> ... </FORM> Level 2

1689	     The Form element is used to delimit a data input form. There
1690	     can be several forms in a single document, but the Form
1691	     element can't be nested.

1693	     The ACTION attribute is a URI specifying the location to
1694	     which the contents of the form are submitted to elicit a
1695	     response. If the ACTION attribute is missing, the URI of the
1696	     document itself is assumed. The way data is submitted varies
1697	     with the access protocol of the URI, and with the values of
1698	     the METHOD and ENCTYPE attributes.

1700	     In general:

1702	          * the METHOD attribute selects variations in the
1703	          protocol.
1704	          * the ENCTYPE attribute specifies the format of the
1705	          submitted data in case the protocol does not impose a
1706	          format itself.

1708	     When the ACTION attribute is set to an HTTP URL, the METHOD
1709	     attribute must be set to an HTTP method [HTTP]. The default
1710	     method is GET, although for many applications the POST
1711	     method is preferred. With the POST method, the ENCTYPE
1712	     attribute is a media type specifying the format of the
1713	     posted data; the default is
1714	     ``application/x-www-form-urlencoded''.
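
     To make the default encoding concrete, the sketch below
     serializes a list of name/value pairs with Python's standard
     urllib.parse.urlencode, which produces a string in the
     application/x-www-form-urlencoded format. The function name
     and the field names and values are hypothetical; they are
     not taken from this document.

```python
from urllib.parse import urlencode


def encode_form(pairs):
    """Serialize (name, value) pairs in application/x-www-form-urlencoded form."""
    # A list of tuples keeps the order of the pairs and allows repeated names.
    return urlencode(pairs)


pairs = [("name", "Ada Lovelace"), ("city", "kent"), ("city", "miami")]
print(encode_form(pairs))  # name=Ada+Lovelace&city=kent&city=miami
```

     Passing a list of tuples rather than a dictionary matters
     here, because a dictionary could not hold two pairs with the
     same name.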

1716	     The submitted contents of the form logically consist of
1717	     name/value pairs. The names are usually equal to the NAME
1718	     attributes of the various interactive elements in the form.

1720	          NOTE - The names are not guaranteed to be unique keys,
1721	          nor are the names of form elements required to be
1722	          distinct. The values encode the user's input to the
1723	          corresponding interactive elements. Fields with null
1724	          values may be omitted from the returned list of
1725	          name/value pairs, whereas those with non-null values
1726	          should be included (even if the value was not altered
1727	          by the user). In particular, unselected radio buttons
1728	          and checkboxes should be excluded from the contents
1729	          list.
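
     The rules in the note can be sketched as a function that
     turns a description of the form's fields into the list of
     submitted name/value pairs. This is an illustration under
     assumptions, not a normative algorithm: the dictionary
     layout and the function name are hypothetical, fields with
     null values are omitted (which the note permits but does not
     require), and a checkbox without a VALUE is given the value
     "on", as described for the CHECKBOX type in this document.

```python
def form_data_set(fields):
    """Return the (name, value) pairs that the described fields would submit.

    Each field is a dict with "type" and "name", an optional "value",
    and, for checkboxes and radio buttons, an optional "checked" flag.
    """
    pairs = []
    for field in fields:
        kind = field["type"]
        value = field.get("value")
        if kind in ("checkbox", "radio"):
            if not field.get("checked", False):
                continue  # unselected checkboxes and radio buttons are excluded
            if value is None and kind == "checkbox":
                value = "on"  # default checkbox value
        if value is None or value == "":
            continue  # null values are omitted in this sketch
        pairs.append((field["name"], value))
    return pairs


fields = [
    {"type": "text", "name": "name", "value": "Ada"},
    {"type": "text", "name": "nickname", "value": ""},
    {"type": "radio", "name": "gender", "value": "male"},
    {"type": "radio", "name": "gender", "value": "female", "checked": True},
    {"type": "checkbox", "name": "city", "value": "kent", "checked": True},
    {"type": "checkbox", "name": "city", "value": "miami", "checked": True},
]
print(form_data_set(fields))
# [('name', 'Ada'), ('gender', 'female'), ('city', 'kent'), ('city', 'miami')]
```

     The result is a list rather than a mapping because, as the
     note says, names are not guaranteed to be unique keys: the
     two selected "city" checkboxes produce two pairs.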

1731	10.2. Input

1733	     <INPUT> Level 2

1735	     The Input element represents a field whose contents may be
1736	     edited by the user.

1738	     Attributes of the Input element:

1740	     ALIGN
1741	                    Vertical alignment of the image. For use only with
1742	                    TYPE=IMAGE. The possible values are exactly the
1743	                    same as for the ALIGN attribute of the image
1744	                    element.

1746	     CHECKED
1747	                    Indicates that a checkbox or radio button is
1748	                    selected. Unselected checkboxes and radio buttons
1749	                    do not return name/value pairs when the form is
1750	                    submitted.

1752	     MAXLENGTH
1753	                    Indicates the maximum number of characters that
1754	                    can be entered into a text field. This can be
1755	                    greater than specified by the SIZE attribute, in
1756	                    which case the field will scroll appropriately.
1757	                    The default number of characters is unlimited.

1759	     NAME
1760	                    Symbolic name used when transferring the form's
1761	                    contents. The NAME attribute is required for most
1762	                    input types and is normally used to provide a
1763	                    unique identifier for a field, or for a logically
1764	                    related group of fields.

1766	     SIZE
1767	                    Specifies the size or precision of the field
1768	                    according to its type. For example, to specify a
1769	                    field with a visible width of 24 characters:

1771	     INPUT TYPE=text SIZE="24"

1773	     SRC
1774	                    A URI specifying an image. For use only with
1775	                    TYPE=IMAGE.

1777	     TYPE
1778	                    Defines the type of data the field accepts.
1779	                    Defaults to free text. Several types of fields can
1780	                    be defined with the type attribute:

1782	     CHECKBOX
1783	                    Used for simple Boolean attributes, or for
1784	                    attributes that can take multiple values at the
1785	                    same time. The latter is represented by a number
1786	                    of checkbox fields each of which has the same
1787	                    name. Each selected checkbox generates a separate
1788	                    name/value pair in the submitted data, even if
1789	                    this results in duplicate names. The default value
1790	                    for checkboxes is ``on''.

1792	     HIDDEN
1793	                    No field is presented to the user, but the content
1794	                    of the field is sent with the submitted form. This
1795	                    value may be used to transmit state information
1796	                    about client/server interaction.

1798	     IMAGE
1799	                    An image field upon which you can click with a
1800	                    pointing device, causing the form to be
1801	                    immediately submitted. The coordinates of the
1802	                    selected point are measured in pixel units from
1803	                    the upper-left corner of the image, and are
1804	                    returned (along with the other contents of the
1805	                    form) in two name/value pairs. The x-coordinate is
1806	                    submitted under the name of the field with ``.x''
1807	                    appended, and the y-coordinate is submitted under
1808	                    the name of the field with ``.y'' appended. Any
1809	                    VALUE attribute is ignored. The image itself is
1810	                    specified by the SRC attribute, exactly as for the
1811	                    Image element.

1813	          NOTE - In a future version of the HTML specification,
1814	          the IMAGE functionality may be folded into an enhanced
1815	          SUBMIT field.

1817	     PASSWORD
1818	                    The same as the TEXT type, except that text
1819	                    is not displayed as it is entered.

1821	     RADIO
1822	                    Used for attributes that accept a single value
1823	                    from a set of alternatives. Each radio button
1824	                    field in the group should be given the same name.
1825	                    Only the selected radio button in the group
1826	                    generates a name/value pair in the submitted data.
1827	                    Radio buttons require an explicit VALUE attribute.

1829	     RESET
1830	                    A button that when pressed resets the form's
1831	                    fields to their specified initial values. The
1832	                    label to be displayed on the button may be
1833	                    specified just as for the SUBMIT button.

1835	     SUBMIT
1836	                    A button that when pressed submits the form. You
1837	                    can use the VALUE attribute to provide a
1838	                    non-editable label to be displayed on the button.
1839	                    The default label is application-specific. If a
1840	                    SUBMIT button is pressed in order to submit the
1841	                    form, and that button has a NAME attribute
1842	                    specified, then that button contributes a
1843	                    name/value pair to the submitted data. Otherwise,
1844	                    a SUBMIT button makes no contribution to the
1845	                    submitted data.

1847	     TEXT
1848	                    Used for single-line text entry fields. Use in
1849	                    conjunction with the SIZE and MAXLENGTH
1850	                    attributes. Use the Textarea element for text
1851	                    fields which can accept multiple lines.

1853	     VALUE
1854	                    The initial displayed value of the field, if it
1855	                    displays a textual or numerical value; or the
1856	                    value to be returned when the field is selected,
1857	                    if it displays a Boolean value. This attribute is
1858	                    required for radio buttons.
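
     The contribution of the field that triggers submission, as
     described for the IMAGE and SUBMIT types above, might be
     modeled as in the sketch below. The function name and the
     dictionary layout are hypothetical. The sketch assumes that
     an image field always has a NAME, and it returns an empty
     string for a named SUBMIT button without a VALUE, a case
     this document leaves application-specific.

```python
def activation_pairs(control, click=None):
    """Return the pairs contributed by the field used to submit the form.

    control: dict with "type" ("image" or "submit") and optional
             "name" and "value" entries.
    click:   (x, y) in pixels from the upper-left corner of the image;
             used only when the type is "image".
    """
    if control["type"] == "image":
        x, y = click
        name = control["name"]  # assumption: image fields are always named
        # Any VALUE attribute is ignored; only the coordinates are sent.
        return [(name + ".x", str(x)), (name + ".y", str(y))]
    if control["type"] == "submit" and control.get("name"):
        # Assumption: empty string when the button has no VALUE.
        return [(control["name"], control.get("value", ""))]
    return []  # an unnamed SUBMIT button contributes nothing


print(activation_pairs({"type": "image", "name": "map"}, click=(10, 25)))
# [('map.x', '10'), ('map.y', '25')]
print(activation_pairs({"type": "submit", "name": "action", "value": "Send"}))
# [('action', 'Send')]
```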

1860	10.3. Option

1862