idnits 2.17.1
draft-ietf-html-spec-02.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
this document.
Expected boilerplate is as follows today (2024-04-26) according to
https://trustee.ietf.org/license-info :
IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
This Internet-Draft is submitted in full conformance with the provisions
of BCP 78 and BCP 79.
IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
Copyright (c) 2024 IETF Trust and the persons identified as the document
authors. All rights reserved.
IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
This document is subject to BCP 78 and the IETF Trust's Legal Provisions
Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided
without warranty as described in the Simplified BSD License.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
** Missing expiration date. The document expiration date should appear on
the first and last page.
** The document seems to lack a 1id_guidelines paragraph about
Internet-Drafts being working documents.
** The document seems to lack a 1id_guidelines paragraph about 6 months
document validity -- however, there's a paragraph with a matching
beginning. Boilerplate error?
** The document seems to lack a 1id_guidelines paragraph about the list of
current Internet-Drafts.
** The document seems to lack a 1id_guidelines paragraph about the list of
Shadow Directories.
== No 'Intended status' indicated for this document; assuming Proposed
Standard
== The page length should not exceed 58 lines per page, but there was 1
longer page, the longest (page 1) being 3550 lines
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** The document seems to lack an IANA Considerations section. (See Section
2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
when there are no actions for IANA.)
** There are 31 instances of too long lines in the document, the longest
one being 15 characters in excess of 72.
== There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
document.
Miscellaneous warnings:
----------------------------------------------------------------------------
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may
have content which was first submitted before 10 November 2008. If you
have contacted all the original authors and they are all willing to grant
the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
this comment. If not, you may need to add the pre-RFC5378 disclaimer.
(See the Legal Provisions document at
https://trustee.ietf.org/license-info for more information.)
-- The document date (May 6, 1995) is 10583 days in the past. Is this
intentional?
-- Found something which looks like a code comment -- if you have code
sections in the document, please surround them with '<CODE BEGINS>' and
'<CODE ENDS>' lines.
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
-- Missing reference section? 'IMEDIA' on line 3041 looks like a reference
-- Missing reference section? 'MIME' on line 3025 looks like a reference
-- Missing reference section? 'SGML' on line 3072 looks like a reference
-- Missing reference section? 'IANA' on line 3045 looks like a reference
-- Missing reference section? 'RELURL' on line 3032 looks like a reference
-- Missing reference section? 'HTTP' on line 3018 looks like a reference
-- Missing reference section? 'URL' on line 3012 looks like a reference
-- Missing reference section? 'URI' on line 3005 looks like a reference
-- Missing reference section? 'GOLD90' on line 3037 looks like a reference
-- Missing reference section? 'SQ91' on line 3049 looks like a reference
-- Missing reference section? 'US-ASCII' on line 3053 looks like a reference
-- Missing reference section? 'ISO-8859-1' on line 3058 looks like a
reference
Summary: 8 errors (**), 0 flaws (~~), 3 warnings (==), 15 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
1 HTML Working Group T. Berners-Lee
2 INTERNET-DRAFT MIT/W3C
3
This is a Heading
375 Some elements only have a start-tag without an end-tag. For
376 example, to create a line break, you use the `<BR>' tag.
377 Additionally, the end tags of some other elements, such as
378 Paragraph (`
', `', and `' are
401 equivalent, whereas `&amp;' is different from `&AMP;'.

403 In a start-tag, the element name must immediately follow the
404 tag open delimiter `<'.

406 2.2.4. Attributes

408 In a start-tag, white space and attributes are allowed
409 between the element name and the closing delimiter. An
410 attribute typically consists of an attribute name, an equal
411 sign, and a value, though some attributes may be just a
412 value. White space is allowed around the equal sign.

414 The value of the attribute may be either:

416 * A string literal, delimited by single quotes or
417   double quotes and not containing any occurrences of the
418   delimiting character.
419 * A name token (a sequence of letters, digits, periods,
420   or hyphens)

422 In this example, img is the element name, `src' is the
423 attribute name, and `http://host/dir/file.gif' is the
424 attribute value:

426 <img src='http://host/dir/file.gif'>

428 NOTE - Some historical implementations consider any
429 occurrence of the `>' character to signal the end of a
430 tag. For compatibility with such implementations, when
431 `>' appears in an attribute value, it should be
432 represented with a numeric character reference, such as
433 in: `'.

435 A useful technique for computing an attribute value literal
436 for a given string is to replace each quote and space
437 character by an entity reference or numeric character
438 reference as follows:

440            ENTITY     NUMERIC
441 CHARACTER  REFERENCE  CHAR REF  CHARACTER DESCRIPTION
442 TAB                   &#9;      Tab
443 LF                    &#10;     Line Feed
444 CR                    &#13;     Carriage Return
445                       &#32;     Space
446 "          &quot;     &#34;     Quotation mark
447 &          &amp;      &#38;     Ampersand

449 For example:

451

453 NOTE - Some historical implementations allow any
454 character except space or `>' in a name token.
455 Attribute values must be quoted only if they don't
456 satisfy the syntax for a name token.

458 Note that the SGML declaration in section 13.3 limits the
459 length of an attribute value to 1024 characters.

461 Attributes such as ISMAP and COMPACT may be written using a
462 minimized syntax.
The markup:

464
466 can be written using a minimized syntax:

468

470 NOTE - Some historical implementations only understand
471 the minimized syntax.

473 2.2.5. Comments

475 To include comments in an HTML document that will be
476 eliminated in the mapping to terminals, surround them with
477 `<!--' and `-->'. After the comment delimiter, all text up
478 to the next occurrence of `-->' is ignored. Hence comments
479 cannot be nested. White space is allowed between the closing
480 `--' and `>', but not between the opening `<!' and `--'.

485 HTML Guide: Recommended Usage
486
487

489 NOTE - Some historical HTML implementations incorrectly
490 consider any `>' character to be the termination of a
491 comment.

493 2.2.6. Example HTML Document

495
496
497
498
499 <TITLE>Structural Example</TITLE>
500
501 <H1>First Header</H1>
502 <P>This is a paragraph in the example HTML file. Keep in mind
503 that the title does not appear in the document text, but that
504 the header (defined by H1) does.</P>
505 <OL>
506 <LI>First item in an ordered list.
507 <LI>Second item in an ordered list.
508
509 <LI>Note that lists can be nested;
510 <LI>Whitespace may be used to assist in reading the
511 HTML source.
512
513 <LI>Third item in an ordered list.
514 </OL>
515 <P>This is an additional paragraph. Technically, end tags are
516 not required for paragraphs, although they are allowed. You can
517 include character highlighting in a paragraph. <EM>This sentence
518 of the paragraph is emphasized.</EM> Note that the &lt;/P&gt;
519 end tag has been omitted.
520
521
522 Be sure to read these <B>bold instructions</B>.
523

525 3. HTML as an Internet Media Type

527 An HTML user agent allows users to interact with resources
528 which have HTML representations. At a minimum, it must allow
529 users to examine and navigate the content of HTML documents.
530 HTML user agents should be able to preserve all formatting
531 distinctions represented in an HTML document, and be able to
532 simultaneously present resources referred to by IMG
533 elements (they may ignore some formatting distinctions or
534 IMG resources at the request of the user). Conforming HTML
535 user agents should support form entry and submission.

537 3.1. text/html media type

539 This specification defines the Internet Media Type[IMEDIA]
540 (formerly referred to as the Content Type[MIME]) called
541 `text/html'. The following is to be registered with [IANA].

543 Media Type name
544     text

546 Media subtype
547 name
548     html

550 Required
551 parameters
552     none

554 Optional
555 parameters
556     version, charset

558 Encoding
559 considerations
560     any encoding is allowed

562 Security
563 considerations
564     see 3.3, "Security Considerations"

566 The optional parameters are defined as follows:

568 Version
569     To help avoid future compatibility problems, the
570     version parameter may be used to give the version
571     number of the specification to which the document
572     conforms. The version number appears at the front
573     of this document and within the public identifier
574     of the HTML DTD. This specification defines
575     version 2.0. There is no default.

577 Charset
578     The charset parameter (as defined in section 7.1.1
579     of RFC 1521[MIME]) may be given to specify the
580     character encoding scheme used to represent the
581     HTML document as a sequence of octets. The default
582     value is outside the scope of this specification;
583     but for example, the default is US-ASCII in the
584     context of MIME mail, and ISO-8859-1 in the
585     context of HTTP.

587 3.2. HTML Document Representation

589 A message entity with a content type of `text/html'
590 represents an HTML document, consisting of a single text
591 entity. The `charset' parameter (whether implicit or
592 explicit) identifies a character encoding scheme. The text
593 entity consists of the characters determined by this
594 character encoding scheme and the octets of the body of the
595 message entity.

597 3.2.1. Undeclared Markup Error Handling

599 To facilitate experimentation and interoperability between
600 implementations of various versions of HTML, the installed
601 base of HTML user agents supports a superset of the HTML 2.0
602 language by reducing it to HTML 2.0: markup in the form of a
603 start-tag or end-tag whose generic identifier is not
604 declared is mapped to nothing during tokenization.
605 Undeclared attributes are treated similarly. The entire
606 attribute specification of an unknown attribute (i.e., the
607 unknown attribute and its value, if any) should be ignored.
608 On the other hand, references to undeclared entities should
609 be treated as data characters.

611 For example:

613 foo ...
614 => ,"foo",,,"..."
615 xxx yyy
616 => "xxx ",," yyy
617 Let &alpha; and &beta; be finite sets.
618 => "Let &alpha; and &beta; be finite sets."

620 Support for notifying the user of such errors is encouraged.

622 Information providers are warned that this convention is not
623 binding: unspecified behavior may result, as such markup is
624 not conforming to this specification.

626 3.2.2. Conventional Representation of Newlines

628 SGML specifies that a text entity is a sequence of records,
629 each beginning with a record start character and ending with
630 a record end character (code positions 10 and 13
631 respectively); see section 7.6.1, ``Record Boundaries'', in
632 [SGML].

634 [MIME] specifies that a body of type `text/*' is a sequence
635 of lines, each terminated by CRLF, that is octets 13, 10.

637 In practice, HTML documents are frequently represented and
638 transmitted using an end of line convention that depends on
639 the conventions of the source of the document; frequently,
640 that representation consists of CR only, LF only, or a CR LF
641 combination. Hence the decoding of the octets will often
642 result in a text entity with some missing record start and
643 record end characters.

645 Since there is no ambiguity, HTML user agents are encouraged
646 to infer the missing record start and end characters.

648 An HTML user agent should treat end of line in any of its
649 variations as a word space in all contexts except
650 preformatted text. Within preformatted text, an HTML user
651 agent should expect to treat any of the three common
652 representations of end-of-line as starting a new line.

654 3.3. Security Considerations

656 Anchors, embedded images, and all other elements which
657 contain URIs as parameters may cause the URI to be
658 dereferenced in response to user input. In this case, the
659 security considerations of the URI specification apply.

661 The widely deployed methods for submitting form requests --
662 HTTP and SMTP -- provide little assurance of
663 confidentiality.
Information providers who request sensitive
664 information via forms -- especially by way of the `PASSWORD'
665 type input field -- should be aware and make their users
666 aware of the lack of confidentiality.

668 >

670 4. Document Structure Elements

672 To identify information as an HTML document conforming to
673 this specification, each document should start with the
674 prologue:

676

678 NOTE - If the body of a text/html body part does not
679 begin with a document type declaration, an HTML user
680 agent should infer the above document type declaration.

682 HTML user agents are required to support the above document
683 type declaration, the following document type declarations,
684 and no others.

686
687

689 In particular, they may support other formal public
690 identifiers, or other document types altogether. They may
691 support an internal declaration subset with supplemental entity,
692 element, and other markup declarations, or they may not.

694 4.1. HTML Document Element

696 <HTML> ... </HTML> Level 0

698 The HTML document element is organized as a head and a body,
699 much like a memo or a mail message. Within the head, you can
700 specify the title and other information about the document.
701 Within the body, you can structure text into paragraphs and
702 lists, as well as highlight phrases and create links, using
703 HTML elements.

705 NOTE - The start and end tags for HTML, Head, and Body
706 elements are omissible; however, this is not
707 recommended since the head/body structure allows an
708 implementation to determine certain properties of a
709 document, such as the title, without parsing the entire
710 document.

712 <

714 4.2. Head

716 <HEAD> ... </HEAD> Level 0

718 The head of an HTML document is an unordered collection of
719 information about the document. The Title element is
720 required.

722
723 <TITLE>Introduction to HTML</TITLE>
724

726 4.3. Body

728 <BODY> ... </BODY> Level 0

730 The Body element identifies the body component of an HTML
731 document. Specifically, the body of a document may contain
732 links, text, and formatting information within <BODY> and
733 </BODY> tags.

735 4.4. Title

737 <TITLE> ... </TITLE> Level 0

739 Every HTML document must contain a Title element. The title
740 should identify the contents of the document in a global
741 context, and may be used in history lists and as a label for
742 the window displaying the document. Unlike headings, titles
743 are not rendered in the text of a document itself.

745 The Title element must occur within the head of the
746 document, and must not contain anchors, paragraph tags, or
747 highlighting. Only one title is allowed in a document.

749 NOTE - The length of a title is not limited; however,
750 long titles may be truncated in some applications. To
751 minimize this possibility, titles should be fewer than
752 64 characters. Also keep in mind that a short title,
753 such as Introduction, may be meaningless out of
754 context. An example of a meaningful title might be
755 ``Introduction to HTML Elements.''

757 4.5. Base

759 <BASE> Level 0

761 The Base element allows the URI of the document itself to be
762 recorded in situations in which the document may be read out
763 of context. URIs within the document may be in a ``partial''
764 form relative to this base address[RELURL].

766 The Base element has one attribute, HREF, which identifies
767 the absolute base URI.

769 4.6. Isindex

771 <ISINDEX> Level 0

773 The Isindex element tells the interpreter that the document
774 is an index. This means that the reader may request a
775 keyword search on the resource by adding a question mark to
776 the end of the document address, followed by a list of
777 keywords separated by plus signs.
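
The keyword-search address described for Isindex can be sketched in
Python as follows. This is only an illustration under stated
assumptions: the function name `isindex_query_url' and the sample
address are hypothetical, and percent-encoding each keyword is an
added precaution that the text above does not itself require.

```python
from urllib.parse import quote


def isindex_query_url(document_url, keywords):
    """Append '?' and a plus-separated keyword list to a document address."""
    # Assumption: each keyword is percent-encoded so that a literal '+'
    # or space inside a keyword is not mistaken for the separator.
    encoded = [quote(word, safe="") for word in keywords]
    return document_url + "?" + "+".join(encoded)


# Hypothetical document address and keywords, for illustration only.
print(isindex_query_url("http://host/dir/index", ["html", "draft"]))
```

A search for the keywords "html" and "draft" on the hypothetical
address above yields `http://host/dir/index?html+draft'.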
779 The Isindex element is usually generated by the network
780 server from which the document was obtained via a URI. The
781 server must have a search engine that supports this feature
782 for the resource. If the document URI is unknown to the
783 interpreter, <ISINDEX> must be ignored.

785 4.7. Link

787 <LINK> Level 0

789 The Link element indicates a relationship between the
790 document and some other object. A document may have any
791 number of Link elements.

793 The Link element is empty (does not have a closing tag), but
794 takes the same attributes as the Anchor element.

796 Typical uses are to indicate authorship, related indexes and
797 glossaries, older or more recent versions, etc. Links can
798 indicate a static tree structure in which the document was
799 authored by pointing to a ``parent'' and ``next'' and
800 ``previous'' document, for example.

802 Servers may also allow links to be added by those who do not
803 have the right to alter the body of a document.

805 4.8. Meta

807 <META> Level 0

809 The META element is used within the HEAD element to embed
810 document metainformation not defined by other HTML elements.
811 META elements can be extracted by servers and/or clients for
812 use in identifying, indexing, and cataloging specialized
813 document metainformation.

815 Although it is generally preferable to use named elements
816 which have well-defined semantics for each type of
817 metainformation (e.g. TITLE), the META element is provided
818 for situations where strict SGML parsing is necessary and
819 the local DTD is not extensible. HTML interpreters may use
820 the META element's content if they recognize and understand
821 the semantics identified by the NAME or HTTP-EQUIV
822 attributes, and may treat the content as metainformation
823 (and not render it) even when they do not recognize the
824 name.
826 In addition, HTTP servers may wish to read the content of
827 the document HEAD to generate header fields corresponding to
828 any elements defining a value for the attribute HTTP-EQUIV.
829 Note, however, that the method by which the server extracts
830 document metainformation is not part of this specification,
831 nor can it be assumed by authors that any given server will
832 be capable of extracting it. The META element only provides
833 an extensible mechanism for identifying and embedding
834 document metainformation -- how it may be used is up to the
835 individual server implementation and the HTML interpreter.

837 Attributes of the META element:

839 HTTP-EQUIV
840     This attribute binds the element to an HTTP header
841     field. It means that if you know the semantics of
842     the HTTP header field named by this attribute,
843     then you can process the contents based on a
844     well-defined syntactic mapping, whether or not
845     your DTD tells you anything about it. HTTP header
846     field names are not case sensitive. If not
847     present, the attribute NAME should be used to
848     identify this metainformation and the content
849     should not be used within an HTTP response header.

851 NAME
852     Metainformation name. If the NAME attribute is not
853     present, the name can be assumed to be equal to
854     the value of HTTP-EQUIV.

856 CONTENT
857     The metainformation content to be associated with
858     the given name. If multiple META elements are
859     provided with the same name, their combined
860     contents -- concatenated as a comma-separated list -- is
861     the value associated with that name.

863 Examples

865 If the document contains:

867
869
870

873 then the server (if so configured) may include the following
874 headers:

876 Expires: Tue, 04 Dec 1993 21:29:02 GMT
877 Keywords: Fred, Barney
878 Reply-to: fielding@ics.uci.edu (Roy Fielding)

880 as part of the HTTP response to a GET or HEAD request for
881 that document.
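
The mapping from META elements to response headers can be sketched in
Python as follows. This is a minimal illustration, not a method
defined by this specification: the names `MetaHeaderCollector',
`meta_to_headers', and `SAMPLE_HEAD' are hypothetical, the sample
markup is an assumption chosen to resemble the headers listed above,
and a real server may extract metainformation differently or not at
all.

```python
from html.parser import HTMLParser


class MetaHeaderCollector(HTMLParser):
    """Collect META elements that carry an HTTP-EQUIV attribute."""

    def __init__(self):
        super().__init__()
        # Lowercased field name -> (first spelling seen, list of contents).
        # Field names are compared case-insensitively, as the text states.
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        content = attributes.get("content")
        if not name or content is None:
            return  # NAME-only metainformation produces no header
        spelling, values = self.fields.setdefault(name.lower(), (name, []))
        values.append(content)


def meta_to_headers(html_text):
    """Return 'Name: value' lines, joining repeated names with commas."""
    collector = MetaHeaderCollector()
    collector.feed(html_text)
    collector.close()
    return [f"{spelling}: {', '.join(values)}"
            for spelling, values in collector.fields.values()]


# Hypothetical sample markup, for illustration only.
SAMPLE_HEAD = """
<HEAD>
<META HTTP-EQUIV="Expires" CONTENT="Tue, 04 Dec 1993 21:29:02 GMT">
<META HTTP-EQUIV="Keywords" CONTENT="Fred">
<META http-equiv="keywords" CONTENT="Barney">
<META NAME="Author" CONTENT="someone">
</HEAD>
"""

for line in meta_to_headers(SAMPLE_HEAD):
    print(line)
```

In this sketch the two Keywords elements are combined into a single
comma-separated value, and the element that has only a NAME attribute
contributes no header.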
883 When the HTTP-EQUIV attribute is not present, the server
884 should not generate an HTTP response header for the
885 metainformation; e.g.,

887

889 would never generate an HTTP response header, but would
890 still allow HTML interpreters to identify and make use of
891 that metainformation.

893 The Meta element should never be used to define information
894 that should be associated with an existing HTML element. An
895 example of an inappropriate use of the Meta element is:

897

900 Do not name an HTTP-EQUIV equal to a response header that
901 should normally only be generated by the HTTP server.
902 Example names that are inappropriate include ``Server'',
903 ``Date'', and ``Last-modified'' -- the exact list of
904 inappropriate names is dependent on the particular server
905 implementation. We recommend that servers ignore any META
906 elements which specify HTTP-equivalents which are equal
907 (case-insensitively) to their own reserved response headers.

909 4.9. Nextid

911 <NEXTID> Level 0

913 The Nextid element is a parameter read and generated by text
914 editing software to create unique identifiers. This tag
915 takes a single attribute which is the next document-wide
916 alphanumeric identifier to be allocated of the form z123:

918

920 When modifying a document, existing anchor identifiers
921 should not be reused, as these identifiers may be referenced
922 by other documents. Human writers of HTML usually use
923 mnemonic alphabetical identifiers.

925 HTML interpreters may ignore the Nextid element. Support for
926 the Nextid element does not impact HTML interpreters in any
927 way.

929 5. Character Content

931 An HTML user agent should present the body of an HTML
932 document as a collection of typeset paragraphs and
933 preformatted text. Except for the <PRE> element, each block
934 structuring element is regarded as a paragraph by taking the
935 data characters in its content and the content of its
936 descendant elements, concatenating them, and splitting the
937 result into words, separated by space, tab, or record end
938 characters (and perhaps hyphen characters). The sequence of
939 words is typeset as a paragraph by breaking it into lines.

941 5.1. The ISO Latin 1 Character Repertoire

943 The minimum character repertoire supported by all conforming
944 HTML user agents is Latin Alphabet Nr. 1, or simply Latin-1.
945 Latin-1 includes characters from most Western European
946 languages, as well as a number of control characters.
947 Latin-1 also includes a non-breaking space, a soft hyphen
948 indicator, 93 graphical characters, 8 unassigned characters,
949 and 25 control characters.

951 NOTE - Use of the non-breaking space and soft hyphen
952 indicator characters is discouraged because support for
953 them is not widely deployed.

955 In SGML applications, the use of control characters is
956 limited in order to maximize the chance of successful
957 interchange over heterogeneous networks and operating
958 systems. In HTML, only three control characters are allowed:
959 Horizontal Tab (HT, encoded as 9 decimal in US-ASCII and
960 ISO-8859-1), Carriage Return, and Line Feed.

962 The HTML DTD references the Added Latin 1 entity set, to
963 allow mnemonic representation of Latin 1 characters using
964 only the widely supported ASCII character repertoire. For
965 example:

967 Kurt G&ouml;del was a famous logician and mathematician.

969 See 11.4.2, "ISO Latin 1 Character Entity Set" for a table
970 of the ``Added Latin 1'' entities, and 14.1, "The ISO-8859-1
971 Coded Character Set" for a table of the code positions of
972 ISO-8859-1.

974 6. Data Elements

976 6.1. Line Break

978 <BR> Level 0

980 The Line Break element specifies that a new line must be
981 started at the given point. A new line indents the same as
982 that of line-wrapped text.

984 Example of use:

986 <P>Pease porridge hot<BR>
987 Pease porridge cold<BR>
988 Pease porridge in the pot<BR>
989 Nine days old.

991 6.2. Horizontal Rule

993 <HR> Level 0

995 A Horizontal Rule element is a divider between sections of
996 text such as a full width horizontal rule or equivalent
997 graphic.

999 Example of use:

1001 <HR>
1002 February 8, 1995, CERN
1003