idnits 2.17.1 draft-ietf-mimesgml-exch-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-26) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 4 longer pages, the longest (page 13) being 61 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 8 instances of lines with control characters in the document. ** The abstract seems to contain references ([5], [8], [10]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 161 has weird spacing: '...catalog and s...' == Line 171 has weird spacing: '...bsolute path ...' == Line 223 has weird spacing: '...writing syste...' == Line 712 has weird spacing: '...esolved by ei...' == Line 1087 has weird spacing: '... listed below...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 22, 1996) is 10291 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: '1' is defined on line 937, but no explicit reference was found in the text == Unused Reference: '7' is defined on line 948, but no explicit reference was found in the text == Unused Reference: '12' is defined on line 957, but no explicit reference was found in the text == Unused Reference: '14' is defined on line 962, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. '1' -- Possible downref: Non-RFC (?) normative reference: ref. '2' -- Possible downref: Non-RFC (?) normative reference: ref. '3' ** Obsolete normative reference: RFC 1808 (ref. '4') (Obsoleted by RFC 3986) -- Possible downref: Non-RFC (?) normative reference: ref. '5' -- Possible downref: Non-RFC (?) normative reference: ref. '6' ** Downref: Normative reference to an Informational RFC: RFC 1630 (ref. '7') ** Obsolete normative reference: RFC 1738 (ref. '8') (Obsoleted by RFC 4248, RFC 4266) -- Possible downref: Non-RFC (?) normative reference: ref. '10' -- Possible downref: Non-RFC (?) normative reference: ref. '11' ** Downref: Normative reference to an Informational RFC: RFC 1737 (ref. '12') ** Obsolete normative reference: RFC 1521 (ref. '13') (Obsoleted by RFC 2045, RFC 2046, RFC 2047, RFC 2048, RFC 2049) -- Possible downref: Non-RFC (?) normative reference: ref. '14' ** Obsolete normative reference: RFC 822 (ref. '15') (Obsoleted by RFC 2822) -- Possible downref: Non-RFC (?) normative reference: ref. '16' -- Possible downref: Non-RFC (?) normative reference: ref. '17' Summary: 16 errors (**), 0 flaws (~~), 11 warnings (==), 12 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 1 MIMESGML Working Group D. Stinchfield 2 INTERNET-DRAFT EBT, Inc. 3 4 Expires August 22, 1996 February 22, 1996 6 Using SGML Open Catalogs and MIME to Exchange SGML Documents 8 Status of this Memo 10 This document is an Internet-Draft. Internet-Drafts are working 11 documents of the Internet Engineering Task Force (IETF), its areas, and 12 its working groups. Note that other groups may also distribute working 13 documents as Internet-Drafts. 15 Internet-Drafts are draft documents valid for a maximum of six months 16 and may be updated, replaced, or made obsolete by other documents at 17 any time. It is inappropriate to use Internet-Drafts as reference 18 material or to cite them other than as "work in progress". 20 To learn the current status of any Internet-Draft, please check the 21 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow 22 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), 23 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or 24 ftp.isi.edu (US West Coast). 26 Distribution of this document is unlimited. Please send comments to the 27 HTTP working group at . Discussions of the 28 working group are archived at ftp://ftp.naggum.no/pub/archives/sgml- 29 internet. 31 Abstract 33 This proposal describes how SGML Open catalogs and MIME mechanisms are 34 used to exchange SGML documents on the World Wide Web, or via email. 35 Using the extension mechanism provided by Technical Resolution 36 9401:1995 (TR9401) [10]- TR9401 contains the technical description of 37 SGML Open catalogs - this proposal describes new catalog keywords 38 required for SGML document interchange. In addition, Uniform Resource 39 Locators (URL) [8] are used to allow greater flexibility in the 40 addressing of storage objects. 
This flexibility is intended to allow 41 addressing of storage objects encapsulated in MIME messages, or 42 addressable via the World Wide Web. A MIME body part containing an SGML 43 Open catalog is tagged with the content type 44 "application/sgml-open-catalog" [5]. 46 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 48 Revision History 50 Changed Application/Catalog to Application/SGML-Open-Catalog. 52 Changed the syntax of Notation and added more to the description. 54 Added ENCODING and SYSTEM keywords. 56 Removed the keywords: CHARSET, BASESET, and CAPACITY. Will assume that 57 the public identifiers used to define them in the SGML declaration will be 58 unique within the catalog. 60 Changed BASEURL to BASE and added to the definition so that BASE can 61 now be either an absolute URL or an absolute filename. Also described 62 is how multiple BASE catalog entries are used. 64 Added a better description of how system identifiers are to be handled. 66 TR9401:1995 is being referenced instead of TR9401:1994. 68 User-defined keywords are no longer required to begin with "X-" but it 69 is strongly recommended. 71 Added appendix D, public identifiers for Notations. 73 Added an attribute to Semantics called "title". 75 Added "Usage Guidelines" section. 77 Removed information that is repeated from TR9401. This includes 78 keyword descriptions and parts of the grammar. 80 Replaced Catalogs with `SGML Open Catalogs' in the title. 82 Required OVERRIDE and ENCODING to have at least one attribute in a 83 catalog entry. 85 Fixed FPI for ISONUM in appendix B. 87 Changed ISO publication numbers from "xxxx-yyyyy" to "xxxx:yyyy" 88 format. 90 Changed the name of the SYSTEM keyword to MAPSOI. 92 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 94 1. Introduction.................................................4 95 1.1 Overview....................................................4 96 2. 
Catalog Description..........................................5 97 2.1 Resolving System Identifiers................................5 98 2.2 Catalog Keywords............................................6 99 2.2.1 NOTATION..................................................7 100 2.2.2 SEMANTICS.................................................8 101 2.2.3 BASE......................................................8 102 2.2.4 OVERRIDE..................................................9 103 2.2.5 ENCODING..................................................9 104 2.2.6 MAPSOI...................................................10 105 2.2.7 User-Defined Keywords....................................10 106 2.3 Storage Object Identifiers.................................11 107 2.3.1 URLs as SOIs.............................................11 108 2.3.2 The Content-ID SOI.......................................11 109 3. Catalog Syntax..............................................11 110 4. Usage Guidelines............................................13 111 5. Examples....................................................14 112 5.1 Sending Only A Catalog.....................................14 113 5.1.1 MIME Message Content.....................................15 114 5.2 Sending a Catalog and a Document Entity....................15 115 5.2.1 MIME Message Content.....................................16 116 5.3 Sending a Catalog and All Document Components..............17 117 5.3.1 MIME Message Content.....................................17 118 5.4 Sending a Catalog for a Non-Document Entity................19 119 6. Security Considerations.....................................20 120 7. Acknowledgments.............................................20 121 8. References..................................................21 122 9. 
Authors' Address............................................21 123 Appendix A: SGML declaration Used In The Examples..............22 124 Appendix B: DTD Used In The Examples...........................24 125 Appendix C: SGML document Used In The Examples.................25 126 Appendix D: NOTATIONS..........................................27 127 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 129 1. Introduction 131 This proposal describes how SGML Open catalogs and MIME mechanisms are 132 used to exchange SGML documents on the World Wide Web, or via email. 133 Using the extension mechanism provided by Technical Resolution 134 9401:1995 (TR9401) [10]- TR9401 contains the technical description of 135 SGML Open catalogs - this proposal describes new catalog keywords 136 required for SGML document interchange. In addition, Uniform Resource 137 Locators (URL) [8] are used to allow greater flexibility in the 138 addressing of storage objects. This flexibility is intended to allow 139 addressing of storage objects encapsulated in MIME messages, or 140 addressable via the World Wide Web. A MIME body part containing an SGML 141 Open catalog is tagged with the content type "application/sgml-open- 142 catalog" [5]. 
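As an illustrative sketch only (it is not part of this proposal), the packaging described above can be approximated with Python's standard email library. The function name build_catalog_message, the sample Content-ID "<toons@example.invalid>", and the use of the library's default base64 transfer encoding are assumptions made for this example. The catalog keywords OVERRIDE, ENTITY, and MAPSOI that appear in it are described in later sections.

```python
from email.message import EmailMessage


def build_catalog_message(entity_name, entity_text, original_sysid, cid):
    """Package a catalog and one entity in a Multipart/Mixed message.

    Hypothetical helper for illustration; `cid` is a caller-supplied
    Content-ID such as "<toons@example.invalid>".
    """
    # Catalog entries that point at the body part holding the entity.
    catalog = (
        'OVERRIDE "YES"\n'
        f'ENTITY "{entity_name}" "Content-ID:{cid}"\n'
        f'MAPSOI "{original_sysid}" "Content-ID:{cid}"\n'
    )

    msg = EmailMessage()
    msg["MIME-Version"] = "1.0"

    # First body part: the catalog, tagged with the proposed content type.
    msg.add_attachment(
        catalog.encode("ascii"),
        maintype="application",
        subtype="sgml-open-catalog",
        disposition="inline",
        params={"charset": "us-ascii"},
    )

    # Second body part: the entity, addressable through its Content-ID.
    msg.add_attachment(
        entity_text.encode("ascii"),
        maintype="application",
        subtype="sgml",
        filename=original_sysid,
        cid=cid,
        params={"charset": "us-ascii"},
    )
    return msg


if __name__ == "__main__":
    example = build_catalog_message(
        "c1", "That's all folks!\n", "defc1.sgm", "<toons@example.invalid>"
    )
    print(example.as_string())
```

A receiver that understands MIME and these catalog extensions could walk the body parts, read the catalog first, and then locate the entity by matching its Content-ID header against the Content-ID SOI in the catalog.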
144 Some benefits of using SGML Open catalogs to interchange SGML documents 145 are: 147 o a client needs only a catalog to begin processing, and 148 simply fetches the components referenced in the 149 catalog as they are needed; 150 o a client that understands catalogs has a way to fetch 151 components of a document that it doesn't already have; 152 o document components do not have to be modified in order 153 to be referenced in a catalog; 154 o components of a document can be distributed across 155 many servers; 156 o catalogs do not depend on MIME and can therefore be 157 used in other packaging schemes; 158 o the impact on MIME is minimized; 159 o catalogs are an implemented, proven, and widely used technology; 160 o a document's system identifiers can be referenced 161 in a catalog and subsequently resolved by a client. 163 1.1 Overview 165 TR9401-defined catalog-keywords identify SGML document components, such 166 as an SGML Declaration, a DTD, or a document entity. Catalogs 167 containing only TR9401 catalog-keywords are useful for sharing 168 documents between applications on a single system. These same catalogs 169 are less useful for sharing documents between applications on remote 170 and heterogeneous systems. For example, a system identifier that 171 describes an absolute path to an MS-DOS file is useful on a PC but is 172 not likely to be very useful on a UNIX system. Using the extension 173 mechanism of TR9401, this proposal defines new catalog-keywords needed 174 to address this problem, and others that are encountered when 175 attempting to interchange SGML documents over the World Wide Web or 176 via email. 178 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 180 The new keywords described in this proposal include: NOTATION, BASE, 181 SEMANTICS, OVERRIDE, MAPSOI, and ENCODING. NOTATION is used to describe 182 a document's SGML NOTATION declaration [3][11]. 
OVERRIDE indicates the 183 TR9401 processing mode [10] for resolving external identifiers. BASE 184 defines an absolute URL and is used for resolving relative URLs found 185 in the catalog. SEMANTICS is used to reference semantic processing 186 information such as stylesheets. MAPSOI provides a mapping between 187 system identifiers and is useful for exchanging documents with SGML 188 systems that are not catalog aware. ENCODING describes the encoding of 189 catalog entries; it is possible for each catalog entry to have its own 190 encoding. 192 In addition to catalog keywords, this proposal defines a system-independent 193 storage object identifier (SOI) as either a URL, a relative URL, or a MIME 194 Content-ID. The usefulness of URLs and relative URLs is evident from 195 the World Wide Web. A Content-ID SOI identifies a document component 196 contained in a MIME body part [13]. Typically, a Content-ID SOI 197 describes the location of a document component within the same 198 multipart message as the catalog. 200 2. Catalog Description 202 A catalog contains zero or more catalog entries. Each entry consists 203 of a keyword, zero or more keyword attributes, and, usually, a storage 204 object identifier defined as a URL, a relative URL, or a Content-ID. 205 Relative URLs are resolved using the value of the BASE catalog entry. 206 When no BASE entry exists, relative URLs are resolved with respect to 207 the location of the catalog. 209 2.1 Resolving System Identifiers 211 An SGML system identifier contains system-specific information used for 212 locating an entity; a filename is an example of system-specific 213 information. The kinds of system identifiers supported by an SGML 214 system depend on the capabilities of its entity manager. Usually, 215 there are two entity managers involved in a document exchange. In 216 this document one of the entity managers is referred to as the sender's 217 entity manager, and the other is referred to as the receiver's entity 218 manager. 
Typically, the capabilities of the sender's entity manager are 219 different from those of the receiver's, and the sender is usually 220 unaware of the receiver's capabilities. The OVERRIDE and MAPSOI 221 keywords are used to solve this problem. (Note that sometimes, especially 222 for legacy SGML systems, this problem can only be solved by rewriting 223 the document's system identifiers. Algorithms for rewriting system 224 identifiers are beyond the scope of this document.) 225 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 227 Setting the OVERRIDE keyword to "YES" directs the receiving system to 228 use the catalog to re-map system identifiers occurring in the document. 229 For example, the following is an entity declaration with a system 230 identifier of "defc1.sgm": 232 234 When a server creates a multipart message containing entity "c1" and a 235 catalog that references it, the catalog entry for "c1" would look like 236 this: 238 OVERRIDE "YES" 239 ENTITY "c1" "Content-ID:<==toons==>" 241 A receiving SGML system that understands SGML Open catalogs, the 242 extensions proposed in this document, and MIME can simply process the 243 catalog. It is likely that multipart messages received over the World 244 Wide Web will be processed this way. For email, the mail user agent (MUA) will likely 245 save the body part identified by "Content-ID:<==toons==>" to a file, and a 246 helper application will rewrite the catalog to reflect the new location 247 of "c1": 249 OVERRIDE "YES" 250 ENTITY "c1" "c:\tmp\blurt.it" 252 It is useful for the helper application to save the body part using the 253 original SOI of "c1". In this example, doing so would relieve the 254 receiver's SGML system from having to process the catalog. The MAPSOI 255 keyword facilitates this. 
For example, the aforementioned catalog 256 contained in the multipart message can be rewritten to look like this: 258 OVERRIDE "YES" 259 ENTITY "c1" "Content-ID:<==toons==>" 260 MAPSOI "defc1.sgm" "Content-ID:<==toons==>" 262 The contents of the entity named "c1" are found in 263 "Content-ID:<==toons==>". The original system identifier for 264 "Content-ID:<==toons==>" is "defc1.sgm". The receiver's helper application can 265 use this information to save the entity using the original SOI, 266 defc1.sgm, defined for it in the document. 268 2.2 Catalog Keywords 270 A catalog contains entries for SGML document components. The order of 271 the entries is important for the OVERRIDE, ENCODING, and BASE keywords. 272 All entries are optional. A catalog can contain multiple entries with 273 the same keyword. The following keywords are defined in TR9401 [10]: 275 SGMLDECL - SGML declaration 276 DOCUMENT - SGML document entity 278 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 280 DOCTYPE - Document type declaration (DTD) 281 LINKTYPE - link type name 282 PUBLIC - public external identifier 283 DTDDECL - SGML declaration plus public identifier meant to 284 match a public identifier given as part of the 285 doctype declaration to reference the external subset. 286 ENTITY - entity name 288 The following keywords are defined in this document using the grammar 289 notation of TR9401: 291 NOTATION - notation name 292 SEMANTICS - name, type, and title of the semantic information 293 BASE - base URL 294 OVERRIDE - defines which TR9401 processing mode to use 295 ENCODING - character encoding 296 MAPSOI - maps a system identifier from a document to a system 297 identifier in the catalog 298 X- - user-defined keyword prefix 300 2.2.1 NOTATION 302 The NOTATION catalog keyword refers to data content notations defined 303 or referenced in a document. The syntax for NOTATION is: 305 notation = ("NOTATION", ps+, 306 notation_name, 307 (ps+, storage_object_identifier)? 
308 ) 310 The storage object identifier is optional for NOTATION. The following 311 example illustrates how NOTATION could be used for Java scripts: 313 314 315 317 319 input to JuggleBalls script, for example: specify number 320 of items and juggling style. 322 A catalog entry for the NOTATION declaration described above would look 323 like: 325 NOTATION "JuggleBalls" "http://www.bill.com/juggleballs.java" 327 The processing of notations is system dependent: there is no way for a 328 server to guarantee that a client can process a specific notation. The 329 NOTATION keyword in the catalog may only give a hint, possibly a 330 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 332 pointer, to a notation processor. It can be dangerous for the client 333 to resolve a reference to a notation processor, because this may mean loading and running a 334 notation processor received from a remote, and potentially unsecured, 335 site. However, in a secure environment, exchanging 336 scripts may be perfectly safe. See appendix D for some example 337 notations. 339 2.2.2 SEMANTICS 341 There may be semantic information such as stylesheets associated with a 342 document. Semantic information is not required for parsing the 343 document and can be ignored by the client. However, it is often 344 desirable for a client to access appropriate semantic specifications. 345 The syntax for the SEMANTICS keyword is: 347 semantics = ("SEMANTICS", ps+, 348 semantic_name, ps+, 349 semantic_type, ps+, 350 semantic_title, ps+, 351 storage_object_identifier) 353 Here are two examples: 355 SEMANTICS "large-print" "DSSSL" "Wicked Large Print" 356 "http://www.bill.com/style/large.sty" 358 SEMANTICS "toc" "DSSSL" "Table of Contents" "toc.sty" 360 2.2.3 BASE 362 SOIs may be relative. Relative SOIs can be resolved using a BASE 363 keyword catalog entry. (Relative URLs and their resolution are 364 discussed in [4].) There can be more than one BASE keyword in a 365 catalog. 
Relative URLs are resolved with respect to the closest 366 previously specified BASE keyword; an example follows the syntax 367 definition. If no BASE entry applies to a catalog entry, then the URL 368 of the catalog is used for relative URL resolution. The syntax for 369 the BASE keyword is: 371 base = ("BASE", ps+, storage_object_identifier) 373 Here's an example of how BASE is used: 375 ENTITY "Legal" "legal.sgm" 376 BASE "http://www.bill.com/docs/memo/mine/" 377 ENTITY "MyEnding" "ending.sgml" 379 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 381 The entity "Legal" is resolved relative to the URL of the catalog. The 382 absolute URL for entity "MyEnding" is 383 "http://www.bill.com/docs/memo/mine/ending.sgml". 385 2.2.4 OVERRIDE 387 The OVERRIDE keyword defines which TR9401 processing mode the SGML 388 system's entity manager will use to resolve external identifiers. When 389 OVERRIDE is "YES" the entity manager uses the catalog to resolve 390 external identifiers, whether or not system identifiers are 391 defined for them in the document. When OVERRIDE is "NO" the entity 392 manager uses the system identifiers found in the document when 393 resolving references to external identifiers. The value of OVERRIDE is 394 "NO" at the beginning of the catalog. 396 There can be more than one OVERRIDE keyword in a catalog. The OVERRIDE 397 value that applies to an entry is the closest previously specified one. 399 The syntax for the OVERRIDE keyword is: 401 override = ("OVERRIDE", ps+, mode) 403 mode = (LIT, "YES",LIT) | 404 (LITA,"YES",LITA) | 405 (LIT, "NO", LIT) | 406 (LITA,"NO", LITA) 408 Here's an example: 410 ENTITY "Legal" "legal.sgm" 411 OVERRIDE "YES" 412 ENTITY "MyEnding" "ending.sgml" 414 In this example, OVERRIDE is "NO" for the entity "Legal" and "YES" for 415 the entity "MyEnding". 417 2.2.5 ENCODING 419 The ENCODING keyword provides a way to include entities in different 420 encodings within a single document. 
The syntax of the ENCODING keyword 421 is: 423 encoding = ( "ENCODING", ps+, encode_spec ) 425 The ENCODING keyword indicates the encoding of catalog entries that 426 follow. There can be more than one ENCODING entry in a catalog. When 427 an ENCODING entry is found it supersedes the value of any preceding 428 ENCODING entry. 430 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 432 ISO-646 [17] is the default when no ENCODING keyword is specified. 433 ISO-646 was selected as the default value because of the special role 434 it plays in SGML: ISO-646 is the syntax-reference character set of the 435 reference concrete syntax [3], pg 476. 437 For example, the following catalog describes catalog entries, each with 438 a different encoding: 440 ENCODING "SHIFT-JIS" 441 DOCUMENT "http://www.goeast.com/anaxi.sgm" 442 ENCODING "ISO-10646-UTF7" 443 DOCTYPE "MEMO" "http://www.gowest.com/dtds/memo.dtd" 444 ENCODING "SHIFT-JIS" 445 ENTITY "MyEnding" "http://www.goeast.com/ending.sgml" 446 ENTITY "Legal" "http://www.goeast.com/company/legal.sgm" 448 The document entity and the entities MyEnding and Legal are encoded in 449 SHIFT-JIS while the DTD is encoded in ISO-10646-UTF7. 451 2.2.6 MAPSOI 453 The MAPSOI keyword is used to map an original system identifier found 454 in a document to the SOI used for it in the catalog. The MAPSOI 455 keyword is similar to the SYSTEM keyword used by nsgmls [16]. The 456 syntax for the MAPSOI keyword is: 458 mapsoi = ("MAPSOI", 459 original_soi, 460 effective_soi 461 ) 463 For an example of how MAPSOI is used refer to the previous section 464 entitled "Resolving System Identifiers". 466 2.2.7 User-Defined Keywords 468 It is strongly recommended that user-defined keywords begin with "X-". 469 This allows the catalog-parser to easily determine if a keyword is a 470 user-defined keyword. 472 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 474 2.3 Storage Object Identifiers 476 Three types of SOIs are defined in this section. 
The first defines an 477 SOI in terms of URLs. The second defines an SOI in terms of a MIME 478 Content-ID. And the last defines an SOI using TR9401's definition of 479 an SOI. 481 The syntax for an SOI is: 483 storage object identifier = 484 url object identifier | 485 content id object identifier | 486 TR9401 storage object identifier 488 2.3.1 URLs as SOIs 490 An SOI can be a URL [8] or a relative URL [4]. The SOI will need to be 491 parsed to determine the SOI type. 492 2.3.2 The Content-ID SOI 494 A Content-ID SOI specifies a MIME message body part. Note, the 495 Content-ID SOI is expected to be replaced in the future with the cid 496 URL [8]. The syntax for Content-ID based SOI is: 498 content id object identifier = 499 "Content-ID" ":" msg-id ; as defined in RFC 1521 [13] 500 msg-id = as defined in RFC 822 [15] 502 3. Catalog Syntax 504 catalog = 505 ( ps*, ( (catalog_entry | user_defined), ps+ )* ) 507 catalog_entry = 508 TR9401:1995_keywords | 509 notation | 510 semantics | 511 base | 512 override | 513 mapsoi | 514 encoding 516 TR9401:1995_keywords = refer to TR9401 [10] for a description of 517 keywords for sgmldecl, document, doctype, 518 public, entity, and linktype 520 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 522 notation = ("NOTATION", ps+, 523 notation_name, 524 (ps+, storage_object_identifier)? 
525 ) 527 notation_name = entity_name_spec 529 entity_name_spec = as defined in TR9401 [10] 531 semantics = ("SEMANTICS", ps+, 532 semantic_name, ps+, 533 semantic_type, ps+, 534 semantic_title, ps+, 535 storage_object_identifier) 537 semantic_name = entity_name_spec 539 semantic_type = entity_name_spec 541 semantic_title = entity_name_spec 543 base = ("BASE", ps+, storage_object_identifier) 545 override = ("OVERRIDE", ps+, mode) 547 mode = (LIT, "YES",LIT) | 548 (LITA, "YES",LITA) | 549 (LIT, "NO", LIT) | 550 (LITA, "NO", LITA) 552 encoding = ( "ENCODING" , ps+, encode_spec ) 554 encode_spec = entity_name_spec 556 mapsoi = ("MAPSOI", ps+, 557 original_soi, ps+, 558 effective_soi 559 ) 561 original_soi = storage_object_identifier 563 effective_soi = storage_object_identifier 565 user_defined = ("X-", keyword) 566 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 568 storage_object_identifier = 569 url_object_identifier | 570 content_id_object_identifier | 571 TR9401_storage_object_identifier 573 url_object_identifier = 574 as defined in RFC 1738[8] 576 content_id_object_identifier = 577 "Content-ID" ":" msg-id 579 msg-id = as defined in RFC 822 [15] 581 TR9401_storage_object_identifier = 582 "storage object identifier" as defined in TR9401 [10] 584 keyword = as defined by TR9401 586 ps = as defined in TR9401 [10] 588 4. Usage Guidelines 590 There are some ambiguities in TR9401 which can be easily avoided by 591 adhering to these guidelines: 593 quote all keyword-attributes using either single or double quotes. 594 This allows the parser to determine what is an attribute, and what 595 is not, following a user-defined keyword; 597 surround comments with whitespace. In other words, don't start 598 a comment just after a token, doing so can lead to parsing 599 ambiguities. 601 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 603 5. 
Examples 605 The SGML document used in all the examples is composed of the following 606 components: 608 o an SGML declaration, defined in Appendix A; 609 o a Document type declaration (DTD), defined in Appendix B; 610 o an SGML document entity, defined in Appendix C; 611 o two SGML entities, defined in Appendix C; 612 o a figure entity, not defined in this draft. 614 In all examples the components of this SGML document are spread across 615 multiple servers, except for the example entitled "Sending a Catalog 616 and All Document Components". 618 Each example defines its own unique catalog. The catalog varies from 619 example to example depending on the number of document components sent 620 along with it. The sender decides how many components to include in the 621 MIME message. 623 A document component that is not included in the MIME message can be 624 resolved in one of two ways: 1) the client requests the component from 625 its cache (that is, the component was fetched while processing a 626 previous request), or 2) the client requests the component using the SOI 627 defined for it in the catalog. 629 The definitions for the following external identifiers are not included 630 in this document: 632 formal public identifiers: 634 ISO 646:1983//CHARSET 635 International Reference Version (IRV)//ESC 2/5 4/0 636 ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN 638 system identifier: 639 "../style/all.sty" - DSSSL style sheet 641 5.1 Sending Only A Catalog 643 In this example only the catalog is sent to the client. When the 644 client's SGML system is capable of handling SGML Open catalogs, along 645 with the extensions proposed in this document, the catalog can be 646 passed, without modification, to the client's SGML system. Some 647 pre-processing may be required when the client's SGML system cannot handle 648 these kinds of catalogs. 
For example, a catalog-aware SGML system that 649 does not understand the keywords proposed in this document, or does not understand URLs, 650 would need a pre-processor to do the following: 652 1. fetch all of the components referenced in the catalog, 654 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 656 2. store the components locally, and 657 3. update the catalog to reflect the new locations. 659 Here are some reasons for sending just the catalog: 661 o the server might only store catalogs, that is, the server does 662 not store any document components; 663 o the client may have requested only the catalog. Perhaps 664 the client wants to compare the contents of this catalog 665 with the contents of a different catalog. Or maybe the 666 client already has most, if not all, of the document's 667 components cached; 668 o the server may want to keep network traffic down by increasing 669 the likelihood that the client will get a cache hit on 670 catalog entries. 672 5.1.1 MIME Message Content 674 MIME-Version: 1.0 675 Content-Type: Application/SGML-Catalog; charset=us-ascii 677 SGMLDECL "http://www.ebt.com/decl/ebtsgml.dcl" 678 OVERRIDE "YES" 679 PUBLIC "ISO 646:1983//CHARSET 680 International Reference Version (IRV)//ESC 2/5 4/0" 681 "http://www.iso.ch/charset/6461983.cha" 682 PUBLIC "ISO Registration Number 100//CHARSET 683 ECMA-94 Right-hand Part of Latin Alphabet Nr.1//ESC 2/13 4/1" 684 "http://www.iso.ch/charset/ecma94.cha" 685 PUBLIC "-//EBT//CAPACITY CoolCaps 1.0//" 686 "http://www.ebt.com/decl/coolcaps.cap" 687 PUBLIC "-//EBT//SYNTAX SinSyn 0.1//" 688 "http://www.ebt.com/decl/syntax/sinsyn.syn" 689 BASE "http://www.bill.com/docs/memo/mine/" 690 DOCUMENT "anaxi.sgm" 691 DOCTYPE "MEMO" "../../dtds/memo.dtd" 692 PUBLIC "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN" 693 "http://www.wcs.com/usr/wcs/isonum.ent" 694 ENTITY "%ISOnum" "http://www.wcs.com/usr/wcs/isonum.ent" 695 ENTITY "MyEnding" "ending.sgml" 696 ENTITY "Legal" "../company/legal.sgm" 697 SEMANTICS 
"large-print" "DSSSL" "../style/all.sty" 699 5.2 Sending a Catalog and a Document Entity 701 This example describes how to send a catalog and a document entity 702 using a Multipart message. This is a likely scenario for Web-based 703 browsers where simultaneous rendering and resolution of external 704 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 706 identifiers are necessary. For this example the server assumes the 707 client already has the SGML declaration, the DTD, and a stylesheet - 708 if this is not the, the client can easily request them. The document 709 entity will likely contain enough text for a browser to render 710 meaningful text on the display device, but it won't include the many 711 entities that the text may link to. These external objects, like 712 figures, can be resolved by either the entity manager, while the 713 application is rendering the text, or on user demand, as for 714 hyperlinked information. 716 5.2.1 MIME Message Content 718 MIME-Version: 1.0 719 Content-Type: Multipart/Mixed; boundary=let-go-of-my-leg; 721 --let-go-of-my-leg 722 Content-Type: Application/SGML-Catalog; charset=us-ascii 724 SGMLDECL "http://www.ebt.com/decl/ebtsgml.dcl" 725 OVERRIDE "YES" 726 PUBLIC "ISO 646:1983//CHARSET 727 International Reference Version (IRV)//ESC 2/5 4/0" 728 "http://www.iso.ch/charset/6461983.cha" 729 PUBLIC "ISO Registration Number 100//CHARSET 730 ECMA-94 Right-hand Part of Latin Alphabet Nr.1//ESC 2/13 4/1" 731 "http://www.iso.ch/charset/ecma94.cha" 732 PUBLIC "-//EBT//CAPACITY CoolCaps 1.0//" 733 "http://www.ebt.com/decl/coolcaps.cap" 734 PUBLIC "-//EBT//SYNTAX SinSyn 0.1//" 735 "http://www.ebt.com/decl/syntax/sinsyn.syn" 736 BASE "http://www.bill.com/docs/memo/mine/" 737 DOCUMENT "Content-ID:" 738 DOCTYPE "MEMO" "../../dtds/memo.dtd" 739 PUBLIC "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN" 740 "http://www.wcs.com/usr/wcs/isonum.ent" 741 ENTITY "%ISOnum" "http://www.wcs.com/usr/wcs/isonum.ent" 742 ENTITY "MyEnding" 
"ending.sgml" 743 ENTITY "Legal" "../company/legal.sgm" 744 SEMANTICS "large-print" "DSSSL" "../style/all.sty" 745 MAPSOI "anaix.sgm" "Content-ID:" 747 --let-go-of-my-leg 748 Content-Type: Application/SGML; charset=us-ascii 749 Content-ID: 750 Content-Disposition: attachment; filename="anaxi.sgm" 752 include document entity from Appendix C 753 --let-go-of-my-leg-- 754 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 756 5.3 Sending a Catalog and All Document Components 758 Sending a catalog and all of a document's components at once is done 759 using a Multipart message. 761 5.3.1 MIME Message Content 763 MIME-Version: 1.0 764 Content-Type: Multipart/Mixed; boundary=go-speed-racer 766 --go-speed-racer 767 Content-Type: Application/SGML-Catalog; charset=us-ascii 769 SGMLDECL "Content-ID:" 770 OVERRIDE "YES" 771 PUBLIC "ISO 646:1983//CHARSET 772 International Reference Version (IRV)//ESC 2/5 4/0" 773 "Content-ID:" 774 PUBLIC "ISO Registration Number 100//CHARSET 775 ECMA-94 Right-hand Part of Latin Alphabet Nr.1//ESC 2/13 4/1" 776 "Content-ID:" 777 PUBLIC "-//EBT//CAPACITY CoolCaps 1.0//" 778 "Content-ID:" 779 PUBLIC "-//EBT//SYNTAX SinSyn 0.1//" 780 "Content-ID:" 781 DOCUMENT "Content-ID:" 782 DOCTYPE "MEMO" "Content-ID:" 783 PUBLIC "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN" 784 "Content-ID:" 785 ENTITY "%ISOnum" "Content-ID=" 786 ENTITY "MyEnding" "Content-ID:" 787 ENTITY "Legal" "Content-ID:" 788 SEMANTICS "large-print" "DSSSL" "Content-ID:" 789 MAPSOI "anaix.sgm" "Content-ID:" 790 MAPSOI "/usr/wcs/dtd/memo.dtd" "Content-ID:" 791 MAPSOI "/usr/des/ending.sgm" "Content-ID:" 793 --go-speed-racer 794 Content-Type:Application/SGML; charset=us-ascii 795 Content-ID:"" 797 description of SGML declaration from Appendix A is included here 799 --go-speed-racer 800 Content-Type:Application/SGML; charset=us-ascii 801 Content-ID:"" 803 ISO 646 character set definition included here 804 INTERNET-DRAFT SGML Open Catalogs and MIME 2/21/96 806 --go-speed-racer 807 
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:""

   description of Capacity from Appendix A is included here

   --go-speed-racer
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:""

   description of Syntax from Appendix A is included here

   --go-speed-racer
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:""

   Contents of ISO Registration Number 100//CHARSET ECMA-94 Right-hand
   Part of Latin Alphabet Nr.1//ESC 2/13 4/1 included here

   --go-speed-racer
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:
   Content-Disposition: attachment; filename="anaxi.sgm"

   include Document entity as described in Appendix C

   --go-speed-racer
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:
   Content-Disposition: attachment; filename="memo.dtd"

   include DTD as described in Appendix B

   --go-speed-racer
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:

   ISO 8879:1986 Entity set included here

   --go-speed-racer
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:

   include entity set defined for %ISOnum

   --go-speed-racer
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:
   Content-Disposition: attachment; filename="ending.sgm"

   include entity MyEnding as described in Appendix C

   --go-speed-racer
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:

   include entity Legal as described in Appendix C

   --go-speed-racer
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:

   included here is a bunch of DSSSL-Lite

   --go-speed-racer--

5.4 Sending a Catalog for a Non-Document Entity

This example describes what a server might send in response to a
request for a non-document entity. All of the previous examples assume
that the original request was for a document entity.
SGML documents
can get very deep and contain a large number of external identifier
references. Likewise, the complete catalog for a document could get
very large: a "complete catalog" contains all of the external
identifiers referenced in all of a document's entities. There is no
need to send the complete catalog. All that's needed are enough entries
in the catalog for the client system to resolve references declared in
the entity being transferred. For example, Appendix C.2 defines an
entity called "Legal" which includes a reference to an entity called
"MyEnding". A request for "Legal" would result in a Multipart/Mixed
message that looks like this:

   MIME-Version: 1.0
   Content-Type: Multipart/Mixed; boundary=let-go-of-my-leg

   --let-go-of-my-leg
   Content-Type: Application/SGML-Catalog; charset=us-ascii

   BASE "http://www.bill.com/docs/memo/mine/"
   OVERRIDE "YES"
   ENTITY "Legal" "Content-ID:"
   ENTITY "MyEnding" "ending.sgml"

   --let-go-of-my-leg
   Content-Type: Application/SGML; charset=us-ascii
   Content-ID:

   include entity "Legal" from Appendix C.2

   --let-go-of-my-leg--

6. Security Considerations

SGML documents, like other compound documents, may contain entities
whose media-types present security concerns, e.g.
Application/PostScript. Further, SGML may contain explicit processing
instructions for a presentation or composition system; use of such
instructions presents concerns similar to those of
Application/PostScript.

The use of active media-types with Notation declarations can provide an
opportunity for the sender to execute a script or other code on the
recipient's machine.

7. Acknowledgments

Thanks to Andre Alguero, Jeff Cutler-Stamm, Steve DeRose, Chris Maden,
Gavin Nicol, and Bill Smith for helping me with the content and
structure of this document.
Thanks to Martin Bryan, James Clark, John
Klensin, and Ed Levinson for the many discussions and debates that
helped me to clarify, I hope, many of the ideas contained in this
document. Thanks also to Wayne Wohler of IBM for his help on SGML
declarations.

8. References

[1]  Wayne Wohler, "SGML declarations",
     http://www.sil.org/sgml/wlw11.html
[2]  Eric van Herwijnen, "Practical SGML", Second Edition,
     Kluwer Academic Publishers, 1994, ISBN 0-7923-9434-8
[3]  Charles F. Goldfarb, "The SGML Handbook",
     Oxford University Press, 1994, ISBN 0-19-853737-9
[4]  R. Fielding, "Relative Uniform Resource Locators", RFC 1808,
     ftp://ds.internic.net/rfc/rfc1808.txt
[5]  P. Grosso, "The Application/SGML-Open-Catalog Content Type", RFC
[6]  Daniel W. Connolly, HTML 2.0 SGML declaration found at
     http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html.decl
[7]  T. Berners-Lee, "Universal Resource Identifiers in WWW:
     A Unifying Syntax for the Expression of Names and Addresses of
     Objects on the Network as used in the World-Wide Web", RFC 1630
[8]  T. Berners-Lee, L. Masinter, and M. McCahill,
     "Uniform Resource Locators (URL)", RFC 1738
[10] Paul Grosso, "Entity Management", SGML Open
     Technical Resolution 9401:1995 (Amendment 1 to TR9401),
     http://www.sgmlopen.org/sgml/docs/library/9401.htm
[11] Charles F. Goldfarb, "Entity Management in SGML", 11/30/93
[12] K. Sollins and L. Masinter,
     "Functional Requirements for Uniform Resource Names", RFC 1737
[13] N. Borenstein, "MIME (Multipurpose Internet Mail Extensions)
     Part One: Mechanisms for Specifying and Describing the Format of
     Internet Message Bodies", RFC 1521
[14] "ISO 8879:1986 Information processing - Text and office systems -
     Standard Generalized Markup Language (SGML)",
     Geneva, 15 October 1986
[15] D. H.
     Crocker, "Standard for the Format of ARPA Internet Text
     Messages", RFC 822
[16] J. Clark, "nsgmls - a validating sgml parser",
     http://www.jclark.com/nsgmls.txt
[17] ISO 646 - ISO 7-bit coded character set for information
     interchange

9. Author's Address

   Don Stinchfield
   Electronic Book Technologies, Inc.
   One Richmond Square
   Providence, RI 02906
   (401) 421-9550 x280
   des@ebt.com

Appendix A: SGML declaration Used In The Examples

This Appendix contains the definitions for the SGML declaration, for
the CAPACITY parameter, and for the SYNTAX parameter. The SGML
declaration is a modified version of the one used for HTML 2.0 [6]; I
changed the CAPACITY and SYNTAX declarations so that they reference
public identifiers. The following external identifiers are referenced
in the SGML declaration:

   o BASESET "ISO 646:1983//CHARSET
              International Reference Version (IRV)//ESC 2/5 4/0"
   o BASESET "ISO Registration Number 100//CHARSET
              ECMA-94 Right-hand Part of Latin Alphabet Nr.1//ESC
              2/13 4/1"
   o CAPACITY PUBLIC "-//EBT//CAPACITY CoolCaps 1.0//"
   o SYNTAX PUBLIC "-//EBT//SYNTAX SinSyn 0.1//"

A.1 SGML declaration

Appendix B: DTD Used In The Examples

The DTD listed below is a modified version of the one found on page
33 of Eric van Herwijnen's book called "Practical SGML" [2]. The
following external identifier is used in the DTD:

The above definition is for a parameter entity and it contains both a
public identifier and a system identifier. The examples have both in
the catalog.
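
The following is a minimal sketch in Python, not part of this draft, of
how a client might look up the %ISOnum parameter entity when the
catalog carries both a PUBLIC and an ENTITY entry for it. The catalog
fragment is copied from section 5.1.1. The function names
(parse_catalog, resolve) and the first-match-wins rule are assumptions
made for illustration; OVERRIDE, BASE and the other keywords are
ignored, so this does not reproduce the full SGML Open resolution
rules.

```python
import re

# Catalog fragment copied from the example in section 5.1.1.
CATALOG = '''
PUBLIC "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN"
       "http://www.wcs.com/usr/wcs/isonum.ent"
ENTITY "%ISOnum" "http://www.wcs.com/usr/wcs/isonum.ent"
ENTITY "MyEnding" "ending.sgml"
'''

# Number of quoted arguments per keyword; only the two keywords
# needed for this illustration are handled.
ARITY = {"PUBLIC": 2, "ENTITY": 2}


def normalize(public_id):
    """Collapse whitespace runs so multi-line public identifiers compare equal."""
    return " ".join(public_id.split())


def parse_catalog(text):
    """Return a list of (keyword, [args]) for the keywords listed in ARITY."""
    tokens = []
    for m in re.finditer(r'"([^"]*)"|(\S+)', text):
        quoted = m.group(1) is not None
        tokens.append((quoted, m.group(1) if quoted else m.group(2)))
    entries = []
    i = 0
    while i < len(tokens):
        quoted, value = tokens[i]
        n = None if quoted else ARITY.get(value.upper())
        if n is None:
            i += 1  # skip anything this sketch does not understand
            continue
        entries.append((value.upper(), [v for _, v in tokens[i + 1:i + 1 + n]]))
        i += 1 + n
    return entries


def resolve(entries, public_id=None, entity_name=None):
    """Return the storage location of the first matching entry, else None.

    First match wins: an assumption of this sketch, not the draft's rule.
    """
    wanted = normalize(public_id) if public_id is not None else None
    for keyword, args in entries:
        if keyword == "PUBLIC" and wanted is not None \
                and normalize(args[0]) == wanted:
            return args[1]
        if keyword == "ENTITY" and entity_name is not None \
                and args[0] == entity_name:
            return args[1]
    return None


entries = parse_catalog(CATALOG)
by_public = resolve(
    entries,
    public_id="ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN")
by_name = resolve(entries, entity_name="%ISOnum")
```

In this fragment both lookups lead to the same storage object, so a
client can find the entity set whether it keys on the public identifier
or on the entity name.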

B.1 DTD

%ISOnum;

Appendix C: SGML document Used In The Examples

The SGML document defined in this appendix is broken up into 3 parts:
an SGML document entity and two SGML entities. The SGML document
entity contains references to external identifiers in the DOCTYPE and
ENTITY declarations:

   o This one contains both a public identifier and a
     system identifier:

   o This ENTITY declaration has a system identifier and a
     system identifier parameter:

   o This one specifies a system identifier without specifying a
     system identifier parameter (this is provided for in the SGML
     Standard for implementers that want to resolve system
     identifiers from the entity name alone [3, p378]):

C.1 SGML document entity

] >

Anaximander
Cool Papa Shad

&Legal;

Yo Anax, you've got a bizarre name!


&MyEnding;

C.2 Entity Named "Legal"

If you or anyone you know tries to read this email then you're in
really big trouble!


You know this is the end of the document when you see &MyEnding;

C.3 Entity Named "MyEnding"

Regards, Don

Appendix D: NOTATIONS

D.1 Useful Notations

For Tcl I have taken the ISBN number from John K. Ousterhout's book
"Tcl and the Tk Toolkit" to create a Formal Public Identifier: