Network Working Group                                          L. Romary
Internet-Draft                                  TEI Consortium and INRIA
Intended status: Informational                               S. Lundberg
Expires: May 16, 2011                      The Royal Library, Copenhagen
                                                       November 12, 2010

                  The 'application/tei+xml' mediatype
                     draft-lundberg-app-tei-xml-09

Abstract

   This document defines the 'application/tei+xml' media type for markup
   languages defined in accordance with the Text Encoding and
   Interchange guidelines.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 16, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Recognizing TEI files
   3.  Fragment identifier
   4.  Security considerations
     4.1.  Harmful content
     4.2.  Intellectual Property Rights
     4.3.  Authenticity and confidentiality
   5.  IANA Considerations
     5.1.  Registration of MIME type 'application/tei+xml'
   6.  References
     6.1.  Normative References
     6.2.  Informative References
   Authors' Addresses

1.  Introduction

   Text Encoding and Interchange (TEI) is an international and
   interdisciplinary standard that is widely used by libraries, museums,
   publishers, and individual scholars to represent all kinds of textual
   material for online research and teaching.[TEI]

   This document defines the 'application/tei+xml' media type in
   accordance with [RFC3023] in order to enable generic processing of
   such documents on the Internet using eXtensible Markup Language
   (XML)[W3C.REC-xml-20081126] technologies.

2.  Recognizing TEI files

   TEI files are XML documents or fragments having the root element (as
   defined in [W3C.REC-xml-20081126]) in a TEI namespace.  TEI namespace
   names are defined as Uniform Resource Identifiers (URIs) [RFC3986]
   in accordance with [W3C.REC-xml-names-20091208] and begin with
   http://www.tei-c.org/ns/ followed by the version number of the
   namespace.  The current namespace is http://www.tei-c.org/ns/1.0

   The most common root element names for TEI documents are

      <TEI>

      <teiCorpus>

   The <teiCorpus> documents give the possibility to bundle multiple
   documents into a single file.

   Examples:

      A document having root element <TEI>

      <?xml version="1.0" encoding="UTF-8"?>
      <TEI xmlns="http://www.tei-c.org/ns/1.0">
        <teiHeader>
          ...
        </teiHeader>
        <text>
          ...
        </text>
      </TEI>

      A document having root element <teiCorpus>

      <?xml version="1.0" encoding="UTF-8"?>
      <teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
        <teiHeader>
          ...
        </teiHeader>
        <TEI>
          <teiHeader>
            ...
          </teiHeader>
          <text>
            ...
          </text>
        </TEI>
        <TEI>
          ... second document ...
        </TEI>
        <TEI>
          ... third document ...
        </TEI>
      </teiCorpus>

   TEI and teiCorpus files are often given the extensions .tei and
   .teiCorpus, respectively.  There is a third type of file, which is
   often given the suffix .odd.  ODD ('One Document Does it All') is a
   TEI XML document which includes schema fragments, prose
   documentation, and reference documentation.
   It is used for the definition and documentation of XML-based
   languages, and primarily for the TEI Guidelines.[ODD]  In other
   words, ODD files do not differ from other TEI files in syntax, only
   in function.

3.  Fragment identifier

   Documents having the media type 'application/tei+xml' use the
   fragment identifier notation as specified in [RFC3023] for the media
   type 'application/xml'.

4.  Security considerations

   An XML resource does not in itself compromise data security.  When
   it is made available on a network simply through the dereferencing
   of an Internationalized Resource Identifier (IRI) [RFC3987] or a
   URI, care must be taken to properly interpret the data to prevent
   unintended access.  Hence the security issues of [RFC3986], section
   7, apply.  In addition, as this media type uses the "+xml"
   convention, it shares the same security considerations as described
   in RFC 3023 [RFC3023], section 10.  In general, security issues
   related to the use of XML in IETF protocols are treated in RFC 3470
   [RFC3470], section 7.  We will not try to duplicate this material,
   but review some aspects that are important for document-centric XML
   as applied to text encoding.

4.1.  Harmful content

   Any application that accepts submitted TEI XML, or retrieves it, for
   processing has to be aware of the risks connected with injection of
   harmful scripts and executable XML.  XML inclusion
   [W3C.REC-xinclude-20061115] and the use of external entities are
   vulnerable to various forms of spoofing and can also reveal aspects
   of a service in a way that may compromise its security.  Any
   vulnerabilities of these kinds are, however, application specific.
   The TEI namespaces do not contain such elements.

4.2.  Intellectual Property Rights

   TEI documents often arise in digitization of cultural heritage
   materials.
   Texts made accessible in TEI format may be unrestricted in the sense
   that their distribution may be unlimited by Digital Rights
   Management[DRM] or Intellectual Property Rights[IPR] constraints.
   However, TEI documents are heterogeneous.  Some parts of a document
   may be unrestricted, whereas others, such as editorial text and
   annotations, may be subject to DRM restrictions.

   The TEI format provides means for highly granular attribution, down
   to the content of individual XML elements.  Software agents
   participating in the exchange or processing of TEI may be required
   to honour markup of this kind.  Even when there are no IPR
   constraints, intellectual property attribution alone requires that
   document users are able to tell the difference between content from
   different sources.

4.3.  Authenticity and confidentiality

   Historical archival records are often encoded in TEI, and legal
   documents may be binding centuries after they were written.
   Digitization and encoding of legal texts may require technologies for
   assuring authenticity, such as cryptographic checksums and
   electronic signatures.

   Similarly, historical documents may in part or in their entirety be
   confidential.  This may be required by law or by terms and
   conditions, as in the case of donated or deposited texts from
   private sources.  A text archive may need content filtering or
   cryptographic technologies to meet such requirements.

5.  IANA Considerations

5.1.  Registration of MIME type 'application/tei+xml'

   MIME media type name:  application

   MIME subtype name:  tei+xml

   Required parameters:  None

   Optional parameters:  charset

      The parameter has identical semantics to the charset parameter
      of the "application/xml" media type as specified in RFC 3023
      [RFC3023].

   Encoding considerations:

      Identical to those for 'application/xml'.  See RFC 3023
      [RFC3023], Section 3.2.

   Security considerations:

      See Security considerations (Section 4) in this specification.

   Interoperability considerations:

      TEI documents are often given the extension '.xml', which is
      not uncommon for other XML document formats.

   Published specification:

      This media type registration is for TEI documents[TEI] as
      described here.  TEI syntax is defined in a schema.[TEIschema]

   Applications which use this media type:

      There are currently no known applications using the media type
      'application/tei+xml'.

   Additional information:

      Magic number(s):

         There is no single initial octet sequence that is always
         present in TEI documents.

      File extension(s):

         Common extensions are '.tei', '.teiCorpus' and '.odd'.  See
         Recognizing TEI files (Section 2) in this specification.

      Macintosh File Type Code(s):

         TEXT

      Object Identifier(s) or OID(s):

         Not applicable

6.  References

6.1.  Normative References

   [RFC3023]  Murata, M., St. Laurent, S., and D. Kohn, "XML Media
              Types", RFC 3023, January 2001.

   [RFC3470]  Hollenbeck, S., Rose, M., and L. Masinter, "Guidelines for
              the Use of Extensible Markup Language (XML)
              within IETF Protocols", BCP 70, RFC 3470, January 2003.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.

   [TEI]      "TEI Guidelines".

   [TEIschema]
              "Schema generated from ODD source".

   [W3C.REC-xml-20081126]
              Maler, E., Yergeau, F., Sperberg-McQueen, C., Paoli, J.,
              and T. Bray, "Extensible Markup Language (XML) 1.0 (Fifth
              Edition)", World Wide Web Consortium Recommendation
              REC-xml-20081126, November 2008.

   [W3C.REC-xml-names-20091208]
              Thompson, H., Hollander, D., Tobin, R., Bray, T., and A.
              Layman, "Namespaces in XML 1.0 (Third Edition)", World
              Wide Web Consortium Recommendation REC-xml-names-20091208,
              December 2009.

6.2.  Informative References

   [DRM]      "Digital rights management".

   [IPR]      "Intellectual property".

   [ODD]      "Getting Started with P5 ODDs".

   [W3C.REC-xinclude-20061115]
              Orchard, D., Marsh, J., and D. Veillard, "XML Inclusions
              (XInclude) Version 1.0 (Second Edition)", World Wide Web
              Consortium Recommendation REC-xinclude-20061115,
              November 2006.

Authors' Addresses

   Laurent Romary
   TEI Consortium and INRIA

   Email: laurent.romary@inria.fr
   URI:   http://www.tei-c.org/

   Sigfrid Lundberg
   The Royal Library, Copenhagen
   Postbox 2149
   1016 Koebenhavn K
   Denmark

   Email: slu@kb.dk
   URI:   http://sigfrid-lundberg.se/
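
Illustrative sketch (not part of the registration)

   The recognition rule in Section 2 (a root element whose namespace
   name begins with http://www.tei-c.org/ns/) could be checked roughly
   as follows.  This is a minimal Python sketch using only the standard
   library.  The names tei_root and TEI_NS_PREFIX are hypothetical and
   are not defined by this document.  The sketch does not attempt to
   address the concerns of Section 4.1, so an application handling
   untrusted input would presumably need a hardened parser
   configuration.

```python
import xml.etree.ElementTree as ET

# Namespace prefix named in Section 2; the version number follows it.
TEI_NS_PREFIX = "http://www.tei-c.org/ns/"


def tei_root(xml_text):
    """Return (namespace, local_name) when the root element is in a TEI
    namespace, otherwise None.  Hypothetical helper for illustration."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML, so it cannot be a TEI document.
        return None
    tag = root.tag
    # ElementTree writes namespaced tags as "{namespace}local".
    if not tag.startswith("{"):
        return None
    namespace, _, local_name = tag[1:].partition("}")
    if not namespace.startswith(TEI_NS_PREFIX):
        return None
    return namespace, local_name


if __name__ == "__main__":
    sample = '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader/></TEI>'
    print(tei_root(sample))
```

   The check looks only at the root element's namespace, not at its
   local name, because Section 2 describes TEI and teiCorpus as the
   most common root elements rather than the only ones.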