idnits 2.17.1 

draft-hausenblas-csv-fragment-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([12]), which it shouldn't. 
     Please replace those with straight textual mentions of the documents in
     question.

  -- The draft header indicates that this document updates RFC4180, but the
     abstract doesn't seem to mention this, which it should.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
     (Using the creation date from RFC4180, updated by this document, for
     RFC5378 checks: 2005-02-03)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (December 29, 2012) is 4135 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 4180 (ref. '1')

  ** Obsolete normative reference: RFC 4234 (ref. '6') (Obsoleted by RFC 5234)

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  -- Obsolete informational reference (is this intentional?): RFC 4288 (ref.
     '11') (Obsoleted by RFC 6838)


     Summary: 3 errors (**), 0 flaws (~~), 2 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                      M. Hausenblas
3	Internet-Draft                                          DERI, NUI Galway
4	Updates: 4180 (if approved)                                     E. Wilde
5	Intended status: Standards Track                         EMC Corporation
6	Expires: July 2, 2013                                        J. Tennison
7	                                                     Open Data Institute
8	                                                       December 29, 2012

10	          URI Fragment Identifiers for the text/csv Media Type
11	                    draft-hausenblas-csv-fragment-01

13	Abstract

15	   This memo defines URI fragment identifiers for text/csv MIME
16	   entities.  These fragment identifiers make it possible to refer to
17	   parts of a text/csv MIME entity, identified by cell, row, column, or
18	   slice.

20	Note to Readers

22	   This draft should be discussed on the apps-discuss mailing list [12].

24	Status of this Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at http://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on July 2, 2013.

41	Copyright Notice

43	   Copyright (c) 2012 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents
48	   (http://trustee.ietf.org/license-info) in effect on the date of
49	   publication of this document.  Please review these documents
50	   carefully, as they describe your rights and restrictions with respect
51	   to this document.  Code Components extracted from this document must
52	   include Simplified BSD License text as described in Section 4.e of
53	   the Trust Legal Provisions and are provided without warranty as
54	   described in the Simplified BSD License.

56	Table of Contents

58	   1.  Introduction
59	     1.1.  What is text/csv?
60	     1.2.  Why text/csv Fragment Identifiers?
61	       1.2.1.  Motivation
62	       1.2.2.  Use Cases
63	     1.3.  Incremental Deployment
64	     1.4.  Notation Used in this Memo
65	   2.  Fragment Identification Methods
66	     2.1.  Header
67	     2.2.  Row-based selection
68	     2.3.  Column-based selection
69	     2.4.  Cell-based selection
70	     2.5.  Slice-based selection
71	   3.  Fragment Identification Syntax
72	   4.  Fragment Identifier Processing
73	     4.1.  Syntax Errors in Fragment Identifiers
74	   5.  IANA Considerations
75	   6.  Security Considerations
76	   7.  Change Log
77	     7.1.  From -00 to -01
78	   8.  References
79	     8.1.  Normative References
80	     8.2.  Non-Normative References
81	   Appendix A.  Acknowledgements
82	   Authors' Addresses

84	1.  Introduction

86	   This memo updates the text/csv media type defined in RFC 4180 [1] by
87	   defining URI fragment identifiers for text/csv MIME entities.

89	   This section gives an introduction to the general concepts of text/
90	   csv MIME entities and URI fragment identifiers, and discusses the
91	   need for fragment identifiers for text/csv and deployment issues.
92	   Section 2 discusses the principles and methods on which this memo is
93	   based.  Section 3 defines the syntax, and Section 4 discusses
94	   processing of text/csv fragment identifiers.

96	1.1.  What is text/csv?

98	   Internet Media Types (often referred to as "MIME types") as defined
99	   in RFC 2045 [2] and RFC 2046 [3] are used to identify different types
100	   and sub-types of media.  The text/csv media type is defined in RFC
101	   4180 [1], using US-ASCII [9] as the default character encoding (other
102	   character encodings can be used as well).

104	1.2.  Why text/csv Fragment Identifiers?

106	   URIs are the identification mechanism for resources on the Web. The
107	   URI syntax specified in RFC 3986 [4] optionally includes a so-called
108	   "fragment identifier", separated by a number sign ('#').  The
109	   fragment identifier consists of additional reference information to
110	   be interpreted by the user agent after the retrieval action has been
111	   successfully completed.  The semantics of a fragment identifier is a
112	   property of the data resulting from a retrieval action, regardless of
113	   the type of URI used in the reference.  Therefore, the format and
114	   interpretation of fragment identifiers is dependent on the media type
115	   of the retrieval result.
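
   As an illustration only, the split between the resource and the
   fragment identifier can be sketched in Python with the standard
   library function urllib.parse.urldefrag.  The example URI is one of
   those used in this memo; the variable names are hypothetical.

```python
from urllib.parse import urldefrag

# Example URI in the style used throughout this memo.
uri = "http://example.com/data.csv#row:2"

# urldefrag() splits a URI reference at the number sign ('#').
resource, fragment = urldefrag(uri)

print(resource)  # the part used for the retrieval action
print(fragment)  # the part left for the user agent to interpret
```

   Only the first value is needed for retrieval; the second is
   interpreted afterwards, according to the media type of the result.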

117	1.2.1.  Motivation

119	   Similar to the motivation in RFC 5147 [10], referring to specific
120	   parts of a resource can be very useful, because it enables users and
121	   applications to create more specific references.  Users can create
122	   references to the part they really are interested in or want to talk
123	   about, rather than always pointing to a complete resource.  Even
124	   though it is suggested that fragment identification methods are
125	   specified in a media type's MIME registration (see [11]), many media
126	   types do not have fragment identification methods associated with
127	   them.

129	   Fragment identifiers are only useful if supported by the client,
130	   because they are only interpreted by the client.  Therefore, a new
131	   fragment identification method will require some time to be adopted
132	   by clients, and older clients will not support it.  However, because
133	   the URI still works even if the fragment identifier is not supported
134	   (the resource is retrieved, but the fragment identifier is not
135	   interpreted), rapid adoption is not highly critical to ensure the
136	   success of a new fragment identification method.

138	1.2.2.  Use Cases

140	   Fragment identifiers for text/csv as defined in this memo make it
141	   possible to refer to specific parts of a text/csv MIME entity.  Use
142	   cases include, but are not limited to, discovery (what column
143	   headings or how many rows are available), selecting a part for visual
144	   rendering, stream processing, making assertions about a certain value
145	   (provenance, confidence, etc.), or data integration.

147	1.3.  Incremental Deployment

149	   As long as text/csv fragment identifiers are not supported
150	   universally, it is important to consider the implications of
151	   incremental deployment.  Clients (for example, Web browsers) not
152	   supporting the text/csv fragment identifier described in this memo
153	   will work with URI references to text/csv MIME entities, but they
154	   will fail to understand the identification of the sub-resource
155	   specified by the fragment identifier, and thus will behave as if the
156	   complete resource was referenced.  This is a reasonable fallback
157	   behavior, and in general users should take into account the
158	   possibility that a program interpreting a given URI will fail to
159	   interpret the fragment identifier part.  Since fragment identifier
160	   evaluation is local to the client (and happens after retrieving the
161	   MIME entity), there is no reliable way for a server to determine
162	   whether a requesting client is using a URI containing a fragment
163	   identifier.

165	1.4.  Notation Used in this Memo

167	   The capitalized key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
168	   "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
169	   "OPTIONAL" in this document are to be interpreted as described in RFC
170	   2119 [5].

172	2.  Fragment Identification Methods

174	   This memo specifies fragment identification using the following
175	   methods: header, row, column, cell, and slice.  Per RFC 4180 [1], the
176	   header line is optional, so the applicability of the header method
177	   depends on the actual format of the text/csv MIME entity.

179	   Throughout the sections below the following table in CSV is used:
180	   date,temperature,place
181	   2011-01-01,1,Galway
182	   2011-01-02,-1,Galway
183	   2011-01-03,0,Galway
184	   2011-01-01,6,Berkeley
185	   2011-01-02,8,Berkeley
186	   2011-01-03,5,Berkeley

188	2.1.  Header

190	   For discovery purposes, the "head" scheme is used, returning the
191	   first row.  If the "header" parameter per RFC 4180 [1] is available
192	   and its value is "present", the client can reliably determine that
193	   the first row is a header.
194	   http://example.com/data.csv#head

196	   Applied to the reference table, the above CSV fragment would select
197	   the header row, yielding:
198	   date,temperature,place
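
   A minimal Python sketch of how a client might evaluate the "head"
   scheme is given below.  It is not part of this specification: the
   names select_head and REFERENCE_TABLE are hypothetical, and the
   sketch simply returns the first row without checking the "header"
   parameter.

```python
import csv
import io

# The reference table from Section 2.
REFERENCE_TABLE = """\
date,temperature,place
2011-01-01,1,Galway
2011-01-02,-1,Galway
2011-01-03,0,Galway
2011-01-01,6,Berkeley
2011-01-02,8,Berkeley
2011-01-03,5,Berkeley
"""


def select_head(text):
    """Return the first row of a CSV text as a list of fields."""
    reader = csv.reader(io.StringIO(text))
    # An empty entity has no first row; return an empty list then.
    return next(reader, [])


print(",".join(select_head(REFERENCE_TABLE)))
```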

200	2.2.  Row-based selection

202	   To select a specific record, the "row" scheme followed by a single
203	   number is used (the first record has the index 0).  If the fragment
204	   is given in the form row:*, then no record is selected but the
205	   overall number of records is returned.
206	   http://example.com/data.csv#row:2

208	   The above CSV fragment yields:
210	   2011-01-03,0,Galway

212	   The following computes the number of records (which equals 6 in the
213	   reference table):
214	   http://example.com/data.csv#row:*
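
   The two forms of the "row" scheme can be sketched in Python as
   follows.  This is an illustration under stated assumptions, not a
   normative algorithm: the function name select_row is hypothetical,
   and the sketch assumes that the first line of the entity is a header
   that is not counted as a record, as in the examples above.

```python
import csv
import io

# The reference table from Section 2.
REFERENCE_TABLE = """\
date,temperature,place
2011-01-01,1,Galway
2011-01-02,-1,Galway
2011-01-03,0,Galway
2011-01-01,6,Berkeley
2011-01-02,8,Berkeley
2011-01-03,5,Berkeley
"""


def select_row(text, rowspec):
    """Evaluate the part after "row:" against a CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    records = rows[1:]  # assumption: the first line is a header
    if rowspec == "*":
        return len(records)       # row:* yields the record count
    return records[int(rowspec)]  # records are indexed from 0


print(select_row(REFERENCE_TABLE, "2"))
print(select_row(REFERENCE_TABLE, "*"))
```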

216	2.3.  Column-based selection

218	   To select values from a certain column, the "col" scheme, followed by
219	   either a single number or the value of a header field, is used.
220	   http://example.com/data.csv#col:temperature

222	   The above CSV fragment addresses a column by name, yielding:
223	   1,-1,0,6,8,5

225	   A column can also be addressed by position as shown in the next
226	   example:

228	   http://example.com/data.csv#col:2

230	   The above CSV fragment selects the third column:
231	   Galway,Galway,Galway,Berkeley,Berkeley,Berkeley
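
   Both ways of addressing a column can be sketched in Python as shown
   below.  The function name select_col is hypothetical.  The sketch
   assumes that a value consisting only of digits is a position and
   that any other value is a header field; this memo does not state how
   a purely numeric header field would be distinguished.

```python
import csv
import io
import re

# The reference table from Section 2.
REFERENCE_TABLE = """\
date,temperature,place
2011-01-01,1,Galway
2011-01-02,-1,Galway
2011-01-03,0,Galway
2011-01-01,6,Berkeley
2011-01-02,8,Berkeley
2011-01-03,5,Berkeley
"""


def select_col(text, colspec):
    """Evaluate the part after "col:" against a CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    header, records = rows[0], rows[1:]
    if re.fullmatch(r"[0-9]+", colspec):
        index = int(colspec)           # assumption: digits are a position
    else:
        index = header.index(colspec)  # otherwise a header field value
    return [record[index] for record in records]


print(",".join(select_col(REFERENCE_TABLE, "temperature")))
print(",".join(select_col(REFERENCE_TABLE, "2")))
```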

233	2.4.  Cell-based selection

235	   To select a particular field within a row, use the "cell" scheme,
236	   followed by a row number, a comma, and either a single number or the
237	   value of a header field.
238	   http://example.com/data.csv#cell:2,date

240	   The above CSV fragment addresses the field in the date column within
241	   the third row, yielding:
242	   2011-01-03

244	   A field can also be addressed by position as shown in the next
245	   example:
246	   http://example.com/data.csv#cell:3,1

248	   The above CSV fragment selects the second column in the fourth row:
249	   6
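
   Cell selection combines the row and column lookups.  The following
   Python sketch is illustrative only; select_cell is a hypothetical
   name, the first line is assumed to be a header, and a column value
   made up only of digits is assumed to be a position.

```python
import csv
import io
import re

# The reference table from Section 2.
REFERENCE_TABLE = """\
date,temperature,place
2011-01-01,1,Galway
2011-01-02,-1,Galway
2011-01-03,0,Galway
2011-01-01,6,Berkeley
2011-01-02,8,Berkeley
2011-01-03,5,Berkeley
"""


def select_cell(text, cellspec):
    """Evaluate the part after "cell:" (row number, comma, column)."""
    rownum, column = cellspec.split(",", 1)
    rows = list(csv.reader(io.StringIO(text)))
    header, records = rows[0], rows[1:]  # assumption: header present
    if re.fullmatch(r"[0-9]+", column):
        index = int(column)              # assumption: digits are a position
    else:
        index = header.index(column)
    return records[int(rownum)][index]


print(select_cell(REFERENCE_TABLE, "2,date"))
print(select_cell(REFERENCE_TABLE, "3,1"))
```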

251	2.5.  Slice-based selection

253	   To select a part of the table, called a slice in the following, the
254	   "where" scheme is used.  The allowed values are a comma-separated
255	   list of header fields with corresponding field values in the table.
256	   http://example.com/data.csv#where:date=2011-01-01

258	   The above CSV fragment selects a slice, yielding another CSV table as
259	   follows:
260	   temperature,place
261	   1,Galway
262	   6,Berkeley
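
   One possible reading of slice selection is sketched in Python below.
   The function name select_where is hypothetical.  Following the
   example above, the sketch keeps the records that match and drops the
   constrained columns from the result.  Treating several pairs as a
   conjunction is an assumption; this memo does not say how multiple
   pairs are combined.

```python
import csv
import io

# The reference table from Section 2.
REFERENCE_TABLE = """\
date,temperature,place
2011-01-01,1,Galway
2011-01-02,-1,Galway
2011-01-03,0,Galway
2011-01-01,6,Berkeley
2011-01-02,8,Berkeley
2011-01-03,5,Berkeley
"""


def select_where(text, kvpairs):
    """Evaluate the part after "where:" and return the slice as rows."""
    rows = list(csv.reader(io.StringIO(text)))
    header, records = rows[0], rows[1:]
    # Map each constrained column index to its required value.
    wanted = {}
    for pair in kvpairs.split(","):
        if pair:  # the syntax permits a trailing comma
            col, val = pair.split("=", 1)
            wanted[header.index(col)] = val
    keep = [i for i in range(len(header)) if i not in wanted]
    result = [[header[i] for i in keep]]
    for record in records:
        # Assumption: all pairs must match (logical AND).
        if all(record[i] == val for i, val in wanted.items()):
            result.append([record[i] for i in keep])
    return result


for row in select_where(REFERENCE_TABLE, "date=2011-01-01"):
    print(",".join(row))
```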

264	3.  Fragment Identification Syntax

266	   The syntax for the text/csv fragment identifiers is as follows.

268	   The following syntax definition uses ABNF as defined in RFC 4234 [6],
269	   including the rules DIGIT and HEXDIG.  The mime-charset rule is
270	   defined in RFC 2978 [7].

272	   NOTE:  In the descriptions that follow, specified text values MUST be
273	      used exactly as given, using exactly the indicated lower-case
274	      letters.  In this respect, the ABNF usage differs from [6].

276	   csv-fragment =  headersel / wheresel / colsel / rowsel / cellsel
277	   headersel = "head"
278	   rowsel   = "row:" rowspec
279	   colsel   = "col:" colspec
280	   cellsel  = "cell:" cellspec
281	   wheresel = "where:" kvpairs
282	   kvpairs = 1*( col "=" val 0*1(",") )
283	   col = 1*TEXTDATA
284	   val = 1*TEXTDATA
285	   colspec = column
286	   rowspec = "*" / rownum
287	   cellspec = rownum "," column
288	   column = 1*TEXTDATA / 1*DIGIT
289	   rownum = 1*DIGIT
290	   TEXTDATA =  %x23-2B / %x2D-3C / %x3E-7E
291	   DIGIT =  %x30-39
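
   For illustration, the grammar above can be approximated by a regular
   expression.  The Python sketch below is not normative, and the names
   CSV_FRAGMENT and parse_fragment are hypothetical.  Because every
   DIGIT is also TEXTDATA, the sketch matches "column" as TEXTDATA only
   and leaves the distinction between a name and a position to later
   processing.  Matching is case-sensitive, as required by the NOTE
   above.

```python
import re

# Character classes and rules transcribed from the ABNF.
TEXTDATA = r"[\x23-\x2B\x2D-\x3C\x3E-\x7E]"
COLUMN = rf"{TEXTDATA}+"  # 1*DIGIT is a subset of 1*TEXTDATA
ROWNUM = r"[0-9]+"
KVPAIRS = rf"(?:{TEXTDATA}+={TEXTDATA}+,?)+"

CSV_FRAGMENT = re.compile(
    rf"(?:(?P<head>head)"
    rf"|row:(?P<row>\*|{ROWNUM})"
    rf"|col:(?P<col>{COLUMN})"
    rf"|cell:(?P<cell>{ROWNUM},{COLUMN})"
    rf"|where:(?P<where>{KVPAIRS}))"
)


def parse_fragment(fragment):
    """Return (scheme, argument), or None if the syntax does not match."""
    match = CSV_FRAGMENT.fullmatch(fragment)
    if match is None:
        return None
    scheme = match.lastgroup  # name of the alternative that matched
    return scheme, match.group(scheme)


for example in ("head", "row:*", "cell:2,date", "ROW:2"):
    print(example, "->", parse_fragment(example))
```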

293	4.  Fragment Identifier Processing

295	   Applications implementing support for the mechanism described in this
296	   memo MUST behave as described in the following sections.

298	4.1.  Syntax Errors in Fragment Identifiers

300	   If a fragment identifier contains a syntax error (i.e., does not
301	   conform to the syntax specified in Section 3), then it MUST be
302	   ignored by clients.  Clients MUST NOT make any attempt to correct or
303	   guess fragment identifiers.  Syntax errors MAY be reported by
304	   clients.
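
   The required behavior can be sketched in Python as follows.  The
   sketch is an illustration, not a complete client: the names
   ROW_FRAGMENT and effective_fragment are hypothetical, and for
   brevity the validity check covers only the "row" scheme rather than
   the full syntax of Section 3.

```python
import re
from urllib.parse import urldefrag

# Simplified check covering only the "row" scheme of Section 3; a full
# client would test against the complete csv-fragment grammar.
ROW_FRAGMENT = re.compile(r"row:(?:\*|[0-9]+)")


def effective_fragment(uri, report=None):
    """Return (resource, fragment); fragment is None if absent or invalid."""
    resource, fragment = urldefrag(uri)
    if not fragment:
        return resource, None
    if ROW_FRAGMENT.fullmatch(fragment) is None:
        # MUST be ignored, with no attempt at correction; MAY be reported.
        if report is not None:
            report(f"ignoring malformed fragment {fragment!r}")
        return resource, None
    return resource, fragment


print(effective_fragment("http://example.com/data.csv#row:2"))
print(effective_fragment("http://example.com/data.csv#Row:2", report=print))
```

   In the second call the fragment is not repaired to "row:2"; the
   client falls back to the complete resource.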

306	5.  IANA Considerations

308	   Note to RFC Editor: Please change this section to read as follows
309	   after the IANA action has been completed: "IANA has added a reference
310	   to this specification in the text/csv Media Type registration."

312	   IANA is requested to update the registration of the MIME Media type
313	   text/csv at http://www.iana.org/assignments/media-types/text/ with
314	   the fragment identifier defined in this memo by adding a reference to
315	   this memo (with the appropriate RFC number once it is known).

317	6.  Security Considerations

319	   The fact that software implementing fragment identifiers for CSV and
320	   software not implementing them differs in behavior, and the fact that
321	   different software may show documents or fragments to users in
322	   different ways, can lead to misunderstandings on the part of users.
323	   Such misunderstandings might be exploited in a way similar to
324	   spoofing or phishing.

326	   ...

328	   Implementers and users of fragment identifiers for CSV text should
329	   also be aware of the security considerations in RFC 3986 [4] and RFC
330	   3987 [8].

332	7.  Change Log

334	   Note to RFC Editor: Please remove this section before publication.

336	7.1.  From -00 to -01

338	   o  Added cell-based selections.

340	   o  Added Jeni Tennison as author; updated Erik Wilde's affiliation to
341	      EMC.

343	8.  References

345	8.1.  Normative References

347	   [1]   Shafranovich, Y., "Common Format and MIME Type for Comma-
348	         Separated Values (CSV) Files", RFC 4180, October 2005.

350	   [2]   Freed, N. and N. Borenstein, "Multipurpose Internet Mail
351	         Extensions (MIME) Part One: Format of Internet Message Bodies",
352	         RFC 2045, November 1996.

354	   [3]   Freed, N. and N. Borenstein, "Multipurpose Internet Mail
355	         Extensions (MIME) Part Two: Media Types", RFC 2046,
356	         November 1996.

358	   [4]   Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
359	         Resource Identifier (URI): Generic Syntax", RFC 3986,
360	         January 2005.

362	   [5]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
363	         Levels", RFC 2119, March 1997.

365	   [6]   Crocker, D. and P. Overell, "Augmented BNF for Syntax
366	         Specifications: ABNF", RFC 4234, October 2005.

368	   [7]   Freed, N. and J. Postel, "IANA Charset Registration
369	         Procedures", BCP 19, October 2000.

371	   [8]   Duerst, M. and M. Suignard, "Internationalized Resource
372	         Identifiers (IRI)", RFC 3987, January 2005.

374	8.2.  Non-Normative References

376	   [9]   ANSI X3.4-1986, "Coded Character Set - 7-Bit American National
377	         Standard Code for Information Interchange", STD 63, RFC 3629,
378	         1992.

380	   [10]  Wilde, E. and M. Duerst, "URI Fragment Identifiers for the
381	         text/plain Media Type", RFC 5147, April 2008.

383	   [11]  Freed, N. and J. Klensin, "Media Type Specifications and
384	         Registration Procedures", RFC 4288, December 2005.

386	URIs

388	   [12]  <https://www.ietf.org/mailman/listinfo/apps-discuss>

390	Appendix A.  Acknowledgements

392	   Thanks for comments and suggestions provided by Richard, Ian, Gannon.

394	Authors' Addresses

396	   Michael Hausenblas
397	   DERI, NUI Galway
398	   IDA Business Park
399	   Galway
400	   Ireland

402	   Phone: +353-91-495730
403	   Email: michael.hausenblas@deri.org
404	   URI:   http://sw-app.org/about.html

405	   Erik Wilde
406	   EMC Corporation
407	   6801 Koll Center Parkway
408	   Pleasanton, CA 94566
409	   U.S.A.

411	   Phone: +1-925-6006244
412	   Email: erik.wilde@emc.com
413	   URI:   http://dret.net/netdret/

415	   Jeni Tennison
416	   Open Data Institute
417	   65 Clifton Street
418	   London EC2A 4JE
419	   U.K.

421	   Phone: +44-797-4420482
422	   Email: jeni@jenitennison.com
423	   URI:   http://www.jenitennison.com/blog/