I have been selected as the General Area Review Team (Gen-ART) reviewer for this draft (for background on Gen-ART, please see http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
Please resolve these comments along with any other Last Call comments you may receive.
Document: draft-wilde-text-fragment-06
Reviewer: Spencer Dawkins
Review Date: 2007-02-19
IETF LC End Date: 2007-03-14
IESG Telechat date: (if known)
Summary:
Comments:
Thanks,
Spencer
1.1. What is text/plain?
The biggest advantage of text/plain MIME entities is their ease of use and their portability among different platforms. As long as they use popular character encodings (such as US-ASCII or UTF-8), they can be displayed and processed on virtually every computer system. The only remaining interoperability issue is the representation of line endindings, which is discussed in Section 4.1.
Spencer (Nit): s/endind/end/
2. Fragment Identification Methods
The identification of fragments of text/plain MIME entities can be based on different foundations. Since it is not possible to insert explicit, invisible identifiers into a text/plain MIME entity (as for example used in HTML documents, implemented through dedicated attributes), fragment identification has to rely on certain inherent properties of the MIME entity. This memo specifies fragment identification using six different methods, which are character positions and ranges, line positions and ranges, regular expression matching, and a mechanism for improving the robustness of fragment identifiers (entity hashes).
2.2.1. Character Position
To identify a character position (i.e., a fragment of length zero between two characters), the 'char' scheme followed by a single number is used. Rather than identifying a fragment consisting of a number of characters, this method identifies a position between two characters (or before the first or after the last character). Character position counting starts with 0, so the character position before the first character of a text/plain MIME entity has the character position 0, and a MIME entity containing n distinct characters has n+1 distinct character positions, the last one having the character position n.
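The position counting quoted above can be illustrated with a minimal Python sketch. The helper names `char_positions` and `split_at` are hypothetical, and the sketch assumes that one "character" corresponds to one element of a Python `str` (a Unicode code point), which the quoted text does not state.

```python
def char_positions(text: str) -> range:
    """All valid character positions for text: 0 through len(text) inclusive."""
    return range(len(text) + 1)


def split_at(text: str, position: int) -> tuple[str, str]:
    """Split text at a zero-length position between two characters."""
    if position not in char_positions(text):
        raise ValueError(f"position {position} is outside 0..{len(text)}")
    return text[:position], text[position:]


entity = "abc"                        # n = 3 characters
print(list(char_positions(entity)))   # n + 1 = 4 positions
print(split_at(entity, 0))            # position before the first character
print(split_at(entity, 3))            # position after the last character
```

Position 0 leaves nothing to its left and position n leaves nothing to its right, which is why an entity of n characters has n+1 positions.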
2.5. Fragment Identifier Robustness
Hash sums may specify the character encoding that has been used when creating the hash sums, and if such a specification is present, clients MUST check whether the character encoding specified for the hash sum and the character encoding of the retrieved MIME entity are equal, and clients MUST NOT check the hash sum if these values differ. However, clients MAY choose to transcode the retrieved MIME entity in the case of differing character encodings, and after doing so, check the hash sum. Please note that this method is inherently unreliable, because certain characters or character sequences may have been lost or normalized due to restrictions in one of the character encodings used.
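One possible reading of the encoding rule quoted above, as a rough Python sketch. The function name `hash_check_action`, its return values, and the `can_transcode` flag are hypothetical. Comparing charset names case-insensitively is an assumption, and charset aliases are not resolved.

```python
from typing import Optional


def hash_check_action(hash_charset: Optional[str],
                      entity_charset: str,
                      can_transcode: bool = False) -> str:
    """Decide what a client does with a hash sum, given the two encodings.

    Returns "check", "transcode-then-check", or "skip" (hypothetical labels).
    """
    # No encoding given with the hash sum: nothing to compare.
    if hash_charset is None:
        return "check"
    # Encodings are equal: the hash sum can be checked directly.
    # Assumption: names are compared case-insensitively, aliases are ignored.
    if hash_charset.lower() == entity_charset.lower():
        return "check"
    # Encodings differ: the client MAY transcode first, otherwise it
    # MUST NOT check the hash sum.
    return "transcode-then-check" if can_transcode else "skip"


print(hash_check_action("UTF-8", "utf-8"))
print(hash_check_action("UTF-8", "ISO-8859-1"))
print(hash_check_action("UTF-8", "ISO-8859-1", can_transcode=True))
```

The "transcode-then-check" branch carries the caveat from the quoted text: transcoding can lose or normalize characters, so a mismatch after transcoding is not conclusive.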
3. Fragment Identification Syntax
The syntax for the fragment identifiers is straightforward. The syntax defines four schemes, 'char', 'line', 'match', and hash (which can either be 'length' or 'md5'). The 'char' and 'line' schemes can be used in two different variants, either the position variant (with a single number), or the range variant (with two comma-separated numbers). The 'match' scheme has a regular expression as its parameter, which must be specified as a string with escaped semicolons (because the semicolon is used to concatenate multiple fragment identification scheme parts). The hash scheme can either use the 'length' or the 'md5' scheme to specify a hash value.
The following syntax definition uses ABNF as defined in RFC 4234 [7], including the rules DIGIT and HEXDIG.
4.3. Handling of Hash Sums
Clients are not required to implement the handling of hash sums, so they MAY choose to ignore hash sum information altogether. However, if they do implement hash sum handling, the following applies:
If a fragment identifier contains a hash sum, and a client retrieves a MIME entity and detects that the hash sum has changed (observing the character encoding specification as described in Section 3.2, if present), then the client SHOULD NOT interpret any other text/plain
Spencer: why SHOULD NOT, and not MUST NOT?
fragment identifier scheme part. A client MAY signal this situation to the user.
4.4. Syntax Errors in Fragment Identifiers
If a fragment identifier contains a syntax error (i.e., does not conform to the syntax specified in Section 3), then it MUST be ignored by clients. Clients SHOULD NOT make any attempt to correct
Spencer: again, why SHOULD NOT, and not MUST NOT?
or guess fragment identifiers. Syntax errors MAY be reported by clients.
5. Examples
The following examples show some usages for the fragment identifiers defined in this memo.
Spencer: this section is very helpful. Thank you for including it.
ftp://example.com/text.txt#line=10,20;length=9876,UTF-8
As in the second example, this URI identifies lines 11 to 20 of the text.txt MIME entity. The additional length hash sum specifies that the MIME entity has a length of 9876 characters when encoded in UTF-8. If the client supports the length hash sum scheme, it may test the retrieved MIME entity for its length, but only if the retrieved MIME entity uses the UTF-8 encoding or has been locally transcoded into this encoding. If the length of the retrieved MIME entity does not match the length specified in the fragment identifier, the client SHOULD NOT interpret the line part and MAY signal this to the user.
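The behaviour described for this URI can be sketched in Python as follows. This is only an illustration under several assumptions: the function name `select_lines` is hypothetical, the entity is assumed to be already decoded from UTF-8 into a `str`, "length" is taken to mean the number of characters, lines are split with `str.splitlines()` rather than the rules of Section 4.1, and only the 'line' range and 'length' parts are handled.

```python
from typing import Optional


def select_lines(text: str, fragment: str) -> Optional[list[str]]:
    """Apply a 'line=a,b[;length=n[,charset]]' fragment to decoded text.

    Returns the selected lines, or None when the length hash sum does not
    match and the line part is therefore not interpreted.
    """
    line_range = None
    expected_length = None
    for part in fragment.split(";"):
        scheme, _, value = part.partition("=")
        if scheme == "line":
            first, last = (int(n) for n in value.split(","))
            line_range = (first, last)
        elif scheme == "length":
            # The optional charset after the comma is ignored in this sketch.
            expected_length = int(value.split(",")[0])
    if line_range is None:
        raise ValueError("no line range in fragment identifier")
    if expected_length is not None and len(text) != expected_length:
        return None  # length mismatch: do not interpret the line part
    # Line positions count from 0, so the range 10,20 covers lines 11 to 20.
    return text.splitlines()[line_range[0]:line_range[1]]


entity = "".join(f"line {i}\n" for i in range(1, 31))
selected = select_lines(entity, f"line=10,20;length={len(entity)},UTF-8")
print(selected[0], "...", selected[-1])
print(select_lines(entity, "line=10,20;length=9876,UTF-8"))
```

With the matching length the first call selects the eleventh through twentieth lines of the sample entity; the second call uses a length that does not match the sample entity, so the line part is left uninterpreted.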